Thanks to the development of sequencing technology, we can now study organisms that we could never study before because they cannot be cultured. Can you imagine the great diversity of microorganisms in our world, when we know almost nothing about most of them? Well, thanks to the new sequencing technologies we are starting to see a fraction of this diversity and to understand more about these communities. I believe that knowing how these “unknowns” impact the environment will bring us new advantages in protecting endangered environments, understanding nutrient cycles (giving us a new weapon to fight climate change) and developing new drugs. And why not helping with oil spills, too?!

But, without getting off topic… let’s talk about these new sequencing technologies and why I have decided to blog about them. First of all, this wasn’t an easy post. I kept changing it: more technical, less technical, more like a journal article… In the end I decided to treat it as a “conversational post”, a start for, hopefully, an open discussion on next generation sequencing (NGS) and how to choose the best technology available for a specific project. Which characteristics are you willing to compromise on? But mainly: have you considered choosing the technology during project planning, before starting the sampling? Deciding on the technology at the beginning of your project is very important. I have been asked various times now why I chose Illumina and not 454, or “Aren’t they all the same? You could have chosen the cheaper one.” Well… not really. The technology you choose will depend on the type of analyses you are planning to do, the aim of the project, the sampling and, of course, the funding. They all play an important part in the decision.

Nowadays there are so many different sequencing technologies and so little research funding that it is very important to carefully choose the most advantageous technology for a specific project (please do not choose one just because it is cheaper! There are differences in what you will be able to achieve). For example, the other day I was asked what might be best for a specific project, and I was able to say Illumina, but then I couldn’t give much information on the new MiSeq: would that be better? There is a lot to consider in order to achieve the best results. It is the same as choosing a kit for a DNA extraction: which one will give me the highest DNA concentration, how pure will the DNA be, how affordable is it, and so on.

Different technologies require different amounts of DNA and RNA, so you will need to plan your sampling accordingly: will you need to amplify? If not, how much sample would you have to collect to acquire enough DNA for the run? Is it going to be an environmental, human, bacterial, or viral sample? Can you concentrate it without altering or biasing the sample? The choice of NGS technology also depends on the type of bioinformatics analyses you want to run on your data. For example, if you are looking for novel genomes/genes and you are planning to run de novo assembly, you might prefer one NGS technology (for example Illumina), while if you will be working with known genomes you might prefer 454 pyrosequencing and obtain longer sequences. There are tons (unfortunately I mean that kind of literally this time) of things to consider; so, how do we choose? Well… as always, the literature helps, and there is no easy way out. Lately, though, I have heard of consulting facilities where a consultant will help you choose the best technology for your project and hopefully deal with quotes from the different companies. This seems a very smart move, since it is becoming very difficult to keep up with all the new technologies.
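To make the “how much sample would you have to collect” question a bit more concrete, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name `sample_units_needed`, the safety factor, and all the numbers are hypothetical placeholders, not requirements of any real platform or extraction kit. You would plug in the input amount your sequencing facility asks for and the yield you actually measure from a pilot extraction.

```python
import math


def sample_units_needed(required_input_ng, yield_ng_per_unit, safety_factor=1.5):
    """Estimate how many sample units (e.g. litres, grams, swabs) to collect.

    required_input_ng: DNA mass the sequencing run needs (from your facility).
    yield_ng_per_unit: DNA mass you recover per sample unit (from a pilot extraction).
    safety_factor:     margin for losses during clean-up and QC (an assumption).
    """
    if required_input_ng <= 0 or yield_ng_per_unit <= 0:
        raise ValueError("DNA amounts must be positive")
    if safety_factor < 1:
        raise ValueError("safety_factor should be at least 1")
    # Round up: you cannot collect a fraction of a sample unit.
    return math.ceil(required_input_ng * safety_factor / yield_ng_per_unit)


# Hypothetical numbers only: 1000 ng requested, 40 ng recovered per unit.
units = sample_units_needed(required_input_ng=1000, yield_ng_per_unit=40)
print(f"Collect at least {units} sample units")
```

If the number that comes out is unrealistic for your sampling site, that is a hint that you may need to amplify or concentrate, or to look at a platform that accepts less input.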

So, talking about platforms, the major ones at the moment are:

  1. Illumina:
  • Illumina MiSeq
  • Illumina HiSeq
  • Illumina GAIIx
  2. 454:
  • GS Junior
  • GS FLX
  3. Pacific Biosciences (PacBio):
  • PacBio RS II
  4. Life Technologies (Ion Torrent):
  • Ion AmpliSeq
  • Ion Proton
  • Ion PGM

All these technologies have different biases, which you will have to consider before choosing the right one for your project. What are you willing to “sacrifice”: false SNPs, uneven coverage, short sequences, homopolymer errors? Which one has the lowest error rate? I found some interesting papers while searching for this information, so here are the links:

Unfortunately I couldn’t find any more recent comparison papers, but depending on the field and type of project there are various papers comparing results obtained with different platforms. I really hope that this post will open up some discussion on how to choose an NGS technology for various projects and on what others have encountered on the way to planning a sequencing study.

So … how will you choose, or how have you chosen, the right technology for your project?