Method of the Month — Illumina Sequencing

April 2022

Emma Kinnear
The Eta Zeta Biology Journal

--

The next method in our sequencing series is Illumina sequencing, also known as Solexa sequencing. It is a second-generation technology, developed after Sanger sequencing, the first-generation method.

Photo by Joyce McCown on Unsplash

Illumina sequencing is a relatively new form of DNA sequencing that researchers at the University of Cambridge first developed in 1998. With more than twenty years of technological advances separating it from the introduction of Sanger sequencing, Illumina sequencing has several qualities that make it superior to earlier methods. Its main advantages are that it is high-throughput and massively parallel: the process is automated, and a very large number of DNA fragments are sequenced at the same time. This makes it much more efficient than previous techniques and allows for a variety of applications in modern biology labs, such as whole genome resequencing, transcriptome sequencing, gene expression profiling, and epigenomic sequencing. Sanger sequencing is most effective for small sections of DNA, while Illumina sequencing is best for sequencing an entire genome.

To understand the process of Illumina sequencing, we begin with preparing the DNA that is about to be sequenced. The DNA you’re trying to analyze is cut into smaller pieces (around 100 base pairs is a common length), and each fragment is capped at both ends by adaptor sequences, one sequence for each end. These adaptors are oligonucleotides: short strands of synthetic DNA, with a specific sequence for each fragment end. This first step of preparing the DNA is known as library preparation. Your DNA fragments are now ready for attachment to the flow cell, a small glass slide used for sequencing. The flow cell is coated in short oligonucleotides complementary to the adaptor sequences, allowing each DNA fragment to bind to it.
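The fragmenting and adaptor steps of library preparation can be sketched in a few lines of Python. This is only a toy illustration: the two adaptor sequences are made up for the example (they are not real Illumina adaptors), the function names are hypothetical, and the DNA is cut at fixed intervals, whereas fragmentation in the lab is not this tidy.

```python
# Made-up adaptor sequences for illustration only (not real Illumina adaptors).
ADAPTOR_A = "ACGTACGT"
ADAPTOR_B = "TGCATGCA"


def fragment(dna, size=100):
    """Cut a DNA string into consecutive pieces of at most `size` bases."""
    return [dna[i:i + size] for i in range(0, len(dna), size)]


def prepare_library(dna, size=100):
    """Cap every fragment with one adaptor at each end."""
    return [ADAPTOR_A + piece + ADAPTOR_B for piece in fragment(dna, size)]


toy_dna = "ACGT" * 60                  # a 240-base toy sequence
library = prepare_library(toy_dna, size=100)
print(len(library))                    # 3 fragments: 100, 100 and 40 bases
print(len(library[0]))                 # 116 = 8 + 100 + 8 bases with adaptors
```

Because every fragment carries the same pair of adaptors, one set of oligonucleotides on the flow cell is enough to capture all of them.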

Once the DNA fragments are bound to the flow cell at one end, a polymerase generates a copy of each bound fragment, forming a double strand. The original fragment is held only by its pairing with the oligonucleotide that is fixed to the flow cell, whereas the newly synthesized complementary strand is covalently attached to the flow cell, making it a much more stable target for analysis. Because of this, the original fragments are washed off, leaving only their complementary strands.

Following this wash, the unbound end of each fragment folds over and its adaptor sequence binds to the corresponding oligonucleotide on the flow cell, forming a bridge-like structure. For this reason, this step is known simply as bridge formation.

The bridged DNA must then be copied to create a reverse strand, so once again a polymerase and nucleotides are added to copy the DNA. This copying process is a form of PCR specific to Illumina sequencing, known as bridge amplification. Repeating it produces many copies of each fragment. These groups of copies appear as clusters on the flow cell, with each individual cluster representing one original fragment of DNA. This step is very creatively named cluster formation.
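To get a feel for how quickly a cluster grows, here is a rough back-of-the-envelope sketch. It assumes an idealized model in which every strand in the cluster is copied once per cycle; the `efficiency` parameter and the cycle counts are assumptions made for this example, and a real cluster is also limited by the space available on the flow cell.

```python
def cluster_size(cycles, efficiency=1.0):
    """Strands in one cluster after `cycles` rounds of bridge amplification.

    Starts from a single bound strand. `efficiency` is the (assumed)
    fraction of strands that are successfully copied in each cycle.
    """
    strands = 1.0
    for _ in range(cycles):
        strands += strands * efficiency   # each copied strand adds one new strand
    return strands


print(cluster_size(10))        # 1024.0 with perfect doubling
print(cluster_size(10, 0.5))   # far fewer strands if only half are copied per cycle
```

Even in this simplified model, a handful of cycles turns one fragment into a cluster large enough to give a fluorescent signal the imaging system can pick up.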

Once the clusters are formed, the actual sequencing can begin. A polymerase builds a new strand from specialized reversible terminator nucleotides (RTNs), which allows a full sequence of the target DNA segment to be read out. Each RTN blocks further extension until its terminator is removed, so synthesis of the complementary strand proceeds one nucleotide at a time. Each terminator also carries a fluorescent label that produces a signal detectable by a computer imaging system. In broad terms, the steps of this “sequencing by synthesis” process are as follows:

  1. RTN added
  2. Fluorescent signal detected by computer
  3. Terminator cleaved from nucleotide
  4. Repeat
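The four steps above map naturally onto a loop. The sketch below is a simplified simulation, not how an instrument is programmed: the colour assigned to each base is arbitrary (real instruments use their own dye schemes), the function name is hypothetical, and the template is assumed to be given in the order in which the polymerase reads it.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Arbitrary colour per base, chosen only for this sketch.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_COLOUR = {colour: base for base, colour in DYE.items()}


def sequence_by_synthesis(template):
    """Return the read produced by copying `template` one base per cycle."""
    read = []
    for base in template:                  # 4. repeat for every position
        rtn = COMPLEMENT[base]             # 1. the complementary RTN is added
        signal = DYE[rtn]                  # 2. its fluorescent signal is detected
        read.append(BASE_FOR_COLOUR[signal])   # the colour is translated to a base
        # 3. the terminator is cleaved here, unblocking the next cycle
    return "".join(read)


print(sequence_by_synthesis("ATGC"))       # TACG
```

The read is the complement of the template, which is why the original sequence can be recovered from it.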

This process repeats until a full complementary strand is synthesized. The power of this technique is that the process occurs simultaneously for every fragment in every cluster on the flow cell. This is where the term “massively parallel” comes from. By the end of the sequencing run, the computer has captured millions of reads, each covering a fragment of about 100 base pairs. With all that data, the computer is able to assemble the full sequence by matching up the overlapping sequences of the fragments. Now that you understand how this system works, it is easy to see how this technique and techniques like it have revolutionized the field of genetics, allowing us to sequence whole genomes at a rate that was previously thought impossible.
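The idea of matching up overlapping reads can be shown with a small greedy merge. This is a toy sketch under generous assumptions (error-free reads, all from the same strand, and a hypothetical minimum overlap of three bases); real software works with millions of reads and uses far more sophisticated methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that matches a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break                          # nothing left that overlaps
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


toy_reads = ["TACCTTA", "ATGGCGT", "GCGTACC"]
print(assemble(toy_reads))                 # ['ATGGCGTACCTTA']
```

The reads are given out of order on purpose: the overlaps alone are enough to put them back together.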
