Sanger sequencing revolutionized genomics by enabling DNA sequence determination. Developed in the 1970s, it uses chain termination to generate DNA fragments of varying lengths, which are then separated and analyzed to reveal the sequence.

This method played a crucial role in the Human Genome Project. While newer technologies offer higher throughput, Sanger sequencing remains valuable for its accuracy, long read lengths, and targeted sequencing capabilities, especially in clinical settings.

History of Sanger sequencing

  • Developed by Frederick Sanger and colleagues in the 1970s, Sanger sequencing revolutionized the field of genomics by enabling the determination of DNA sequences
  • The method was based on the principle of chain termination, which involved the incorporation of modified nucleotides (dideoxynucleotides) during DNA synthesis
  • Sanger sequencing played a crucial role in the Human Genome Project, which aimed to sequence the entire human genome and was completed in 2003

Overview of Sanger sequencing workflow

  • Sanger sequencing involves several key steps, including DNA template preparation, annealing, dideoxy chain termination, separation of DNA fragments, and detection and visualization
  • The workflow begins with the isolation and purification of the DNA template to be sequenced, followed by the annealing of a specific primer to initiate the sequencing reaction
  • The incorporation of dideoxynucleotides during DNA synthesis leads to the generation of fragments of varying lengths, which are then separated by size and detected to determine the DNA sequence

DNA template preparation

  • Involves the isolation and purification of the DNA to be sequenced, typically using methods such as plasmid extraction or PCR amplification
  • The DNA template must be of high quality and free from contaminants to ensure accurate sequencing results
  • Quantification of the DNA template is performed to determine the optimal amount for the sequencing reaction

Primer annealing

  • A specific primer, complementary to a known region of the DNA template, is annealed to the single-stranded DNA
  • The primer serves as a starting point for DNA synthesis during the sequencing reaction
  • Primers are typically 18-25 nucleotides in length and are designed to have a melting temperature (Tm) suitable for the sequencing conditions
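The primer guidelines above can be checked with a few lines of Python. This is a minimal sketch, not a substitute for dedicated primer-design software: it uses a basic GC-count approximation for Tm (usually quoted for primers longer than about 13 nt), and the function names and the example primer are hypothetical, chosen only for illustration.

```python
def gc_content(primer: str) -> float:
    """Return the fraction of G and C bases in the primer (0.0 to 1.0)."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def basic_tm(primer: str) -> float:
    """Estimate Tm in degrees Celsius with a basic GC-count approximation.

    Tm = 64.9 + 41 * (G + C - 16.4) / N, where N is the primer length.
    This ignores salt and primer concentration, so treat it as a rough guide.
    """
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(primer)


def length_in_range(primer: str, min_len: int = 18, max_len: int = 25) -> bool:
    """Check the primer against the 18-25 nt length guideline."""
    return min_len <= len(primer) <= max_len


# Hypothetical 20-nt primer used only as an example.
example_primer = "ATGCGTACCGTTAGCCTAGA"
print(len(example_primer), length_in_range(example_primer))
print(gc_content(example_primer))
print(round(basic_tm(example_primer), 2))
```

More accurate estimates use nearest-neighbor thermodynamics, which account for base stacking and buffer conditions; the approximation here is only meant to show how length and GC content drive Tm.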

Dideoxy chain termination

  • The key principle behind Sanger sequencing, dideoxy chain termination involves the incorporation of modified nucleotides (dideoxynucleotides or ddNTPs) during DNA synthesis
  • ddNTPs lack a 3' hydroxyl group, which prevents the formation of a phosphodiester bond with the next nucleotide, resulting in the termination of the growing DNA strand
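  • In the classical method, four separate sequencing reactions are performed, each containing a different ddNTP (ddATP, ddCTP, ddGTP, or ddTTP) in addition to the regular deoxynucleotides (dNTPs); with fluorescent dye terminators, all four ddNTPs can be combined in a single reaction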
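The logic of chain termination can be illustrated with a small idealized simulation. The sketch below assumes that every possible termination position is represented among the fragments, ignores the primer itself, and uses hypothetical function names. Each ddNTP "lane" collects the lengths of the fragments that end in that base, and reading the fragments from shortest to longest recovers the newly synthesized strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template: str) -> str:
    """Return the strand copied from the template, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template.upper()))


def terminated_fragments(new_strand: str) -> dict:
    """Group fragment lengths by the ddNTP that terminated them.

    A fragment of length n ends with the n-th base of the new strand,
    so it appears in the lane for that base.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_sequence(lanes: dict) -> str:
    """Read terminal bases from the shortest fragment to the longest."""
    base_by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_by_length[n] for n in sorted(base_by_length))


template = "TACGGATC"  # hypothetical template, written 5' to 3'
new_strand = synthesized_strand(template)
lanes = terminated_fragments(new_strand)
print(lanes)
print(read_sequence(lanes))  # prints GATCCGTA
```

In a real reaction the fragments are produced stochastically, because a ddNTP is incorporated only occasionally in place of the matching dNTP; the simulation simply assumes that every length ends up being sampled.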

Separation of DNA fragments

  • The DNA fragments generated during the sequencing reaction are separated by size using gel or capillary electrophoresis
  • In gel electrophoresis, the fragments are loaded onto a polyacrylamide gel and separated based on their length, with shorter fragments migrating faster than longer ones
  • Capillary electrophoresis uses a thin capillary filled with a polymer matrix to separate the fragments, offering higher resolution and automation compared to gel electrophoresis

Detection and visualization

  • The separated DNA fragments are detected and visualized to determine the DNA sequence
  • Traditionally, radioactive labeling (32P or 35S) was used to label the fragments, which were then visualized by autoradiography
  • Modern Sanger sequencing employs fluorescent labeling, where each ddNTP is labeled with a different fluorescent dye, allowing for the detection of the fragments using a laser and a CCD camera

Advantages of Sanger sequencing

  • Sanger sequencing has several advantages that have contributed to its widespread use and impact on genomics research
  • The method is known for its high accuracy and reliability, making it a gold standard for DNA sequencing
  • Sanger sequencing also generates long read lengths, enabling the sequencing of larger contiguous regions of DNA compared to some other sequencing technologies

High accuracy and reliability

  • Sanger sequencing is considered the gold standard for DNA sequencing due to its high accuracy and reliability
  • The method is commonly cited as achieving a per-base accuracy of about 99.99%, meaning that roughly one base in 10,000 is likely to be incorrectly identified
  • The high accuracy of Sanger sequencing is attributed to the use of high-fidelity DNA polymerases and the ability to generate multiple reads of the same region for consensus building
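Consensus building from multiple reads can be sketched as a per-position majority vote. The example assumes the reads are already aligned and of equal length, and it reports N wherever no base has a strict majority. The function name and the example reads are hypothetical.

```python
from collections import Counter


def consensus(reads: list) -> str:
    """Return the majority base at each position of pre-aligned reads."""
    if len({len(read) for read in reads}) != 1:
        raise ValueError("reads must be non-empty and of equal length")
    result = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        # Require a strict majority; otherwise mark the position ambiguous.
        result.append(base if count > len(reads) / 2 else "N")
    return "".join(result)


# Three hypothetical reads of the same region; the second has one error.
print(consensus(["ACGT", "ACGA", "ACGT"]))  # prints ACGT
```

Production tools usually weight each base by its quality score instead of counting votes equally, so this is only the simplest form of the idea.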

Long read lengths

  • Sanger sequencing can generate read lengths of up to 1,000 base pairs (bp) or more, depending on the specific platform and chemistry used
  • These long read lengths are advantageous for sequencing larger contiguous regions of DNA, such as entire genes or small genomes
  • Long read lengths also facilitate the assembly of complex genomes and the identification of structural variations, such as insertions, deletions, and rearrangements

Limitations of Sanger sequencing

  • Despite its advantages, Sanger sequencing has some limitations that have led to the development of alternative sequencing technologies
  • The method has a relatively low throughput, making it less suitable for large-scale sequencing projects
  • Sanger sequencing also has a higher cost per base compared to next-generation sequencing technologies, which can be a limiting factor for some applications

Low throughput

  • Sanger sequencing has a lower throughput compared to next-generation sequencing (NGS) technologies
  • The method typically generates hundreds to a few thousand reads per run, depending on the specific platform and setup
  • This low throughput makes Sanger sequencing less suitable for large-scale sequencing projects, such as whole-genome sequencing or transcriptome analysis, which require millions to billions of reads
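A back-of-the-envelope calculation shows why throughput matters. Coverage is the number of reads times the read length, divided by the target size, so the reads needed follow directly. The genome size, read length, and reads-per-run figures below are illustrative assumptions, not values taken from any specific platform.

```python
import math


def reads_needed(target_size_bp: int, read_length_bp: int, coverage: float) -> int:
    """Smallest N with N * read_length / target_size >= coverage."""
    return math.ceil(coverage * target_size_bp / read_length_bp)


def runs_needed(total_reads: int, reads_per_run: int) -> int:
    """Number of instrument runs required to produce total_reads."""
    return math.ceil(total_reads / reads_per_run)


# Assumed values: a 3 Gbp genome, 800 bp reads, 96 reads per run.
reads_1x = reads_needed(3_000_000_000, 800, 1.0)
print(reads_1x)                   # prints 3750000
print(runs_needed(reads_1x, 96))  # prints 39063

# A 5 kbp target at 2x coverage needs only a handful of reads.
print(reads_needed(5_000, 800, 2.0))  # prints 13
```

Under these assumptions, even single-fold coverage of a large genome takes tens of thousands of runs, while a single gene fits comfortably in one.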

High cost per base

  • Sanger sequencing has a higher cost per base compared to NGS technologies
  • The cost of Sanger sequencing is influenced by factors such as the cost of reagents, labor, and equipment maintenance
  • While the cost of Sanger sequencing has decreased over time, it remains higher than that of NGS methods, particularly for large-scale projects

Sanger sequencing vs next-generation sequencing

  • Sanger sequencing and next-generation sequencing (NGS) are two distinct approaches to DNA sequencing, each with its own strengths and limitations
  • The two methods differ in their underlying technologies, throughput, cost, and applications
  • Understanding the differences between Sanger sequencing and NGS is important for selecting the most appropriate method for a given research question or application

Differences in technology

  • Sanger sequencing relies on the principle of dideoxy chain termination and the separation of DNA fragments by size, while NGS technologies employ various strategies for massively parallel sequencing
  • Common NGS technologies include Illumina (sequencing by synthesis), Ion Torrent (semiconductor sequencing), and Pacific Biosciences (single-molecule real-time sequencing)
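  • Short-read NGS platforms such as Illumina and Ion Torrent typically generate shorter read lengths (100-600 bp) than Sanger sequencing but offer much higher throughput and lower cost per base, whereas single-molecule platforms such as Pacific Biosciences produce much longer reads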

Comparison of applications

  • Sanger sequencing is well-suited for targeted sequencing of specific regions, such as individual genes or small genomes, and for the validation of NGS results
  • NGS technologies are more appropriate for large-scale sequencing projects, such as whole-genome sequencing, exome sequencing, transcriptome analysis, and metagenomics
  • The choice between Sanger sequencing and NGS depends on factors such as the research question, the size of the target region, the required accuracy and depth of coverage, and the available budget and resources

Computational analysis of Sanger sequencing data

  • The analysis of Sanger sequencing data involves several computational steps to convert the raw data into a high-quality DNA sequence
  • Key steps in the analysis pipeline include base calling, quality score assessment, and sequence alignment and assembly
  • Computational tools and algorithms play a crucial role in ensuring the accuracy and reliability of the final sequencing results

Base calling algorithms

  • Base calling is the process of determining the identity of each nucleotide in the DNA sequence based on the fluorescent signals generated during the sequencing reaction
  • Various base calling algorithms have been developed, such as Phred, which assigns a quality score to each base call based on the probability of an error
  • Advanced base calling algorithms incorporate machine learning techniques to improve accuracy and handle complex signal patterns

Quality score assessment

  • Quality scores, such as Phred scores, provide a measure of the reliability of each base call in the DNA sequence
  • Higher quality scores indicate a lower probability of an error, while lower scores suggest a higher likelihood of an incorrect base call
  • Quality score assessment is essential for filtering out low-quality reads, trimming low-quality bases, and ensuring the overall accuracy of the final sequence
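Phred scores relate to error probability through Q = -10 * log10(P), so Q20 corresponds to a 1 in 100 chance of error and Q30 to 1 in 1,000. The sketch below converts between the two and trims low-quality bases from both ends of a read. The threshold of 20 and the function names are assumptions chosen for illustration; real trimming tools often use sliding windows instead.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to the probability of a wrong call."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def trim_low_quality_ends(seq: str, quals: list, min_q: int = 20):
    """Drop bases from both ends until a base with quality >= min_q is reached."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


# Hypothetical read with poor quality at both ends, as is typical for Sanger traces.
seq = "ACGTACGT"
quals = [5, 10, 30, 40, 40, 35, 12, 8]
print(trim_low_quality_ends(seq, quals))  # prints ('GTAC', [30, 40, 40, 35])
```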

Sequence alignment and assembly

  • Sequence alignment involves comparing the generated DNA sequence to a reference genome or other sequences to identify similarities and differences
  • Assembly is the process of merging overlapping sequence reads into larger contiguous sequences (contigs) to reconstruct the original DNA sequence
  • Computational tools, such as BLAST (Basic Local Alignment Search Tool) and CAP3 (Contig Assembly Program), are commonly used for sequence alignment and assembly, respectively
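The core idea of assembly, merging reads that overlap, can be shown with exact suffix-prefix matching. This is a deliberately simplified sketch and not the algorithm used by CAP3 or any other assembler: real tools tolerate mismatches, use quality scores, and handle reverse complements. The function names and reads are hypothetical, and min_overlap is assumed to be at least 1.

```python
def longest_overlap(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of left that equals a prefix of right."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3):
    """Merge two reads into one contig, or return None if they do not overlap."""
    size = longest_overlap(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# Two hypothetical reads sharing the 4-base overlap GTAC.
print(merge_reads("ATGCGTAC", "GTACCTGA"))  # prints ATGCGTACCTGA
```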

Applications of Sanger sequencing

  • Sanger sequencing has a wide range of applications in genomics research and clinical settings
  • The method is particularly useful for targeted sequencing of specific genes or regions of interest, as well as for the validation of results obtained from other sequencing technologies
  • Sanger sequencing also plays a crucial role in the identification of mutations and polymorphisms associated with genetic disorders and other phenotypes

Targeted gene sequencing

  • Sanger sequencing is commonly used for the targeted sequencing of specific genes or regions of interest
  • This approach is particularly useful for the identification of mutations associated with genetic disorders, such as cystic fibrosis or sickle cell anemia
  • Targeted gene sequencing using Sanger sequencing allows for the accurate and reliable determination of the DNA sequence of the region of interest, facilitating the diagnosis and management of genetic diseases

Validation of NGS results

  • Sanger sequencing is often used to validate results obtained from next-generation sequencing (NGS) experiments
  • NGS technologies can generate large amounts of data but may be prone to errors or biases, particularly in regions with low coverage or complex sequence features
  • Targeted Sanger sequencing of specific regions can provide an independent confirmation of the NGS results, increasing the confidence in the findings and reducing the risk of false positives or false negatives

Identification of mutations and polymorphisms

  • Sanger sequencing is a powerful tool for the identification of mutations and polymorphisms in DNA sequences
  • Mutations, such as single nucleotide variants (SNVs) and small insertions or deletions (indels), can be detected by comparing the generated sequence to a reference genome or wild-type sequence
  • Polymorphisms, such as single nucleotide polymorphisms (SNPs), can be identified by sequencing multiple individuals and comparing their sequences to identify variation within a population
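Comparing a sequenced region to a reference can be sketched as a position-by-position scan. The example handles only single nucleotide variants and assumes the two sequences are already aligned and of equal length; detecting indels requires a proper alignment step first. The function name and sequences are hypothetical.

```python
def find_snvs(reference: str, sample: str) -> list:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    pairs = zip(reference.upper(), sample.upper())
    return [(pos, ref, alt) for pos, (ref, alt) in enumerate(pairs, start=1) if ref != alt]


# Hypothetical reference and sample differing at position 5.
print(find_snvs("ATGGTGCAC", "ATGGAGCAC"))  # prints [(5, 'T', 'A')]
```

Running the same comparison across reads from many individuals gives the per-position variation used to identify polymorphisms in a population.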

Automation and high-throughput Sanger sequencing

  • Advancements in automation and high-throughput technologies have greatly enhanced the efficiency and scalability of Sanger sequencing
  • Capillary electrophoresis has replaced traditional gel-based methods for the separation of DNA fragments, enabling faster and more automated sequencing
  • Multiplexing strategies have been developed to increase the throughput of Sanger sequencing by allowing multiple samples to be sequenced simultaneously

Capillary electrophoresis

  • Capillary electrophoresis (CE) has become the standard method for separating DNA fragments in modern Sanger sequencing
  • CE uses a thin capillary filled with a polymer matrix to separate the fragments based on their size, with smaller fragments migrating faster than larger ones
  • Automated CE systems, such as the Applied Biosystems 3730xl DNA Analyzer, run 96 capillaries in parallel and accept 96- or 384-well plates, greatly increasing the throughput of Sanger sequencing

Multiplexing strategies

  • Multiplexing strategies have been developed to further increase the throughput of Sanger sequencing by allowing multiple samples to be sequenced in a single run
  • One common approach is to use different fluorescent labels for each sample, enabling the simultaneous sequencing and detection of multiple DNA templates
  • Another strategy is to use barcodes or unique identifiers ligated to each DNA template, which can be used to demultiplex the sequencing data and assign the reads to their respective samples

Future of Sanger sequencing

  • Despite the advent of next-generation sequencing technologies, Sanger sequencing remains an important tool in genomics research and is likely to continue to play a significant role in the future
  • The integration of Sanger sequencing with other sequencing technologies, such as NGS and third-generation sequencing methods, can provide a more comprehensive and accurate view of genome sequences
  • Sanger sequencing is expected to maintain its relevance in specific applications, such as targeted sequencing, validation studies, and the characterization of complex genomic regions

Integration with other sequencing technologies

  • The integration of Sanger sequencing with other sequencing technologies can leverage the strengths of each method to provide a more complete and accurate picture of genome sequences
  • For example, NGS can be used to generate high-throughput data for whole-genome or transcriptome analysis, while Sanger sequencing can be employed to validate specific regions or resolve complex sequence features
  • Third-generation sequencing technologies, such as Pacific Biosciences and Oxford Nanopore, can generate ultra-long reads that can be used to scaffold and improve the assembly of genomes sequenced using NGS and Sanger sequencing

Continued relevance in genomics research

  • Sanger sequencing is expected to maintain its relevance in specific applications within genomics research
  • The high accuracy and reliability of Sanger sequencing make it well-suited for the validation of variants identified through NGS or other high-throughput methods
  • Sanger sequencing will likely continue to be the method of choice for targeted sequencing of specific genes or regions, particularly in clinical settings where accuracy is paramount
  • The ability of Sanger sequencing to generate long reads will remain valuable for the characterization of complex genomic regions, such as repetitive elements or structural variations

Key Terms to Review (19)

Automated sequencers: Automated sequencers are advanced laboratory instruments designed to rapidly determine the order of nucleotides in DNA or RNA samples. These machines streamline the sequencing process by using fluorescent dye terminators and capillary electrophoresis, allowing for high-throughput sequencing, reduced manual labor, and increased accuracy compared to traditional methods. This technology plays a vital role in molecular biology and genomics, particularly in the context of Sanger sequencing.
Base-calling accuracy: Base-calling accuracy refers to the precision with which individual nucleotide bases are identified and recorded from DNA sequencing data. In the context of Sanger sequencing, this accuracy is crucial since it directly affects the reliability of the sequence data generated. High base-calling accuracy ensures that the resulting genetic information is trustworthy and can be used confidently in various applications, including genetic research and diagnostics.
Capillary electrophoresis: Capillary electrophoresis is a technique used to separate charged particles, such as DNA fragments, based on their size and charge, within a narrow capillary tube filled with an electrolyte solution or, for DNA sizing, a polymer matrix. This method offers high resolution and speed, making it especially useful for analyzing small amounts of genetic material, which is critical in sequencing applications like Sanger sequencing.
Chain termination method: The chain termination method, also known as Sanger sequencing, is a technique used to determine the nucleotide sequence of DNA. It involves the incorporation of modified nucleotides that halt DNA synthesis at specific points, creating fragments of varying lengths that can be analyzed to reveal the original sequence. This method revolutionized molecular biology by enabling accurate and efficient sequencing of genetic material.
Clean-up Procedures: Clean-up procedures refer to the steps taken after the Sanger sequencing reaction to remove unwanted components from the reaction mixture, ensuring the purity and integrity of the DNA fragments for subsequent analysis. These procedures are crucial because they eliminate residual enzymes, unincorporated nucleotides, and other contaminants that could interfere with downstream steps such as electrophoretic separation and detection. Proper clean-up enhances the reliability of results and is essential for accurate data interpretation.
Contamination: Contamination refers to the unwanted introduction of extraneous biological material, such as DNA or other nucleic acids, into a sample during various stages of genomic analysis. This can lead to inaccurate results in sequencing and analysis, which is critical in methods like Sanger sequencing and across different sequencing platforms and instrumentation. The integrity of genomic data is heavily reliant on minimizing contamination to ensure that results reflect the true genetic information present in the sample.
Dideoxynucleotide: A dideoxynucleotide is a type of nucleotide that lacks the 3' hydroxyl (-OH) group, which is essential for DNA strand elongation. This structural difference prevents further nucleotides from being added during DNA synthesis, making dideoxynucleotides crucial in chain-terminating applications such as DNA sequencing. Their incorporation into a growing DNA strand leads to termination, allowing researchers to determine the sequence of nucleotides in a DNA molecule.
DNA Fragment Analysis: DNA fragment analysis is a technique used to assess the size and quantity of DNA fragments generated during various molecular biology processes. This method allows researchers to separate and visualize DNA fragments based on their length, which is crucial for applications like sequencing, genotyping, and mutation detection. It often involves gel electrophoresis or capillary electrophoresis to achieve high-resolution separation of the fragments.
DNA Polymerase: DNA polymerase is an enzyme that plays a crucial role in DNA replication by synthesizing new DNA strands complementary to the template strands. This enzyme adds nucleotides to the growing DNA chain in a sequence-specific manner, ensuring that the genetic information is accurately copied. DNA polymerases are essential for cellular processes such as DNA repair and the replication of DNA during cell division.
Double-stranded sequencing: Double-stranded sequencing is a method used to determine the nucleotide sequence of both strands of a DNA molecule simultaneously. This approach provides more accurate and complete genetic information by analyzing both complementary strands, which helps identify variations and mutations that might be present in the DNA. The technique is particularly useful for resolving ambiguities that can arise when sequencing single strands, enhancing the reliability of genomic studies.
Electrophoresis: Electrophoresis is a laboratory technique used to separate charged particles, like DNA or proteins, based on their size and charge by applying an electric field. This method allows for the visualization and analysis of biomolecules, which is crucial for various applications, including genetic analysis and protein characterization. The separation process occurs as molecules move through a gel matrix, with smaller fragments migrating faster than larger ones, thus facilitating the identification of specific nucleic acid sequences or proteins.
Fluorescent dye terminator sequencing: Fluorescent dye terminator sequencing is a method of DNA sequencing that uses fluorescently labeled dideoxynucleotides to terminate DNA strand elongation at specific bases. This technique allows for the simultaneous detection of multiple fluorescent dyes, enabling the identification of the DNA sequence in a highly efficient manner. By incorporating different dyes for each of the four nucleotides, this method enhances the accuracy and speed of sequencing compared to traditional methods.
Frederick Sanger: Frederick Sanger was a British biochemist renowned for his pioneering work in the field of DNA sequencing. He developed the Sanger sequencing method, a revolutionary technique that allowed for the determination of nucleotide sequences in DNA, forming the backbone of modern genomics. His contributions significantly advanced the field and paved the way for the development of various sequencing platforms and instrumentation used today.
Genomic sequencing: Genomic sequencing is the process of determining the complete DNA sequence of an organism's genome, which includes all of its genetic material. This technique provides crucial insights into the structure, function, and evolution of genomes, and it can help identify genetic variations that may contribute to diseases or traits. By revealing the complete genetic blueprint, genomic sequencing has transformative applications in medicine, biology, and biotechnology.
Primer: A primer is a short, single-stranded nucleic acid sequence that serves as a starting point for DNA synthesis during the process of DNA replication or amplification. In Sanger sequencing, primers are essential for providing a complementary sequence to which DNA polymerase can attach and begin adding nucleotides, allowing for the accurate sequencing of DNA strands.
Reaction Conditions: Reaction conditions refer to the specific environmental factors that influence a chemical reaction, including temperature, pH, concentration, and the presence of catalysts or inhibitors. These conditions are crucial in determining the efficiency and outcome of processes like DNA sequencing, particularly in methodologies such as Sanger sequencing, where optimal conditions lead to accurate and efficient results.
Sequence Alignment: Sequence alignment is a method used to identify similarities and differences between biological sequences, such as DNA, RNA, or protein sequences. This technique is crucial in various areas of genomics and bioinformatics, as it helps researchers understand evolutionary relationships, functional similarities, and structural characteristics among sequences.
Template DNA: Template DNA is the single-stranded DNA molecule that serves as a blueprint during the process of DNA replication and sequencing. In Sanger sequencing, the template DNA is essential because it provides the sequence of nucleotides that will be copied to create complementary strands. This process involves synthesizing new DNA strands by adding nucleotides complementary to the bases on the template strand, ultimately allowing for the determination of the DNA sequence.
Walter Gilbert: Walter Gilbert is an American biochemist and molecular biologist known for his pioneering work in DNA sequencing and genetics. With Allan Maxam, he developed the Maxam-Gilbert chemical cleavage method of DNA sequencing, an alternative to Sanger's chain termination method, and he shared the 1980 Nobel Prize in Chemistry with Frederick Sanger and Paul Berg.
© 2024 Fiveable Inc. All rights reserved.