To Tackle the Accuracy Problem in Next-Generation Sequencing

TAIPEI, TAIWAN, Dec. 30, 2021 - Last time, we discussed the challenge of accuracy in next-generation sequencing (NGS). Improving results in difficult regions of the genome (e.g., repetitive or highly polymorphic regions) can improve patient outcomes. However, difficult regions that contain clinically important variants are not covered by standard NGS strategies. To tackle this issue, scientists and research institutions have formulated strict regulations and guidelines for quality control of procedures, library preparation, and data analysis.

Errors in Library Preparation and Sequencing

The first step in a clinical sequencing assay is library preparation. During library preparation, DNA molecules are fragmented, ligated to adapters suitable for the particular sequencer used, size selected, and amplified using PCR to build a “cluster” of clones. Many enzymatic steps within library construction protocols have the potential to introduce sample composition bias, but optimized tools are now available to mitigate these sample quality problems.

At the sequencing step, the clusters are mixed with color-labeled nucleotides and DNA polymerase. When a complementary nucleotide is added to a cluster, the corresponding color of light is emitted, and images are captured as this happens. Errors creep in when some templates get out of sync by missing an incorporation or by incorporating two or more nucleotides at once. To compensate, each location is sequenced many times; the number of times is called the read depth. For instance, 30x means that, on average, each position is sequenced 30 times. This is why NGS technology generates a huge amount of data.
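The read-depth arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the genome size (about 3.1 billion bases) and read length (150 bases) are assumed round figures, not values taken from any particular assay.

```python
import math


def mean_read_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each position is sequenced.

    Total sequenced bases divided by the number of positions in the genome.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def reads_needed(target_depth: int, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target average depth."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(target_depth * genome_size / read_length)


# Assumed illustrative values (not from the article):
GENOME_SIZE = 3_100_000_000  # approximate human genome size in bases
READ_LENGTH = 150            # a common short-read length in bases

n_reads = reads_needed(30, READ_LENGTH, GENOME_SIZE)
depth = mean_read_depth(n_reads, READ_LENGTH, GENOME_SIZE)
print(f"{n_reads:,} reads of {READ_LENGTH} bases give about {depth:.0f}x depth")
```

Under these assumptions, a 30x genome requires hundreds of millions of reads, which illustrates why the data volume grows so quickly.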

Value of Benchmarking Genomics Analysis Tools

Scientists use computational analyses to compare sequencing reads with a reference human genome. However, reads around large structural variants or large repeats are difficult to align to the reference, which makes it hard to identify single nucleotide variants (SNVs) and tandem repeats in those regions. The standard reference genome samples are provided by the Genome Reference Consortium (GRC), an international collective of academic and research institutes with expertise in genome mapping, sequencing, and informatics, formed to improve the representation of reference genomes. They are then standardized for benchmarking through the Genome in a Bottle Consortium (GIAB), a public-private-academic consortium hosted by NIST to develop the technical infrastructure (e.g., reference genome materials, reference standards, and methods). The Global Alliance for Genomics and Health (GA4GH) has also developed benchmarking tools to help establish analytical validity for different types of variants and for repetitive and non-repetitive regions.
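As a rough illustration of what benchmarking a variant caller involves, the sketch below compares a set of called variants against a truth set and reports precision, recall, and F1. It is a deliberately simplified assumption: it matches variants only by exact (chromosome, position, reference allele, alternate allele), whereas real benchmarking tools also handle differing variant representations, genotypes, and stratification by region. All names and the example variants are hypothetical.

```python
from typing import NamedTuple, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class BenchmarkResult(NamedTuple):
    tp: int           # called variants that are in the truth set
    fp: int           # called variants that are not in the truth set
    fn: int           # truth variants that were not called
    precision: float  # tp / (tp + fp)
    recall: float     # tp / (tp + fn)
    f1: float         # harmonic mean of precision and recall


def benchmark(truth: Set[Variant], calls: Set[Variant]) -> BenchmarkResult:
    """Compare called variants with a truth set using exact matching."""
    tp = len(truth & calls)
    fp = len(calls - truth)
    fn = len(truth - calls)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BenchmarkResult(tp, fp, fn, precision, recall, f1)


# Hypothetical toy data, for illustration only.
truth_set = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr2", 900, "T", "C"),
}
call_set = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr3", 42, "A", "T"),  # not in the truth set: a false positive
}

result = benchmark(truth_set, call_set)
print(result)
```

Running the same comparison separately on variants inside and outside difficult regions is one way to see how error rates differ between them.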

The US Food and Drug Administration (FDA) has recognized the need for innovation in regulatory science and has launched a series of precisionFDA community challenges that use GIAB data to benchmark algorithms. The latest challenge, held in 2020, showed higher error rates in the difficult regions covered by the new benchmarks, but new NGS technologies and algorithms are improving the characterization of these regions. All of these efforts help produce high-quality analysis reports for accurate diagnoses.

The Importance of Understanding the Pros and Cons of NGS

Overall, NGS technologies are more mature and less costly than long-read technologies. The accuracy of detecting small variants in non-repetitive regions of the human genome is high. It is crucial for physicians to understand both the strengths and the weaknesses of any particular sequencing test. From a cost point of view, NGS is still the best choice for clinical diagnosis. The search for methods that increase accuracy in difficult regions will continue, so that clinical sequencing assays can reach their full potential to detect important variants.


[1] Challenges of Accuracy in Germline Clinical Sequencing Data

[2] Genome In A Bottle

[3] About PrecisionFDA

About WASAI Technology Inc.

WASAI Technology's mission is to deliver acceleration technologies of High-Performance Data Analysis (HPDA) in future data centers for targeted vertical applications with massive volumes and high velocities of scientific data. To strengthen and advance scientific discovery and technological research via big data-intensive acceleration in high-performance computing, WASAI Technology aims to improve commercialization and commoditization of scientific and technological applications.


WASAI Technology Inc.

4F, No. 6, Zhiyuan 3rd Rd., Beitou Dist., Taipei 112025, Taiwan