The first complete, gapless sequence of a human genome has been published in a groundbreaking development.
The sequence comes two decades after the Human Genome Project produced the first draft human genome sequence. According to the researchers, a complete, gap-free sequence of the roughly three billion bases in our DNA is critical for understanding the full spectrum of human genomic variation. Furthermore, it could revolutionise the world’s understanding of how genetics contributes to certain diseases.
The work was done by the Telomere to Telomere (T2T) Consortium, which included leadership from researchers at the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health; the University of California, Santa Cruz; and the University of Washington, Seattle. NHGRI was the primary funder of the study.
Exploring the knowledge of chromosomes
Analyses of the complete genome sequence will significantly add to the existing knowledge of chromosomes, opening new lines of research that draw on accurate maps for five chromosome arms. This will help answer basic biology questions about how chromosomes properly segregate and divide. The T2T consortium used the now-complete human genome sequence as a reference to discover more than two million additional variants in the human genome. These studies provide more accurate information about the genomic variants within 622 medically relevant genes.
“Generating a truly complete human genome sequence represents an incredible scientific achievement, providing the first comprehensive view of our DNA blueprint,” said Eric Green, MD, PhD, Director of NHGRI. “This foundational information will strengthen the many ongoing efforts to understand all the functional nuances of the human genome, which in turn will empower genetic studies of human disease.”
The complete genome sequence will provide valuable data for studies that focus on human genomic variation. A better understanding of how genetics contributes to certain diseases could, in turn, transform routine clinical care in the future.
Understanding the full spectrum of the human genome
The full sequence builds on the work of the Human Genome Project, which mapped around 92% of the genome. Thousands of researchers have developed better laboratory tools, computational methods, and strategic approaches for interpreting complex sequences.
The researchers generated the last 8% to complete the sequence. They used a special cell line that has identical copies of each chromosome, unlike most human cells, which carry two slightly different copies. The researchers noted that most of the newly added DNA sequences were near the repetitive telomeres (the long, trailing ends of each chromosome) and centromeres (the dense middle sections of each chromosome).
“Ever since we had the first draft human genome sequence, determining the exact sequence of complex genomic regions has been challenging,” said Evan Eichler, PhD, a researcher at the University of Washington’s School of Medicine, and T2T consortium co-chair. “I am thrilled that we got the job done. The complete blueprint is going to revolutionise the way we think about human genomic variation, disease and evolution.”
Transforming DNA sequencing
The cost of sequencing a human genome using ‘short-read’ technologies is only a few hundred dollars, having fallen significantly since the completion of the Human Genome Project. However, this technology still leaves some gaps in assembled genome sequences. The reduction in DNA sequencing costs follows innovation and investment in new DNA sequencing technologies that produce longer sequences.
Two state-of-the-art DNA sequencing approaches have emerged that produce longer sequence reads. The Oxford Nanopore DNA sequencing method can read up to one million DNA letters in a single read with modest accuracy, and the PacBio HiFi DNA sequencing method can read about 20,000 letters with nearly perfect accuracy. Researchers in the T2T consortium employed both DNA sequencing methods to generate the complete genome sequence.
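To put those read lengths in perspective, the short Python sketch below estimates the fewest reads that would be needed to tile a genome of roughly three billion bases once, end to end. It is a simplified illustration, not the consortium's method: it assumes every read reaches the stated length and that reads do not overlap, whereas real projects sequence each region many times over. The 150-base short-read length is an assumption added for comparison and does not come from the article, and the function name `min_reads_to_tile` is hypothetical.

```python
# Idealised illustration only: real sequencing uses overlapping reads
# at many-fold coverage, and read lengths vary widely.

GENOME_BASES = 3_000_000_000  # roughly three billion bases, as stated in the article

# Read lengths in bases. The two long-read figures come from the article;
# the short-read figure is an assumption for comparison only.
READ_LENGTHS = {
    "Short-read (assumed 150 bases)": 150,
    "PacBio HiFi (about 20,000 bases)": 20_000,
    "Oxford Nanopore (up to 1,000,000 bases)": 1_000_000,
}


def min_reads_to_tile(genome_bases: int, read_length: int) -> int:
    """Fewest non-overlapping reads needed to cover the genome once."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (genome_bases + read_length - 1) // read_length


if __name__ == "__main__":
    for name, length in READ_LENGTHS.items():
        reads = min_reads_to_tile(GENOME_BASES, length)
        print(f"{name}: {reads:,} reads")
```

Under these assumptions the script reports 20,000,000 short reads, 150,000 PacBio HiFi reads and 3,000 Oxford Nanopore reads. The fewer and longer the pieces, the fewer joins have to be inferred across repetitive stretches of DNA.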
“Using long-read methods, we have made breakthroughs in our understanding of the most difficult, repeat-rich parts of the human genome,” concluded Karen Miga, PhD, a co-chair of the T2T consortium whose research group at the University of California, Santa Cruz is funded by NHGRI. “This complete human genome sequence has already provided new insight into genome biology, and I look forward to the next decade of discoveries about these newly revealed regions.”