
Illustration: Sarah Grillo/Axios
Scientists have now drafted a complete version of the human genome sequence — but the job of deciphering our DNA has only just begun.
Why it matters: The bulk of the human genome is noncoding regions, some of which play an important role in how genes are expressed. New tools are allowing scientists to test exactly how these elements — once called "junk DNA" — work, which could lead to new drug targets.
Driving the news: A team of 99 scientists completed the human genome sequence last week, using new technologies to fill in gaps in the draft sequence published 20 years ago.
- They reported the human genome is 3.05 billion base pairs long and consists of 19,969 protein-coding genes, including more than 100 newly deciphered genes that can likely produce proteins.
- The completed genome sequence also now includes 189 million base pairs of large swathes of highly repetitive DNA that doesn't encode genes, including 5.5 million base pairs of newly discovered DNA repeats.
- These and other stretches of noncoding DNA have been ignored over the past two decades, says Karen Miga, a genomics researcher at the University of California, Santa Cruz, who co-founded the consortium of scientists that completed the sequence.
- While genome sequencing is "definitely important for the etiology of cancer, it really is only the tip of the iceberg," says Andrew Hsieh, a physician-scientist at Fred Hutchinson Cancer Research Center.
The big picture: Scientists have known for decades that the bulk of the human genome — roughly 98% — doesn't encode the genes for proteins that power biology and also underlie disease when their function is altered.
- Instead, some of these noncoding regions likely regulate whether genes are on or off or alter their activity — roles that researchers are now trying to test.
- They're also trying to understand how these elements vary between individuals, how they guide the course of a disease and which treatments might be effective.
Zoom in: Tools like the gene editor CRISPR are allowing researchers to test the roles of noncoding DNA in specific diseases or disorders.
1) The course of a cancer or the effectiveness of treatment for it may be affected by different types of noncoding RNA.
- Using CRISPR and Plumage, an assay to determine if cancer-associated mutations in noncoding regions have functional consequences, Hsieh and his colleagues found variations within the noncoding region can co-opt gene expression and benefit cancer growth.
- About 35% of the mutations they tested in human patient tissue samples were functional, either increasing the expression of oncogenic genes that may cause cancers to grow or decreasing the expression of tumor-suppressive genes, Hsieh says.
- Mutations in regulatory parts of the genome may also play a role in cancers like basal cell carcinoma.
2) Limb and head formation during human development may also be guided by the noncoding genome.
- The Pierre Robin sequence, which can affect a person's jaw, tongue and airways and sometimes cause a cleft palate, is suspected to be at least partially due to changes in DNA near the SOX9 gene.
- Severe congenital limb malformation may be caused by the modification of a long noncoding RNA.
3) A new therapy for sickle cell disease and beta-thalassemia, which affect hemoglobin in red blood cells, acts on a noncoding part of the genome.
- Neville Sanjana, who studies genomics at New York University and the New York Genome Center, and his colleagues identified a region involved in repressing the production of fetal hemoglobin, which the body typically stops making just after birth.
- The gene-editing therapy based on that discovery targets that region so fetal hemoglobin is produced and can make up for the defective hemoglobin produced in patients with sickle cell and the protein shortfall in people with beta-thalassemia.
- Sanjana's team also used CRISPR to knock out every gene in lung cells and search for ways to block infection with the SARS-CoV-2 virus, per Genetic Engineering and Biotechnology News.
The challenge: With coding regions of the genome, scientists can see the effects of changing a base pair of DNA in the protein it forms.
- "In noncoding regions, we don’t have a similar Rosetta Stone to translate," Sanjana tells Axios.
- That's why tools like CRISPR are necessary, but he says they'll need to be improved and combined with single-cell technologies to screen the massive noncoding genome and understand the network of genes it regulates.
What to watch: Gene editing can cause unwanted edits, or "off-target effects," often in the noncoding portion of the genome.
- Hsieh says those effects can't be assumed harmless in the long run, which is another reason it's important to determine the functions of the noncoding regions.
The bottom line: "We need to understand how these regions change or vary between individuals in the human population, and how that organization influences genome regulation and function," Miga says.