Apr 8, 2021 - Science

How AI could revolutionize biology — and vice versa

Illustration: Sarah Grillo/Axios

Cutting-edge machine-learning techniques are increasingly being adapted and applied to biological data, including data related to COVID-19.

The big picture: Discovering and developing a new drug typically takes more than a decade and costs on average close to $1 billion, making it difficult to build a cache of pharmaceuticals to fight future pandemics or stop intractable diseases.

  • But scientists are combining two scientific leaps — in machine-learning algorithms and powerful biological imaging and sequencing tools — to try to spur progress in understanding diseases and advance AI itself.

What's happening: Last month, researchers reported using a new technique to figure out how genes are expressed in individual cells and how those cells interact in people who had died with Alzheimer's disease, allowing scientists to better understand the development of a condition that afflicts nearly 6 million Americans.

  • Machine-learning algorithms can also compare gene expression in cells infected with SARS-CoV-2 with gene expression in cells treated with thousands of different drugs to computationally predict which drugs might inhibit the virus, says Kris White, who studies antiviral treatments at the Icahn School of Medicine at Mount Sinai and is conducting as-yet-unpublished research that uses this approach.
  • Yes, but: Algorithmic results alone don't prove the drugs are potent enough to be clinically effective. But they can help identify future targets for antivirals or they could reveal a protein researchers didn't know was important for SARS-CoV-2, providing new insight on the biology of the virus, says White.
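
For illustration: one common way to frame this kind of comparison is to score each drug by how strongly its gene-expression changes run opposite to those caused by infection. The sketch below assumes that framing. It is not White's unpublished method, and the function names, drug names and numbers are all made up for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_drugs(infection_signature, drug_signatures):
    """Rank drugs from most negatively to most positively correlated.

    Each signature lists expression changes for the same genes in the
    same order. Under the assumption used here, a strongly negative
    correlation means the drug pushes expression in the opposite
    direction from infection.
    """
    scores = {
        name: pearson(infection_signature, signature)
        for name, signature in drug_signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical toy data: expression changes for four genes.
infection = [2.0, -1.5, 0.5, 1.0]
drugs = {
    "drug_A": [-2.0, 1.5, -0.5, -1.0],  # mirrors the infection signature
    "drug_B": [2.0, -1.5, 0.5, 1.0],    # mimics the infection signature
    "drug_C": [0.1, 0.3, -0.2, 0.0],    # only weakly related
}

ranking = rank_drugs(infection, drugs)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

In this toy data, drug_A ranks first because its changes mirror the infection signature. As White cautions, a ranking like this only nominates candidates; it says nothing about whether a drug is potent enough to work in patients.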

These and other advances are fueling centers and startups that apply AI to drug discovery, including a $400 million investment last month in the company insitro, which uses machine learning and genomics data to identify new drug candidates or existing ones that can be repurposed for treating neurodegenerative diseases.

  • They're not alone: At the end of 2019, 40% of 180 startups applying AI to drug discovery specifically focused on creating new drug candidates or repurposing existing drugs, according to Deloitte.
  • Machine learning is "going to really drastically change timelines on at least some parts of the process," insitro CEO Daphne Koller told the Financial Times.

Between the lines: AI's power lies in its ability to sift through data and find useful patterns far more quickly than humans could do alone. So far that's mostly been used to turbocharge movie, song and product recommendations on the web.

  • But "biology requires more than that," says Caroline Uhler, the co-director of the Eric and Wendy Schmidt Center, a $300 million initiative launched last month at the Broad Institute to focus on the intersection of machine learning and biology.
  • Predicting whether someone will click on an ad, for example, doesn't require understanding the interaction of all the contributing factors behind that decision, she says. But creating a drug to target a protein involved in a disease does require understanding how the genes that give rise to that protein are regulated.
  • "In biology, you really need to understand the underlying mechanisms," says Uhler.

One of the challenges is that biological data comes in many forms: DNA code, of course, but also the expression of genes, electrical signals, images and more. These are all captured at different points in time and then have to be stitched together to get a full picture of what is happening in a patient.

What to watch: As with AI's application in other fields, there are real concerns about bias in datasets used to build machine-learning models.

  • An overreliance on genomic data from people of a particular race, gender or geography can skew an algorithm's predictions for other people, raising concerns about misdiagnoses or unequal health care.
  • Methods that explain or interpret how models arrive at their results are being developed and can help reveal whether a model is making biased decisions so the bias can be addressed, says Jay Shah, a graduate student at Arizona State University and IEEE member who is developing deep-learning models to detect biological markers for Alzheimer's disease.

The intrigue: Much of the focus is on what AI can do for medicine and biology, but Uhler and her Broad Institute co-director Anthony Philippakis — who both have backgrounds in statistics — say the nuanced, complex biological data captured by imaging and sequencing could help to create powerful algorithms that capture cause and effect in a system.

  • That would represent a leap forward for AI, which remains best at identifying correlations, while leaving the question of cause to human scientists.
  • "Some of the specific problems we are working on relating to biology are stimulating us to advance the science of machine learning faster in some areas," Yoshua Bengio, an AI pioneer and professor at the University of Montreal who is collaborating with the Broad Institute researchers, said in an email, citing his work on taking algorithms that search for drug candidates and applying them to the discovery of new materials.

The bottom line: Melding AI and biology may not just be another tool for understanding medicine. Biology could be "a driver for the next generation of advances in machine learning," says Philippakis.