
Illustration: Aïda Amer/Axios
AI is speeding up the discovery of the structure of proteins that drive biological processes across organisms.
Why it matters: If researchers can predict what shape a protein will take, they can better understand how it works — and potentially target medicines for proteins that cause disease or create antibiotics that can disable resistant bacteria's proteins.
The big picture: Determining the structure of proteins is typically done through painstaking experiments that involve crystallizing proteins and analyzing them with X-rays. That approach has yielded the shapes of only a small fraction of human proteins.
- But new machine-learning systems like AlphaFold and RoseTTAFold are making a "once in a generation" advance and have greatly sped up the discovery process.
- "If you really want to understand how biology works at the molecular level — and that is really where it works, with little machines interacting with each other — you need to know the shape of the protein molecules," said John Moult, a structural biologist at the University of Maryland, Shady Grove.
- "For a long time we've known the sequences at the DNA level of all the genes, and those sequences dictate what these shapes are. But the way in which you can get from 'here's the sequence' to 'what's the detailed shape' has been an outstanding computational problem for more than 50 years," Moult told Axios.
- Moult is also co-founder of the Critical Assessment of Structure Prediction, or CASP, a challenge that has run for 25 years to test modeling programs that predict protein structures.
The latest: Google DeepMind's AlphaFold2 system is able to issue a "good prediction" of protein structures about 95% of the time, scientists said at a joint press conference announcing a collaboration between DeepMind and the European Molecular Biology Laboratory (EMBL).
- Using AlphaFold2, scientists were able to generate 3D models of 350,000 proteins, 36% of which were predicted with "high confidence," they said. The researchers have made these structures openly available via the AlphaFold database in an "effort to move the science forward," said DeepMind's Demis Hassabis, co-author of the paper in Nature.
- Another co-author, Centre for Enzyme Innovation director John McGeehan, said his team had DeepMind test seven enzymes from their experiments on breaking down plastics, two of which already had experimentally determined structures. DeepMind quickly confirmed those two structures and provided further information on the others.
- "It's one of those moments, to be honest, where the hair stood up at the back of my neck," McGeehan said. "The structures that [DeepMind] produced were identical to our crystal structures. In fact, they contained even more information than the crystal structures were able to provide [using traditional methods]. ... The acceleration to our project is multiple years."
The interpretation of rare genetic mutations is another area AI is expected to target, said EMBL deputy director and co-author Ewan Birney.
- "This is a very practical problem in clinical genetics, where you have a suspected series of mutations or changes in an affected child, and you want to try and work out which one is likely to be the reason why your child has got a particular genetic disease."
Context: The AlphaFold2 findings follow last week's announcement that the University of Washington created a neural network, RoseTTAFold, to determine protein structures and published several openly via GitHub.
What's next: "This is just the first clear demonstration of the power of AI in biology," Moult said.
- "There are obvious next things that are likely to happen, to do with how proteins interact and drug development."
- "Our understanding of protein structures and associated biology will take a big leap forward as people build on these resources," Moult said.