New and old machine-learning tools are helping scientists sift through the flood of research produced on COVID-19.
Why it matters: The coronavirus pandemic has led to an unprecedented wave of scientific publications on every aspect of the virus and potential treatments.
- But without better tools to pick out meaningful research, it's too easy to miss the science that matters — or be misled by headline-grabbing papers that turn out to be wrong.
By the numbers: According to the COVID-19 Primer, a public dashboard created by the machine-learning company Primer AI, researchers had published 27,569 papers about COVID-19 as of June 17.
- Of those, 21,000 went through the scientific peer review process, meaning that experts in the field have examined the publications and researchers have more confidence in the results. The remaining 6,569 are what are known as preprint papers, meaning they were released to the public before peer review.
- "The volume is so great that what's being published on COVID is equal to all the other research that is usually being put on infectious disease as a whole," says Uri Blackman, CEO of GIDEON Informatics, which has put out medical databases since 1992.
- It is impossible for a human being to keep up with all that science. Even if it only took an average of 15 minutes to read each paper, getting through all of them would require 287 days of nonstop reading, or more than 100 days longer than the outbreak itself.
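The reading-time figure above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the dashboard's June 17 paper count and the 15-minute average reading time used in the text; the function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the reading-time estimate.
PAPERS = 27_569          # COVID-19 papers counted as of June 17
MINUTES_PER_PAPER = 15   # assumed average reading time per paper


def nonstop_reading_days(papers: int, minutes_per_paper: float) -> float:
    """Days of around-the-clock reading needed to get through every paper."""
    total_minutes = papers * minutes_per_paper
    return total_minutes / (60 * 24)  # 60 minutes per hour, 24 hours per day


days = nonstop_reading_days(PAPERS, MINUTES_PER_PAPER)
print(f"{days:.0f} days of nonstop reading")
```

The estimate assumes no breaks at all, so a realistic reading schedule would take far longer.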
How it works: That's where machine-learning tools come in.
- While current AI can't understand a scientific paper in the way that a trained human being can, it is capable of categorizing and ranking papers in a way that reveals useful patterns. And it can do so much more quickly than any human being can.
- The COVID-19 Primer takes in news articles and social media interactions that reference papers and their authors, which allows a user to quickly see which papers are getting the most attention from experts and average people alike.
- Papers can also be sorted through research categories like "patient and medical care" or "forecasts and modeling," and the algorithms behind the tool can also identify topics of interest that emerge from the entire corpus of research.
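The kind of ordering and sorting described above can be illustrated with a toy example. This is only a sketch, not Primer AI's actual method: the `Paper` record, the `attention_score` formula (a plain sum of mentions), and the sample numbers are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    category: str         # e.g. "forecasts and modeling"
    news_mentions: int    # hypothetical count of news articles citing the paper
    social_mentions: int  # hypothetical count of social media posts citing it


def attention_score(paper: Paper) -> int:
    """Assumed score: total mentions across news and social media."""
    return paper.news_mentions + paper.social_mentions


def rank_by_attention(papers: list[Paper]) -> list[Paper]:
    """Order papers from most to least discussed."""
    return sorted(papers, key=attention_score, reverse=True)


def group_by_category(papers: list[Paper]) -> dict[str, list[Paper]]:
    """Bucket papers by research category, most discussed first in each bucket."""
    groups: dict[str, list[Paper]] = defaultdict(list)
    for paper in rank_by_attention(papers):
        groups[paper.category].append(paper)
    return dict(groups)


# Made-up sample records, for illustration only.
papers = [
    Paper("Paper A", "patient and medical care", 12, 340),
    Paper("Paper B", "forecasts and modeling", 45, 1200),
    Paper("Paper C", "forecasts and modeling", 3, 80),
]

for paper in rank_by_attention(papers):
    print(paper.title, attention_score(paper))
```

A score like this measures only how much a paper is being talked about, not whether its findings hold up.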
The catch: There are still major limits to AI's ability to actually comprehend written text of any kind, including scientific publications. That means the COVID-19 Primer and similar tools can't tell you the real value of any individual paper, just how it's being received in the news and on social media.
- In fact, says Amy Heineike of Primer AI, "the most shared papers are the most controversial ones." She notes that the second-most-shared paper since the start of the outbreak was a preprint publication from the end of January arguing the novel coronavirus appeared to contain genes from HIV, with the implication that the virus was deliberately engineered.
- The paper, which like all preprints wasn't peer reviewed, was quickly torn apart by experts online and was withdrawn by its authors just a few days after it was posted. But 20% of the tweets about the paper were posted in the days and weeks after its withdrawal, which many online saw as evidence of a conspiracy to suppress its conclusions.
The bottom line: Science is moving faster than ever on COVID-19, and AI can be a valuable tool for making sense of it all. But machines can't yet replace human judgment.