Ever since science became a formal discipline some five centuries ago, academic research — a fundamental driver of innovation — has, on and off, seemed broken: Scientists have cranked out too many incremental advances, fallen behind on the best research in their field and produced unreplicable work.
- Axios' Kaveh Waddell writes: Now, some are again rethinking the process, hoping that artificial intelligence could be the long-sought highway to faster and more reliable scientific discovery.
Why it matters: The U.S. government spends billions on academic research each year, and companies toss in billions more. Yet science can appear to be treading water, turning out breakthroughs on a similar scale as when funding was lower and there were fewer researchers.
One problem: Higher funding, faster computers and far more data have combined to leave researchers spending much of their time sorting through a relentless avalanche of scholarship.
- They can't read everything that is out there or attend every conference. It’s easy to miss a solution that’s already borne fruit in another field, or even an adjacent sub-discipline.
- To connect the dots and find the best possible research path, they can only hope that they have read the right articles or heard the right speaker.
- "We need automatic techniques to see what’s missing," said Hannaneh Hajishirzi, an AI expert at the University of Washington.
Language is the core of the problem. Papers are ostensibly written for other scientists to read and understand, but the sheer volume of information means the scientists are in serious need of help.
The answer, some think, is simply to do a better job of sorting, cataloging and assessing papers as they are published.
- We’ve reported on efforts to monitor social media activity to boost the best papers — but the next step is to engage with the text itself.
- Several databases already link papers based on citations. Now, some are using natural language processing to extract actual meaning from research — a remarkably difficult task.
A first step is to automatically check facts and compare results against previous work.
- Scite, a new website that catalogs academic papers, uses machine learning to understand the context in which research is cited. For each paper, Scite lists other work that mentions it neutrally, supports it or contradicts its findings.
- Companies are also using language understanding in the laborious peer review process that precedes publication, reports Douglas Heaven for Nature.
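To make the three citation categories concrete, here is a toy sketch in Python. It is not Scite's method, which is described above only as machine learning; the cue phrases and the function name `classify_citation` are assumptions made purely for illustration.

```python
# Toy cue-phrase classifier for citation context.
# Hypothetical: the cue lists are illustrative assumptions, not Scite's model.
SUPPORT_CUES = ("consistent with", "confirms", "in agreement with", "replicates")
CONTRADICT_CUES = ("in contrast to", "contradicts", "fails to replicate", "inconsistent with")


def classify_citation(sentence: str) -> str:
    """Label a citing sentence as 'supports', 'contradicts' or 'mentions'."""
    text = sentence.lower()
    # Check contradiction cues first so that "inconsistent with" is not
    # mistaken for the supporting cue "consistent with".
    if any(cue in text for cue in CONTRADICT_CUES):
        return "contradicts"
    if any(cue in text for cue in SUPPORT_CUES):
        return "supports"
    # No cue found: treat the citation as a neutral mention.
    return "mentions"
```

A fixed phrase list like this breaks down quickly on real papers, which is part of why extracting meaning from research text is described as remarkably difficult.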
This is the tip of the arrowhead.
- Scientists imagine a future where research results are fed into a unified database that is constantly being updated with the latest work.
- Rather than printing numbers in a table, results would go straight into this database — formatted for computers, not people, to read — and immediately be checked against other researchers’ findings.
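What "formatted for computers" and "checked against other researchers' findings" might look like can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Result` schema, the function names, and the rule that two measurements conflict when they differ by more than `k` combined standard deviations. No such unified database is described as existing today.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Result:
    """One machine-readable measurement (hypothetical schema)."""
    paper_id: str
    quantity: str       # what was measured, e.g. "melting_point_C"
    value: float
    uncertainty: float  # one standard deviation, same units as value


def conflicts(new: Result, prior: Result, k: float = 2.0) -> bool:
    """True if two measurements of the same quantity differ by more than
    k combined standard deviations (an assumed, simplistic rule)."""
    if new.quantity != prior.quantity:
        return False
    combined = (new.uncertainty ** 2 + prior.uncertainty ** 2) ** 0.5
    return abs(new.value - prior.value) > k * combined


def check_against(new: Result, database: List[Result]) -> List[str]:
    """Return the paper IDs of prior results that conflict with the new one."""
    return [r.paper_id for r in database if conflicts(new, r)]
```

In this sketch, submitting a new `Result` would immediately return the IDs of earlier papers whose numbers disagree with it, rather than leaving a reader to spot the discrepancy in a printed table.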
"The model of referring to a text-based paper for the purpose of communicating experimental results will probably disappear."— Robert Murphy, professor of computational biology, Carnegie Mellon
But, but, but: This automated utopia is a long way off. Natural language processing is still hard for computers, and a system trained to understand papers in a particular field might fail when reading another field’s work.