May 3, 2024 - Science

Exclusive: Inside the AI research boom

A bar chart shows the top AI research areas based on the number of research articles as of May 1, 2024. 'Large language models' leads with 14,264 articles, followed by 'Distributed machine learning' with 10,250.
Data: Emerging Technology Observatory Map of Science; Chart: Axios Visuals

China leads the U.S. as a top producer of research in more than half of AI's hottest fields, according to new data from Georgetown University's Center for Security and Emerging Technology (CSET) shared first with Axios.

Why it matters: The findings reveal important nuances about the global race between the U.S. and China to lead AI advances and set crucial standards for the technology and how it is used around the world.

Key findings: CSET's Emerging Technology Observatory team found global AI research more than doubled between 2017 and 2022.

  • Roughly 32% of AI research focused on computer vision, which grew 121% in those five years.
  • Natural language processing — what large language models do in ChatGPT and other generative AI tools — accounted for another 11% of AI papers and grew 104%.

Research in robotics grew more slowly than research in vision and natural language processing, by just 54%, and made up about 15% of all AI research.

  • That tracks, anecdotally, with the fact that "a lot of the topics open in robotics have proven really hard to fix," Zachary Arnold, the analytic lead for the CSET team, said. "At the same time, there has been very rapid progress in language tasks, for example."
  • And AI safety research made up just 2% of all research, despite growing 315% between 2017 and 2022.

What they're saying: "The fact that research is growing so quickly, in so many directions, underscores the need for federal investment in basic measurement evaluation on the scientific techniques we need to ensure that AI getting deployed in the real world is safe, secure and understandable," said Arnold. But appropriations for the National Institute of Standards and Technology, which is tasked with identifying those measurements, were recently cut.

The big picture: By sheer number of AI research papers, the world's top five producers are all Chinese institutions, led by the Chinese Academy of Sciences.

  • The dominant narrative for years has been that while Chinese institutions generated the greatest quantity of papers, the quality of those papers wasn't as high and research in the country largely came from applying fundamental advances made by researchers in the U.S., Europe and elsewhere.
  • But when CSET researchers narrowed their analysis to highly cited papers, the Chinese Academy of Sciences was still the leader. Google is second, followed by China's Tsinghua University, Stanford and then MIT.

Yes, but: At the country level, the U.S. had the top spot in producing highly cited articles.

"China is absolutely a world leader in AI research, and in many areas, likely the world leader," Arnold said, adding the country is active across a range of research areas, including increasingly fundamental research.

  • The U.S. still has an edge on China in natural language processing. Google and Microsoft were the top organizations in this cluster of research.
  • But researchers in China produce more papers on computer vision than those in any other country. Tsinghua University was the top organization in the world on this topic. China's strategic priorities for AI include autonomous vehicles, manufacturing, surveillance and other applications that require advances in computer vision.
  • India, along with three Indian institutions including Chitkara University, was the top producer of research on AI applications for plant disease detection.

Caveat: The data only accounts for research papers published in English, and doesn't capture scientific work in other languages.

How it works: CSET's Map of Science groups together articles that cite each other often, because they have topics or concepts in common, into clusters of research. (It doesn't mean all papers on LLMs, for example, are in the top cluster. Some may appear in other clusters.)

  • The CSET team defined "top" clusters as those with a large number of papers (at least 2,000 new articles in the last five years) that are growing fast (they have a higher share of recently published articles than 90% or more of the other clusters).
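
A minimal sketch of how the two "top" criteria above could be applied, for readers who want to see the logic spelled out. This is not CSET's code: the `Cluster` fields, the function name and the exact way "recently published share" is measured are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    # All field names are hypothetical, chosen for this illustration.
    name: str
    new_articles_5y: int   # articles added to the cluster in the last five years
    recent_share: float    # fraction of the cluster's articles that are recent

MIN_NEW_ARTICLES = 2000        # "at least 2,000 new articles in the last five years"
MIN_FRACTION_OUTPACED = 0.90   # higher recent share than "90% or more of the other clusters"

def top_clusters(clusters):
    """Return names of clusters that are both large and fast-growing."""
    top = []
    for c in clusters:
        others = [o for o in clusters if o is not c]
        if not others:
            continue  # nothing to compare against
        # Count the other clusters whose recent share is strictly lower.
        outpaced = sum(o.recent_share < c.recent_share for o in others)
        is_large = c.new_articles_5y >= MIN_NEW_ARTICLES
        is_fast = outpaced / len(others) >= MIN_FRACTION_OUTPACED
        if is_large and is_fast:
            top.append(c.name)
    return top
```

Under this reading, a cluster with a very high share of recent papers but only a few hundred new articles would not count as "top," and neither would a large cluster growing at an average pace.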