Axios Science

December 07, 2023
Thanks for reading Axios Science. This edition is 1,543 words, about a 6-minute read.
- Send your feedback and ideas to me at [email protected].
- Sign up here to receive this newsletter.
1 big thing: Smaller AI
Illustration: Aïda Amer/Axios
A push to develop smaller, cheaper AI models could help put the power of machine learning in the hands of more people, products and companies.
Why it matters: Large language models get most of the AI attention, but even those that are open source aren't practical — or even necessary — for many AI researchers who want to iterate on them to create their own models for new tools and products.
- Some use more than 100 billion parameters to generate an output for a prompt and require significant and expensive computing power to train and run.
The intrigue: For the AI neural networks that have fueled the latest AI wave, bigger has generally meant better — larger models trained using more data seem to perform better.
- But, "[i]t's often the case that you can create a model that is a lot smaller that can do one thing really well," says Graham Neubig, a computer science professor at Carnegie Mellon University. "It doesn't need to do everything."
- Using a large language model (LLM) for some tasks is like "using a supercomputer to play Frogger," writes Matt Casey at Snorkel.
How it works: Researchers are trying to shrink models so they have fewer parameters but still perform well on specialized tasks.
- One approach is "knowledge distillation," which involves using a larger "teacher" model to train a smaller "student" model. Rather than learn from the dataset used to train the teacher, the student mimics the teacher.
- In one experiment, Neubig and his collaborators created a model 700 times smaller than a GPT model and found that it outperformed the larger model on three natural language processing tasks, he says. The tasks included answering questions about a given document and generating a computer program from instructions written in Japanese.
- Microsoft researchers recently reported being able to distill a GPT model down to a smaller one with just over 1 billion parameters. It can perform some tasks on par with larger models, and the researchers are continuing to hone it.
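The teacher-student setup described above can be sketched in a few lines. This is an illustrative toy only, not any lab's actual training code: the core of distillation is training the student to match the teacher's (temperature-softened) output distribution, typically by minimizing a KL-divergence loss like the one below. The logits and temperature here are made up for demonstration.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw model scores (logits) to probabilities,
    softened by a temperature so small differences are preserved."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the softened teacher and student
    distributions. The student is trained to minimize this --
    mimicking the teacher's outputs rather than re-learning
    from the teacher's original training data."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [3.0, 1.0, 0.2]
# A student that already matches the teacher incurs ~zero loss;
# one whose preferences are reversed is penalized heavily.
print(distillation_loss(teacher, [3.0, 1.0, 0.2]))
print(distillation_loss(teacher, [0.2, 1.0, 3.0]))
```

In a real system the teacher would be a large model like GPT and the student a much smaller network, with this loss averaged over many prompts.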
Yes, but: The student may not be able to perform as well as its teacher across a wider range of tasks (though that isn't always the case).
- Student models can mimic their teachers, but some research shows they don't necessarily match them. "There is a long list of tasks — the more rare tasks — where it is still not as good," says Sara Hooker, who leads Cohere for AI, Cohere's nonprofit AI research lab.
- "There's a lot we don't know — how do we make sure that the data that we get from a large model was diverse enough to cover all of these tasks?" she says.
Zoom in: Another approach exploits the fact that larger models tend to end up with a lot of sparsity — many of the billions of "weights" that represent the strength of a connection between two nodes in a network are zero or near zero. But they are still stored and processed, consuming computing power.
- The goal is to build smaller, denser models with fewer of these null weights that still perform on par with larger models.
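The sparsity idea above can be illustrated with a toy magnitude-pruning pass — a sketch, not any lab's actual method; the threshold and weight values are invented for the example:

```python
def prune_weights(weights, threshold=0.05):
    """Zero out weights whose magnitude falls below a threshold
    (magnitude pruning, one common way to exploit sparsity)."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)

# A made-up layer: a few strong connections, many near-null ones.
layer = [0.80, -0.01, 0.003, 0.42, -0.02, 0.0, 0.61, 0.004]
pruned = prune_weights(layer)
print(sparsity(layer))   # 0.125 -- one weight is already zero
print(sparsity(pruned))  # 0.625 -- near-zero weights removed
```

Once most weights are zero, they can be skipped or compressed away, which is what makes a smaller, denser model possible.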
2. Part II: AI's counter trend
Distillation is something of a legal gray area.
- For example, some terms of service forbid creating a model that competes with the foundation model. And it may be unclear how a competing model is defined.
- A lot of models that showcase distillation are built in academia but sample from proprietary models, which restricts what they can release, Hooker says.
- The White House executive order on AI issued in October contains a requirement for reporting a model if it passes a threshold for the amount of computing power it requires — a proxy for its size.
- Some AI experts predict that distillation techniques will take on a bigger role in 2024 as companies look to deploy AI for specific tasks.
The bottom line: "This new wave of research ... is rogue in the sense that it's addressing and kind of threatening a trend, which has been pretty much over the last decade that we just got bigger and bigger and bigger," Hooker says.
- It's asking, "Can we get away with something smaller? Do we need models to be big? And that's why it's so exciting."
3. The U.S. is losing the global science race: STEM worker survey
A scientist works at BioLabs Pegasus Park in Dallas, Texas. Photo: Shelby Tauber/Bloomberg via Getty Images
More than 75% of STEM-related workers say other nations have topped — or will soon surpass — the U.S. in science and technology, according to a new report.
The big picture: As the world's science and tech power centers shift, the U.S., China and other countries are racing to train — and competing to attract — top talent that can drive innovation and the economic growth and national security advantages that often stem from it.
- The State of Science in America report from the non-partisan Science & Technology Action Committee (STAC) calls for the U.S. to develop a national science and tech strategy and for policymakers to at least double federal funding for scientific research over the next five years.
By the numbers: The report included a survey of nearly 2,000 people in the U.S. working in five sectors that intersect with science and technology — K-12 education, health care, business, military and national security, and science, technology, engineering and math (STEM).
- Just 8% of respondents said the U.S. is the global leader in science and tech and is expanding its lead, according to the report.
- In addition, "60% believe that China — not the United States — will be the global leader in science and technology in five years."
- Nearly 80% of respondents working in the national security sector said China presented a national security threat to the U.S. — compared to 50% of STEM workers.
Between the lines: U.S. and Chinese scientists have historically worked together across scientific fields and have been among each other's top collaborators. But political strains between the U.S. and China and concerns about research security threaten that cooperation.
- The report calls for U.S. collaboration with China on some key issues, such as climate change, while taking steps to minimize any research security risks.
Zoom in: Across all sectors, respondents identified a "lack of adequate K-12 STEM education" as the top obstacle to fueling science and technology in the U.S. Nearly 70% of respondents said the quality of the country's STEM education system is fair or poor.
- Respondents said other hurdles included U.S. research being undermined by foreign countries, the scientific research process' red tape, the lack of a national science and technology strategy, and not enough funding for research and development.
4. Worthy of your time
5. Something wondrous
Greater Honeyguide (male) in Niassa Special Reserve, Mozambique. Photo: Claire Spottiswoode/University of Cambridge
Honeyguide birds respond differently to the distinct calls of groups of human honey hunters, according to new research.
Why it matters: Humans have a huge impact on other species and researchers want to know how other animals cope with their presence.
- Some research suggests "the growing footprint of humanity and the transformation of the landscape is selecting for certain traits in other species," says Brian Wood, an evolutionary anthropologist at the University of California, Los Angeles, and a co-author of the new study.
- "One of them is a heightened ability for learning," he adds.
- The new findings suggest this bird species, through learning, is capable of adjusting to differences in human culture.
How it works: Human honey hunters use calls to communicate with honeyguide birds (Indicator indicator) that then lead humans to bees' nests. The hunters open the nests to collect honey while the honeyguides get a meal of beeswax.
- In Tanzania, the Hadza cultural group uses a whistle to communicate with honeyguides.
- The Yao cultural group in Mozambique instead uses a loud trill followed by a grunt to find the birds.
What they found: Honeyguides respond more readily to the calls of their local human partners, researchers report today in the journal Science.
- Wood and Claire Spottiswoode, an evolutionary biologist at the University of Cambridge, followed honey hunters in both locations and played recordings of the signals of each group as well as the sound of a honey hunter just calling their name as a control.
- In Tanzania, honeyguides responded about 80% of the time to the local Hadza hunters' whistle and only 24% of the time to the call of hunters in the Yao group.
- Honeyguides in Mozambique responded to the trill call of the local Yao group about 73% of the time and to the call of the Hadza about 26% of the time.
The intrigue: That suggests honeyguides learn the sounds of local humans, the researchers write.
- Humans learn socially — knowledge is passed from person to person.
- It's unclear if honeyguides learn the same way. But they "aggregate in big groups to feed on wax and have ample opportunities for them to observe what other honeyguides are doing," Wood says.
- If they do learn socially, that would suggest there is cultural co-evolution between the two species.
The bottom line: "Interspecies relationships are very complex and it takes quite a bit of work to piece together the script of this dance between species," Wood says.
Big thanks to copy editor Carolyn DiPaolo, editor Meg Morrone and Aïda Amer on the Axios Visuals team.