Apr 23, 2024 - Technology

Microsoft releases open-source small language model


Illustration: Annelise Capossela/Axios

Microsoft on Tuesday began publicly sharing Phi-3, an update to its small language model that it says is capable of handling many tasks that had been thought to require far larger models.

Why it matters: Smaller models use less computing power than larger ones and, in many cases, can run directly on smartphones or laptops — offering speed and privacy benefits, since data doesn't have to leave the device.

Driving the news: Microsoft is adding Phi-3 to its own Azure model gallery, as well as releasing it on the open-source model hub Hugging Face and on Ollama, a platform designed to help people run models on their own machines.

  • Microsoft is starting with Phi-3 mini, the smallest version of the model, with 3.8 billion parameters.
  • Two other models — still considered lightweight by today's standards — are coming shortly: Phi-3 small, with 7 billion parameters, and the largest, Phi-3 medium, with 14 billion parameters.
  • The company says Phi-3 can outperform similar-size models and even slightly larger ones, thanks in part to the high-quality data on which it is trained.

What they're saying: Microsoft says Phi-3 isn't designed to replace large language models, but can work in places large models don't, including running on devices.

  • "If you have a very, very high-stakes application, let's say in a health care scenario, then I definitely think that you should go with the frontier model — the best, most capable, most reliable," Microsoft VP Sébastien Bubeck told Axios.
  • For other uses, other factors matter more, including speed and cost. "That's where you want to go with Phi-3," Bubeck said.