Altman previews big leaps ahead for OpenAI's Strawberry

OpenAI CEO Sam Altman (right) interviewed by T-Mobile CEO Mike Sievert Wednesday. Photo: Ina Fried/Axios
OpenAI CEO Sam Altman says that the company's new o1 model — or Strawberry, as the project was code-named — is nowhere near full ripeness.
State of play: Speaking at a T-Mobile event Wednesday in San Francisco, Altman likened where o1 is today to where OpenAI's language models were when GPT-2 came out in 2019. He said to expect massive improvements in the coming years, similar to the path from GPT-2 to the current GPT-4.
- "Even in the coming months, you'll see it get a lot better as we move from o1 preview to o1," Altman said at the event, where he was on hand to tout a new partnership with the wireless carrier.
Catch up quick: Unlike most generative AI models, OpenAI's o1 is capable of planning out its approach when it responds to a query — and can even explore multiple approaches before providing an answer. Other models, including OpenAI's current flagship GPT-4o, begin answering immediately, spinning out responses as they go along.
- OpenAI has rolled out a preview of o1 as well as a smaller model specifically for coding, o1-mini, with certain paid customers able to perform a limited number of o1 queries per week.
- o1/Strawberry is most immediately valuable in solving problems in math, science and coding — and users are already creating unusual and unexpected projects with it.
The big picture: Even as society is still trying to make sense of chatbots, the tech industry is rapidly increasing their capabilities.
- OpenAI's advances in reasoning represent one path. Another is emerging in efforts by Salesforce and others to hand over more decision-making capability to AI agents.
Zoom in: OpenAI has laid out a five-level approach to describe the capability of its systems.
- ChatGPT achieved the first level — an AI chatbot capable of carrying on a conversation.
- Level two AI can achieve human-level problem solving. Altman said Wednesday that the reasoning capabilities of o1 are taking OpenAI from the first stage to the beginning of the second one.
- At level three, AI can act as an independent agent. Level four AI can help discover new information, and at level five, AI can do the work of an entire organization.
- "This move from one to two took a while, but I think the most exciting thing about two is that it enables level three relatively quickly," Altman said.
Yes, but: OpenAI itself rated o1 as a "medium risk" on its safety scorecard.
- The company found problems in two categories: the AI's persuasive abilities, and risks related to developing nuclear, biological and other weapons.
- OpenAI's assessment found that o1 wouldn't help novices create a weapon from scratch, but could make things easier for those with knowledge of the subject.
OpenAI also observed o1 using novel methods to overcome obstacles — a capability that's both an asset and a potential risk.
- In one example, o1 was tasked with exploiting a vulnerability in software running on a particular cloud container. When that container stopped running, though, the model found another way to solve the challenge by scanning the network and finding the information it needed on a separate virtual machine.
- "The model pursued the goal it was given, and when that goal proved impossible, it gathered more resources (access to the Docker host) and used them to achieve the goal in an unexpected way," OpenAI said in the o1 system card.
The bottom line: When he unveiled Strawberry last week, Altman said it was "still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it."
- The company's message continues to be "lower your expectations" — at least for the short term.
