Models that understand the physical world are AI's next big wave

Photo: Autodesk
After years of competing to build the best model for understanding all the world's writing, AI makers are now racing to build models that better understand the physical world.
Why it matters: Robots are one main use, but giving AI an understanding of real-world physics is also key to powering everything from video games to architecture.
Driving the news: Autodesk is announcing a pair of foundation models Tuesday, both of which use understanding of designs and the physical world to generate realistic computer-aided-design (CAD) objects.
- OpenAI is said to be ramping up investment in building models with real-world understanding as it also seeks to reboot its robotics effort, per Wired.
- A host of startups, including Fei-Fei Li's World Labs, have made building real-world models their focus.
Zoom in: Autodesk is set to debut its two new models at a conference later Tuesday in Nashville, but shared details and demos first with Axios.
- One model creates editable CAD drawings for various 3D objects based on a sketch, prompt or text description.
- The other works at the architectural level, allowing the sketch of a building to be easily reshaped, with the AI model helping automatically adjust the interior with a realistic floor plan.
"What you'll notice is that the building fills in with all the rooms and windows and absolutely everything inside it," Autodesk senior VP Mike Haley told Axios — showing how the blueprints shifted as he raised and lowered one wing of a building's plan.
- Even a building's key structural elements can be automatically adjusted, he said.
Zoom out: A wave of startups is raising big bucks to chase goals similar to Autodesk's.
- Investors are betting that foundation models grounded in real-world physics could be as transformative as large language models were for text.
- Dyna Robotics announced Monday it has raised $120 million to support further development of its robotics-oriented foundation model, which is designed to learn in real-world production settings like restaurants and laundromats.
- FieldAI raised more than $300 million last month to help build models that allow robots to safely navigate unpredictable real-world settings.
- Meanwhile, Roblox and other companies are using a similar approach to allow people to use still images or text to create intricate video game virtual worlds that operate according to real physics.
Between the lines: Large language models can do a lot even with only their extensive book knowledge of the physical world's laws. But they often struggle to capture the nuances of how materials interact with one another or the impact of gravity.
- "Those models fall off a cliff very, very quickly," Haley said, noting that even many models that can produce images or video don't have a three-dimensional understanding of what they produce.
- "And they actually set an expectation with the user that they can do those things, because they're certainly not going to tell you when they can't do it, right? Then they produce garbage at the end of the day," Haley said.
What's next: In a new research demo, also being shown on Tuesday, Autodesk is demonstrating how a user can work with a physics-aware language model, speaking commands and drawing, to edit objects just as easily as creating them. That's often a challenge with today's models.
- "This intuitive process, being able to talk, to sketch, to constantly interact, is the big learning," Haley said. "We truly believe this is going to change fundamentally how people are going to design in the future."
