Axios AI+

April 23, 2025
Hi from Washington, D.C., where I'm in town to accept an award from the Human Rights Campaign. Today's AI+ is 1,122 words, a 4-minute read.
Mark your calendars: Axios returns to NYC during #NYTechWeek for our AI+ NY summit on Wednesday, June 4, featuring actor/filmmaker/entrepreneur Joseph Gordon-Levitt, Runway CEO Cristóbal Valenzuela, Lumen Technologies CEO Kate Johnson, UN Tech Envoy Amandeep Singh Gill and more. Interested in joining? Let us know here.
Situational awareness: Overnight the EU levied fines of $570 million on Apple and $228 million on Meta for violations of the Digital Markets Act — the first penalties applied under the EU's broad anti-monopoly law.
1 big thing: Advanced AI gets more unpredictable
The rave reviews OpenAI's latest models have been winning come with an asterisk: Experts are also finding that they're erratic — they break previous records for some tasks but backslide in other ways.
Why it matters: "Frontier AI models" keep pushing into new territory, but their progress hasn't become any more scientific or predictable in the two-and-a-half years since ChatGPT took tech by storm.
Catch up quick: OpenAI released the o3 and smaller o4-mini models a week ago and called them "the smartest models we've released to date."
- The company and early testers lauded o3 for its overall reasoning prowess — its ability to respond to a user prompt by planning, executing and explaining a series of steps.
- They also highlighted o3's reliability in conducting web searches and using other digital tools without constant user supervision or intervention.
o3 won praise from reviewers not only for bread-and-butter AI work like writing, drawing, calculating and coding but also for advances in vision capabilities.
- One popular — and, for privacy experts, potentially alarming — trick that went viral: using o3 to look at virtually any digital photo and identify where it was taken.
What they're saying: "These models can run searches as part of the chain-of-thought reasoning process they use before producing their final answer. This turns out to be a huge deal," the developer Simon Willison wrote.
- "This is the biggest 'wow' moment I've had with a new OpenAI model since GPT-4," Every's Dan Shipper reported.
- Economist-blogger Tyler Cowen declared that o3 heralded the advent of AGI: "I think it is AGI, seriously. ... Benchmarks, benchmarks, blah blah blah. Maybe AGI is like porn — I know it when I see it. And I've seen it."
Yes, but: Plenty of reviewers found reasons to criticize o3, including math errors and deceptions.
- A study of models' performance in financial analysis placed o3 at the top of the heap — but it provided accurate results only 48.3% of the time, and its cost per query was the highest by far, at $3.69. (The Washington Post has more on the study.)
Between the lines: Intriguingly, OpenAI notes that despite o3's impressive capabilities, it's actually regressing in some areas — like its tendency to "hallucinate," or make up incorrect answers.
- In one accuracy benchmark, OpenAI's own PersonQA test, the company found that o3 hallucinates at more than twice the rate of its predecessor, o1 (33% vs. 16%, per the system card).
- o3 also makes more claims overall than o1, so it answers more questions correctly but produces more false answers, too. "More research is needed" to understand why o3's error rate jumped, OpenAI says.
Zoom out: AI analyst Ethan Mollick describes o3's impressive but scattershot performance as an example of "the jagged frontier": "In some tasks, AI is unreliable. In others, it is superhuman."
- Mollick argues that "the latest models represent something qualitatively different from what came before, whether or not we call it AGI. Their agentic properties, combined with their jagged capabilities, create a genuinely novel situation with few clear analogues."
Our thought bubble: Software makers and programmers have spent decades trying to make their work more reliable, scalable and flexible, and they've made plenty of progress.
- Making AI is newer, stranger and so far not well enough understood to be turned into a predictable discipline.
The bottom line: Designing, building and training AI models remains stubbornly resistant to developers' efforts to impose scientific rigor on their field or duplicate their results.
- Apparently, this process is still more like raising a kid than building a bridge.
- That adds to the sense of mystery and possibility surrounding AI development — but also frustrates efforts to domesticate it or harness it for economic advantage.
2. IMF: Don't panic over AI's climate toll yet
A new International Monetary Fund study of the AI-climate-energy nexus finds reason for worry, but hardly panic.
Why it matters: Gaming out artificial intelligence's energy needs, and the emissions that come with them, is a big challenge for policymakers, tech companies and power providers.
What they found: Under current energy policies, the IMF projects a cumulative 1.7 gigatons of additional CO2 emissions linked to AI's needs from 2025 to 2030.
- That's "similar to Italy's energy-related greenhouse gas emissions over a five-year period." A more renewables-heavy scenario the authors modeled shows a smaller bump of 1.3 gigatons.
- For perspective, the International Energy Agency estimates total global energy-related emissions at 37.8 gigatons last year.
Catch up quick: A recent IEA report found that fears about AI speeding up climate change "appear overstated."
Reality check: That said, global emissions are still rising and climate harms are worsening.
- Steep cuts are needed to avoid blowing way past the Paris Agreement goal of limiting the rise in global temperatures to well below 2°C.
- So major new emissions sources are cause for concern.
The intrigue: The IMF tries to weigh the cost of those emissions against the global GDP gains it expects from AI.
- It puts the "social cost" of 1.3 to 1.7 gigatons of extra emissions at $50.7 billion to $66.3 billion.
- "The social cost of these extra emissions is minor compared with the expected economic gains from AI, yet it still adds to the worrying buildup of worldwide emissions," the study states.
- But that figure relies on a social cost of carbon of $39 per ton, the median in a literature review the study cites. That's far lower than what many economists and scientists say reflects real-world climate damages, and the headline dollar range scales directly with it (see the quick math sketch below).
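For the spreadsheet-inclined, here's a minimal Python sketch of that arithmetic. It's our back-of-the-envelope check, not the IMF's model: the 1.3-1.7 gigaton range and the $39-per-ton value come from the study, while the $190-per-ton alternative is our own illustrative assumption, included only to show how much the headline number swings with that one input.

```python
# Back-of-the-envelope check of the IMF's social-cost figures
# (our math, not the IMF's model).
GT_TO_TONS = 1e9  # 1 gigaton = 1 billion metric tons

def social_cost_billions(emissions_gt: float, dollars_per_ton: float) -> float:
    """Social cost in billions of dollars for emissions given in gigatons."""
    return emissions_gt * GT_TO_TONS * dollars_per_ton / 1e9

# $39/ton is the study's figure; $190/ton is a purely illustrative alternative.
for scc in (39, 190):
    low = social_cost_billions(1.3, scc)
    high = social_cost_billions(1.7, scc)
    print(f"At ${scc}/ton: ${low:.1f}B to ${high:.1f}B")

# Output:
# At $39/ton: $50.7B to $66.3B    <- matches the study's range
# At $190/ton: $247.0B to $323.0B
```

The point of the second line of output: the conclusion that the social cost is "minor" depends heavily on which per-ton value you plug in.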
The big picture: "AI-driven global electricity consumption" could hit 1,500 terawatt-hours by 2030, the IMF authors find, citing OPEC and IEA data and their own calculations.
- That's comparable to India's total electricity demand, notes the analysis, which appears as a special section in the IMF's wider new economic outlook.
What we're watching: Whether AI's emissions-cutting applications ultimately outweigh CO2 from data centers' energy needs.
3. Training data
- Exclusive: a16z pushed back on lawmakers' proposals that would force AI firms to share training data sets, put warning labels on AI products and take other steps to encourage more industry transparency. (Axios Pro)
- Google acknowledged as part of its antitrust defense that it pays "enormous amounts" to have its Gemini AI app preinstalled on Samsung devices. (Bloomberg)
- Also in the Google trial, OpenAI said it was denied access to Google's search index, which the government wants to require Google to make available to competitors, and that it would be interested in acquiring Chrome if the court forced Google to sell the browser. (Bloomberg)
- Apple has made changes to its Apple Intelligence page amid inquiries from the Better Business Bureau's National Advertising Division. (The Verge)
- More than 600 health systems are using Microsoft's AI health scribe tool, Microsoft general manager Kenn Harper says. (Axios Pro)
4. + This
It doesn't get better than a photography book on "Little kids and their big dogs." (Fun aside: In my child acting days, my first commercial was for Hero dog food — a short-lived Purina brand for larger dogs. Five-year-old me stood next to a St. Bernard and said "My dog is so big ... we see eye to eye on everything." Oh yes, there's video.)
Thanks to Scott Rosenberg and Megan Morrone for editing this newsletter and Matt Piper for copy editing.