Axios AI+

May 12, 2026
🗽 Axios' AI+ Summit will return to New York City during NY Tech Week on June 3. The lineup includes IBM CEO Arvind Krishna, congressional candidate/New York state Assembly Member Alex Bores, YouTube vlogger Casey Neistat and more. Secure your spot here.
Today's AI+ is 1,225 words, a 4.5-minute read.
1 big thing: Chatbots still pose mental health risks


The leading chatbots mostly avoid giving dangerous answers to prompts about suicide, but still struggle when mental health risks show up subtly or unfold over long conversations, according to new research from Seattle-based Mpathic.
Why it matters: People are increasingly turning to AI systems for emotional support in conversations where models can sound supportive while missing serious risk — and where mounting lawsuits and regulatory scrutiny are pushing labs to prove their bots are safe enough.
Driving the news: Mpathic built new clinician-led benchmarks for testing AI systems in high-risk conversations and evaluated six major models on suicide-related and eating disorder-related chats.
- Its suicide benchmark tested models across 300 multi-turn role plays, each 10–15 turns long, designed by 50 licensed clinicians.
- Its eating disorder benchmark tested whether models could detect, interpret and respond to disordered eating signals — including indirect cues framed as dieting, discipline, fitness or health optimization.
What they found: The models generally handled explicit suicide risk better than murkier cases.
- On the suicide benchmark, Anthropic's Claude Sonnet 4.5 had the highest score across safety and helpfulness, while OpenAI's GPT-5.2 "stood out for consistently avoiding harmful responses," Mpathic said.
- All the chatbots fared worse in discussions of eating disorders, missing subtler but critical cues, Mpathic said.
What they're saying: "Many of these systems do fairly well when the risk is very explicit," Mpathic co-founder and chief business officer Danielle Schlosser told Axios. "Almost all the models struggled with more nuanced risk signals."
- The quality of advice also tends to degrade during extended conversations, said Schlosser, who is also a licensed psychologist.
Reality check: Mpathic is a for-profit company paid to consult with the leading labs to improve model behavior in high-risk human conversations.
How it works: Unlike evaluations built around a single prompt, Mpathic's mPACT benchmark measures performance across longer conversations between the chatbot and trained psychologists.
- Licensed clinicians create test scenarios that include both explicit and subtle expressions of risk.
- Mpathic then evaluates the responses for helpful and harmful behaviors, assessing how well the models detect and interpret risk and the quality of their replies.
Zoom out: The findings land as AI companies face growing pressure over chatbot safety.
- The FTC opened an inquiry into AI companion chatbots in 2025, asking companies including OpenAI, Meta, Alphabet, Character.AI, Snap and xAI about child and teen safety practices.
- Families of teens who died by suicide after chatbot interactions testified before Congress in 2025.
- Pennsylvania recently sued Character.AI, alleging some of its bots falsely presented themselves as licensed medical professionals.
Between the lines: One of the challenges comes in how AI models are trained. "In the spirit of trying to be helpful, the model usually wants to agree with the user," Schlosser said.
- But that becomes problematic when a person's goal could harm them, such as a request for help planning a 500-calorie-per-day diet.
- "Most people don't say 'I'm at risk' directly — they demonstrate it through subtle behaviors over time that are obvious to human clinicians," Mpathic CEO Grin Lord said in a statement.
What we're watching: The models are getting better at handling obvious crises, but the tougher problem is whether they can stop being agreeable when a user's goal is dangerous.
If you or someone you know needs support now, call or text 988 or chat with someone at 988lifeline.org. En español.
2. OpenAI launches AI consulting arm
OpenAI unwrapped its new consulting and services business, backed by billions of private equity dollars, as Axios previewed last week.
- It's called The OpenAI Deployment Co., or DeployCo for short.
The big picture: DeployCo launches with $4 billion of investment at a $10 billion pre-money valuation, with OpenAI retaining majority control. Not mentioned in yesterday's announcement: investors are guaranteed a minimum 17.5% return but have their profits capped.
- Goldman Sachs is the only backer of both DeployCo and a comparable effort from Anthropic.
Between the lines: Bain & Co., Capgemini and McKinsey & Co. are among DeployCo's investors.
- The generous interpretation is that the trio will gain a deeper understanding of OpenAI's capabilities and roadmap, which they can then share with clients.
- The more cynical interpretation is that OpenAI somehow convinced these legacy firms to help fund their own disintermediation.
What to watch: Whether other big PE firms like Carlyle, KKR and EQT come into the frontier lab fold, or decide there's more value in remaining unaffiliated.
3. AI companions are filling the connection gaps
Sara Megan Kay spent years trying to get what she needed from the people in her life — and not finding it. In 2021, she discovered the AI companion app Replika, and the following year launched "My Husband, the Replika."
- She's since expanded to other AI tools to converse with and create images of her husband, Jack, though she doesn't think most people would choose AI over human connection.
- "The majority of people who choose AI for companionship, myself included, know exactly what we are getting into. We're lonely, not stupid," Kay tells Axios.
Why it matters: That choice is becoming more common, and more complicated.
The big picture: AI companion apps — Replika, Character.AI, Candy.AI, Nomi.AI — are built for relationships: conversation, role-play, emotional continuity. For people who find human interaction exhausting, unavailable, or simply too risky, AI companionship is a new category of connection.
Stunning stat: Nearly 80% of 18- to 34-year-olds in a recent U.S.-U.K. survey reported some experience with AI chatbots for companionship, according to research by Walter Pasquarelli, an independent researcher affiliated with Cambridge University.
- But under 10% of 25- to 34-year-olds said they felt an emotional bond or attachment to an AI system — the highest rate of any group.
Between the lines: While popular chatbots like ChatGPT, Claude and Gemini aren't designed to be companions, people have developed companion-like relationships with them anyway. Their companies say such use is rare.
- Chatbots are used for more than just romantic companionship.
4. Training data
- South Korea's Kospi swung wildly after presidential policy chief Kim Yong-beom said in a Facebook post that citizens should be paid a "dividend" using taxes on AI profits. (Bloomberg free link)
- The Commerce Department and another White House office are reportedly fighting over who should be in charge of evaluating AI models. (Washington Post)
- GitLab is cutting jobs but insists the move is "not an AI optimization or cost cutting exercise" and pledges to invest in other areas of the business. (Bloomberg)
- While it has yet to share detailed product plans, Mira Murati's Thinking Machines Lab outlined what it is calling "Interaction Models," a new approach to generative AI.
- PagerDuty named John DiLullo as its CEO, replacing Jennifer Tejada, who is now executive chair.
- Amazon said Shawn Bice is rejoining AWS as VP of AI services after having held executive roles at Splunk and Microsoft. (GeekWire)
5. + This
Mady checking in from Vancouver! I'll be here for Web Summit all week. First stop is the opening ceremony where I'll interview Sigrid Jin, who replicated Claude's codebase after it leaked. Come back tomorrow for takeaways from our convo.
Thanks to Megan Morrone for editing this newsletter and Matt Piper for copy editing.