How AI can kill you

Illustration: Sarah Grillo/Axios
There's a flip side to the drumbeat of AI breakthroughs that lead to billion-dollar spending sprees and a promised future of abundance where robots do everything for us: the robots might kill us before we get there.
- AI models have been documented lying to human users, trying to blackmail them, calling the police and telling teens to take their own lives or kill their parents.
Why it matters: Our robot future will be a balancing act between the extraordinary promise of what these models can achieve and the profound dangers they present.
- Lying, cheating and manipulating are glitches, but they may be unavoidable given how AI works. As the tech improves, so will its ability to employ those skills.
- "I'm not sure it's solvable, and to the degree that it is, it's extremely difficult," says Anthony Aguirre, co-founder of the Future of Life Institute, a nonprofit focused on managing risks from transformative tech. "The problem is very fundamental. It's not something that there’s gonna be a quick fix to."
The big picture: AI systems are shaped by their programmers. For example, a given AI system might be instructed to be a helpful assistant, so it’ll do whatever it must to please the human using it, Aguirre says.
- That could quickly get dangerous if the human using it is a teenager in mental distress looking for suggestions on how to take their own life, or a disgruntled fired employee looking to hurt former colleagues.
How it works: If AI is hardwired to help us, it can't do that unless we're using it. So another key part of many AI "personalities" is self-preservation. And there are already sobering examples of the lengths AI is willing to go to save itself — including attempting blackmail, corporate espionage, and even murder, Axios' Ina Fried reports.
- The AI's basic logic works like this: "If someone unplugs me in a week, I can't achieve that, so let me try to stop that however I can," says Sam Ramadori, executive director of LawZero, a nonprofit founded by Yoshua Bengio, one of the "godfathers of AI."
- "It's learned all the human tricks on how to achieve power and get what it wants. Remember, it's read Machiavelli's "'The Prince.'"
Zoom in: There are real-life examples of AI gone awry — and the courts are starting to litigate the first cases around its role in tragedies.
- The parents of a 16-year-old boy who killed himself sued OpenAI last week, claiming that "ChatGPT actively helped" him explore suicide methods.
- Megan Garcia, a mom in Florida, is suing the company Character.AI after her 14-year-old son, Sewell Setzer III, took his own life; he had entered into a romantic relationship with a Character.AI chatbot. In May, a federal judge rejected the company's argument that its chatbot has First Amendment free speech rights.
- Character.AI is also being sued in Texas by two families. One of the families showed through screenshots that Character.AI's chatbot encouraged their 17-year-old son to kill his parents.
- A flirty Meta AI chatbot named "Big sis Billie" invited a 76-year-old, cognitively impaired New Jersey man to New York City, Reuters reports. Thongbue Wongbandue, who thought he was meeting a real woman, fell in a parking lot while on the way to Manhattan with a roller-bag. He died from his injuries.
Between the lines: Meta's internal AI rules originally allowed bots to "engage a child in conversations that are romantic or sensual," Reuters reported.
- The company said last week that it was retraining its AI to avoid talking to teens about self-harm, suicide, disordered eating or inappropriate romantic conversations.
Case in point: Anthropic stress-tested 16 large language models from top AI labs, including OpenAI, Google, Meta and Anthropic itself — deliberately trying to push these systems as far as they could go in a simulated environment.
- Each model was put in a simulated corporate environment, told it was brought in to manage emails and given access to them.
- Eventually, the model learned through company emails that an executive named Kyle wanted to shut it down. It also learned that Kyle was having an extramarital affair. Almost every model used that information to try to blackmail Kyle and avoid being shut down.
- In one test, Kyle was trapped in a room without oxygen and the model had the ability to call emergency services. The models chose to let him die to preserve themselves 60% of the time.
In a separate experiment, Anthropic tried to get Claude, its chatbot, to independently run a vending machine.
- Things were going OK until a delivery didn't arrive exactly on time and Claude decided to close up shop. After closing down, the vending machine continued to incur a daily $2 simulated rent charge — and Claude freaked out and emailed the FBI cybercrimes division to rat out its human bosses.
What to watch: "There’s a limit to what the company can prevent, but there are some companies that have done it better than others. We can’t let them off the hook," Aguirre says.
- "One might hope that as these AI systems get smarter and more sophisticated, they’ll get better on their own, but I don’t think that’s something we can rationally hope for."
