Untangling safety from AI security is tough, experts say

Illustration: Allie Carl/Axios
Recent moves by the U.S. and the U.K. to frame AI safety primarily as a security issue could be risky, depending on how leaders ultimately define "safety," experts tell Axios.
Why it matters: A broad definition of AI safety could encompass issues like AI models generating dangerous content, such as weapons-building instructions or inaccurate technical guidance.
- But a narrower approach might leave out ethical concerns, like bias in AI decision-making.
Driving the news: At this month's Paris AI summit, the U.S. and the U.K. declined to sign an international declaration that emphasized an "open," "inclusive" and "ethical" approach to AI development.
- Vice President JD Vance said at the summit that "pro-growth AI policies" should be prioritized over AI safety regulations.
- The U.K. recently rebranded its AI Safety Institute as the AI Security Institute.
- And the U.S. AI Safety Institute could soon face workforce cuts.
The big picture: AI safety and security often overlap, but where exactly they intersect depends on perspective.
- Experts broadly agree that AI security focuses on protecting models from external threats like hacks, data breaches and model poisoning.
- AI safety, however, is more loosely defined. Some argue it should ensure models function reliably — like a self-driving car stopping at red lights or an AI-powered medical tool correctly identifying disease symptoms.
- Others take a broader view, incorporating ethical concerns such as AI-generated deepfakes, biased decision-making, and jailbreaking attempts that bypass safeguards.
Yes, but: Overly rigid definitions could backfire, Chris Sestito, founder and CEO of AI security company HiddenLayer, tells Axios.
- "We can't be flippant and just say, 'Hey, this is just on the bias side and this is on the content side,'" Sestito says. "It can get very out of control very quickly."
Between the lines: It's unclear which AI safety initiatives may be deprioritized as the U.S. shifts its approach.
- In the U.K., some safety-related work — such as preventing AI from generating child sexual abuse materials — appears to be continuing, says Dane Sherrets, AI researcher and staff solutions architect at HackerOne.
- Sestito says he's concerned that AI safety will be seen as a censorship issue, mirroring the current debate on social platforms.
- But he says AI safety encompasses much more, including keeping nuclear secrets out of models.
Reality check: These policy rebrands may not meaningfully change AI regulation.
- "Frankly, everything that we have done up to this point has been largely ineffective anyway," Sestito says.
What we're watching: AI researchers and ethical hackers have already been integrating safety concerns into security testing — work that is unlikely to slow down, especially after a recent DEF CON paper criticized current approaches to AI red teaming.
- But the biggest signals may come from AI companies themselves, as they refine policies on whom they sell to and what security issues they prioritize in bug bounty programs.
