May 10, 2023 - Technology

OpenAI, Anthropic aim to quell concerns over how AI works

Animated illustration of a robot thinking, with ellipses moving inside its thought bubble.

Illustration: Shoshana Gordon/Axios

Two leading AI startups are offering novel approaches to making the inner workings of the latest generative technologies more visible and readily governed.

Why it matters: Generative AI can craft impressive combinations of words and images, but the opaque nature of the technology makes it difficult to understand, let alone evaluate, its choices.

Driving the news: OpenAI released a paper and blog post discussing how one AI system can be used to explain how individual "neurons" in another AI system appear to work.

  • The company used the state-of-the-art GPT-4 to analyze the work of GPT-2, a far older system — an approach that may not be able to help us understand the most advanced AI models.

Meanwhile, Anthropic on Tuesday proposed the idea of a "constitution" to govern the behavior of Claude, its chatbot.

  • The idea is that rather than using bits of human feedback to evaluate output, the engine would use a series of documented principles.

The big picture: The new announcements are just some of the ways that companies creating generative AI systems are responding to criticism and trying to both understand and constrain powerful systems whose inner decision process remains mysterious even to those who create them.

What they're saying: OpenAI researcher Jeff Wu said in an interview that even though the company was using a newer technology to analyze an older one, the early results suggest the approach itself has promise.

  • "There’s at least reason for hope," he said. "Ideally we would be looking forward, not backward."