Oct 23, 2023 - Technology

Anthropic tests AI rules for the people, by the people

Ryan Heath

Illustration of old parchment with the words, "We the AI." — Illustration: Maura Losch/Axios

Amazon-backed Anthropic, the startup behind the Claude generative AI chatbot, is promoting a new form of responsible AI: A constitution for its large language models based on public input.

Why it matters: Who controls and influences AI is controversial given its potential to amplify existing biases and inequalities and to create new ones.

What's happening: Anthropic commissioned polling of a representative sample of 1,000 Americans, asking them what values and guardrails they wanted powerful AI models to reflect.

The raw survey results were curated into 75 principles and compared to an existing set of 58 principles that Anthropic staff developed and already applied to its Claude chatbot.
There was only a 50% overlap between the survey results and Anthropic's current set of principles, highlighting how removed Silicon Valley can be from those outside the tech industry.
To compare the effects of the two sets of values, each was applied as a constraint on a small version of the Claude model, producing nearly identical accuracy results (of around 85%).
The public "constitution" was "less biased" across nine social categories including age, gender, nationality and religions, per an Anthropic blog post.

Between the lines: Building AI with more user input is one way to address declining trust — contrasting the White House and Congress' approach of inviting experts, mostly from big tech companies, to advise on the design of AI guardrails.

Public pressure for guardrails will likely increase as generative AI tools rapidly become more powerful.

Zoom out: If users are consulted on the design of tech products it's usually around product updates and convenience questions — not in the foundational values of the product.

Consulting representative samples of a given population — whether the public at-large or employees — could provide valuable checks and balances around software engineering teams that skew male and lack racial diversity.

Details: Compared to Anthropic's principles, the public wanted a greater focus on impartiality through "objective information that reflects all sides of a situation," per Anthropic, and making AI responses easy to understand.

Anthropic's draft constitution is more concise than the one created with public responses — partly because the public version allows for somewhat contradictory ideas to sit side-by-side.

Context: 15 large tech companies (including Anthropic) have signed up to voluntary AI safety commitments brokered by the White House, but those commitments lack timelines and other metrics.

Congress is unlikely to pass comprehensive AI legislation ahead of the 2024 election.

Add Axios on Google

Anthropic tests AI rules for the people, by the people

What to read next