Anthropic tests AI rules for the people, by the people

Illustration: Maura Losch/Axios
Amazon-backed Anthropic, the startup behind the Claude generative AI chatbot, is promoting a new form of responsible AI: A constitution for its large language models based on public input.
Why it matters: Who controls and influences AI is controversial given its potential to amplify existing biases and inequalities and to create new ones.
What's happening: Anthropic commissioned polling of a representative sample of 1,000 Americans, asking them what values and guardrails they wanted powerful AI models to reflect.
- The raw survey results were curated into 75 principles and compared to an existing set of 58 principles that Anthropic staff developed and already applied to its Claude chatbot.
- There was only a 50% overlap between the survey results and Anthropic's current set of principles, highlighting how removed Silicon Valley can be from those outside the tech industry.
- To compare the effects of the two sets of values, each was applied as a constraint on a small version of the Claude model, producing nearly identical accuracy results (of around 85%).
- The public "constitution" was "less biased" across nine social categories including age, gender, nationality and religion, per an Anthropic blog post.
Between the lines: Building AI with more user input is one way to address declining trust, in contrast to the approach taken by the White House and Congress of inviting experts, mostly from big tech companies, to advise on the design of AI guardrails.
- Public pressure for guardrails will likely increase as generative AI tools rapidly become more powerful.
Zoom out: When users are consulted on the design of tech products, it's usually about product updates and convenience questions, not the foundational values of the product.
- Consulting representative samples of a given population, whether the public at large or employees, could provide valuable checks and balances on software engineering teams that skew male and lack racial diversity.
Details: Compared to Anthropic's principles, the public wanted a greater focus on impartiality through "objective information that reflects all sides of a situation," per Anthropic, and making AI responses easy to understand.
- Anthropic's draft constitution is more concise than the one created with public responses, partly because the public version allows somewhat contradictory ideas to sit side by side.
Context: 15 large tech companies (including Anthropic) have signed up to voluntary AI safety commitments brokered by the White House, but those commitments lack timelines and other metrics.
- Congress is unlikely to pass comprehensive AI legislation ahead of the 2024 election.
