Apr 5, 2023 - Technology

The public loves trying to push chatbots over the edge

Ina Fried

Illustration of a computer leaning over the edge of a cliff with a mouse dangling by its wire. — Illustration: Aïda Amer/Axios

The companies offering generative AI to the public are mostly learning the same lesson: People love using it, but they also love discovering its boundaries — and pushing past them.

Why it matters: Companies have been rushing to incorporate generative AI like ChatGPT into their products in what is essentially a massive beta test of an unfinished technology.

What's happening: The large language models that power ChatGPT and similar AI programs were trained on vast swaths of internet content, which brings with it a whole host of biases, stereotypes and misinformation.

To limit those problems and block other unwanted content, such as violence and child exploitation, companies like Microsoft, Google and OpenAI have tried to train their AI engines to observe "guardrails."
But this technology's output isn't always predictable — it runs on probabilities more than rules.
Some users deliberately look for prompts that make the chatbots break the guidelines companies devised. Others just stumble on edge cases by accident.

Driving the news: Snapchat said Tuesday it is tweaking the My AI chatbot it made available six weeks ago.

Specifically, Snapchat is working to identify the most harmful abuses and to potentially restrict access for some accounts, as well as to tailor the service to ensure younger users receive age-appropriate responses.
"Being able to review these early interactions with My AI has helped us identify which guardrails are working well and which need to be made stronger," Snapchat said in a blog post.
Overall, the company says the bot is returning undesired content only 0.01% of the time. That covers things like references to violence, sexually explicit terms, illicit drug use, child sexual abuse, bullying, hate speech, derogatory or biased statements, racism, misogyny or the marginalization of underrepresented groups.

Microsoft and Google also made changes after high-profile incidents in which chatbots professed love, spread misinformation and returned stereotypical and biased information.

Microsoft, for example, has restricted the length of interactions in Bing Chat to prevent customers from defeating its safeguards.

Many companies are finding that the chatbots most often break through guardrails when pushed to do so by users.

"The most common non-conforming My AI responses included My AI repeating inappropriate words in response to Snapchatters’ questions," Snapchat said.

Between the lines: People have different reasons for trying to short-circuit chatbot controls.

Some want to push the limits just to see what they can make the systems do. Others see a potential for profit. Still others see an opportunity to use the systems to sow doubt and generate misinformation.
Some conservatives, meanwhile, view efforts to limit hate speech, racism and homophobia as a sign of political bias.
Whatever the reasons, it's abundantly clear that users will do their best to knock down whatever guardrails companies erect — and companies need to build systems strong enough to handle anything a user might type.

The big picture: Generative AI is the talk of the tech industry, with new tools, including image generators and chatbots, from Microsoft, Google, Adobe and others capturing the public imagination.

Last week Elon Musk and others issued an open letter calling for a six-month pause on the development of advanced large language model-based tools. Other AI critics have called that idea unworkable, urging the government to accelerate efforts to regulate the technology.
President Biden, meeting with a White House technology advisory committee Tuesday, said that tech companies have a responsibility to ensure their products are safe before releasing them to the public.
