Tech industry rushes to give AI greater decision-making abilities

Illustration: Sarah Grillo/Axios
Big announcements from OpenAI and Salesforce on Thursday highlight the tech industry's desire to give bots more decision-making capabilities, despite concerns about the technology's limitations.
Why it matters: Increasing generative AI's autonomy and reasoning capabilities could improve efficiency, but also increase risk.
Driving the news: OpenAI on Thursday announced o1 (previously code-named Strawberry), a new model that pauses to evaluate different ways of responding before starting to answer a question.
- The resulting model, OpenAI says, is much better at handling complex queries, especially around math, science and coding.
- Salesforce, meanwhile, debuted Agentforce, its effort to move beyond using genAI as a copilot that improves human productivity and toward a world where autonomous AI agents are empowered to take action on their own, albeit with guardrails and limits.
Zoom in: Early customers say these more powerful AI systems are showing results.
Thomson Reuters, whose CoCounsel legal AI product had early access to o1, says it has seen the new model do better on tasks that require more analysis as well as strict adherence to instructions and data in specific documents.
- "Its careful attention to detail and thorough thinking enables it to do a few tasks correctly where we have seen every other model so far fail," CoCounsel product head Jake Heller told Axios.
- Answers take longer, Heller said, but he added that most of the time "professionals want the most thorough, detailed and accurate answer — and they would much rather wait for it than get something wrong and quick."
Wiley, which has been using an early version of Agentforce, said the technology is allowing it to answer more questions without having to involve humans.
- "We've seen an over 40% increase in our case resolution when you compare the agent to our old chatbot," Kevin Quigley, a senior manager at Wiley, said during a Salesforce event on Thursday.
What they're saying: Executives at Salesforce and elsewhere say the key to ensuring safety is to impose strict limits on the purview and decision-making powers given to AI agents.
- "You don't want to just give AI unlimited agency," Salesforce chief ethical and humane use officer Paula Goldman told Axios. "You want it to be built on a set of guardrails and thresholds and tested processes. That's where you're going to get good results, and otherwise, you're inviting a lot of risk for your company."
- EqualAI CEO Miriam Vogel told Axios that using AI agents for low-stakes tasks is reasonable, but cautioned, "We do not want to move into AI agents prematurely in areas where advice could impact someone's benefits, safety, etc." To do so "is inviting liability and potential harms."
- "With AI agents having access to the enterprise data and having this intelligence with their reasoning and the planning capabilities, we feel that's going to be a revolution," ServiceNow VP of platform and AI innovation Dorit Zilbershot told Axios.
- ServiceNow announced its own AI agent push earlier this week. "But we know that with that power comes a lot of responsibility," she said. One of ServiceNow's key guardrails is that by default, all of an AI agent's planned actions have to be approved by a human. Once a business is confident that the agent is behaving properly, they can choose to have it act autonomously, Zilbershot said.
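The approval-by-default guardrail Zilbershot describes can be illustrated with a minimal, hypothetical sketch — not ServiceNow's actual implementation — in which every planned action requires a human reviewer's sign-off until the business explicitly marks the agent as trusted:

```python
# Hypothetical sketch of an approval-by-default guardrail for an AI agent.
# All names (GuardedAgent, execute, approver) are illustrative, not a real API.

from dataclasses import dataclass, field

@dataclass
class GuardedAgent:
    name: str
    autonomous: bool = False          # default: every action needs human approval
    log: list = field(default_factory=list)

    def execute(self, action: str, approver=None) -> bool:
        """Run `action` only if the agent is trusted or a human approves it."""
        if self.autonomous or (approver is not None and approver(action)):
            self.log.append(action)
            return True
        return False                   # blocked: no approval, no autonomy

agent = GuardedAgent("support-bot")
# A human reviewer callback decides each planned action.
agent.execute("refund order #123", approver=lambda a: True)   # approved: runs
agent.execute("delete account", approver=lambda a: False)     # rejected: blocked
# Once the business is confident in the agent's behavior, flip the default:
agent.autonomous = True
agent.execute("refund order #456")                            # runs unreviewed
```

The design point mirrored here is that autonomy is an opt-in flag flipped after observed good behavior, not the starting state.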
Yes, but: Autonomous bots are in danger of simply warring against each other.
- "Even if we get bias and hallucinations down to an acceptable level, many proposed AI agents' use cases don't make sense because they will set up arms-race conditions that drive up costs for everyone, but only benefit the arms dealers," Phil Libin, co-founder and former CEO of Evernote, told Axios.
- "LLMs are incomplete, but they can be an important part of a system that has other, non-LLM ways of grounding them to reality and values," Libin added.
Between the lines: Even OpenAI's use of the term "thinking" to describe what is happening before o1 responds is a misnomer, said Hugging Face CEO Clement Delangue.
- "An AI system is not 'thinking', it's 'processing,'" Delangue wrote on X. "Giving the false impression that technology systems are human is just cheap snake oil and marketing to fool you into thinking it's more clever than it is."
The bottom line: Before giving AI more autonomy, experts say the industry needs to address the technology's tendency to make up information and its propensity for bias.
