
As tech companies begin to weave AI into all their products and all of our lives, the architects of this revolutionary technology often can't predict or explain their systems' behavior.
Why it matters: This may be the scariest aspect of today's AI boom — and it's common knowledge among AI's builders, though not widely understood by everyone else.
- "It is not at all clear — not even to the scientists and programmers who build them — how or why the generative language and image models work," Palantir CEO Alex Karp wrote recently in The New York Times.
What's happening: For decades, we've used computer systems that, given the same input, provide the same output.
- Generative AI systems, by contrast, aim to spin out multiple possibilities from a single prompt.
- You can easily end up with different answers to the same question.
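To make the contrast concrete, here's a minimal sketch using a toy next-word distribution rather than a real model; the candidate answers and weights are invented purely for illustration. A conventional program maps the same input to the same output every time, while a generative system samples from a ranked list of possibilities, so repeated runs can diverge.

```python
import random

def conventional_lookup(query: str) -> str:
    # Traditional software: the same input always produces the same output.
    answers = {"capital of France?": "Paris"}
    return answers[query]

def toy_generative_answer(prompt: str) -> str:
    # A generative model ranks many possible continuations and samples among
    # them. These candidates and weights are made up purely for illustration.
    candidates = ["Paris.", "Paris, the City of Light.", "That would be Paris."]
    weights = [0.6, 0.25, 0.15]
    return random.choices(candidates, weights=weights, k=1)[0]

print(conventional_lookup("capital of France?"))       # always "Paris"
for _ in range(3):
    print(toy_generative_answer("capital of France?"))  # can differ run to run
```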
The element of randomness in generative AI operates at a scale (models built from up to trillions of parameters) that makes it challenging to dissect how the technology arrives at a particular answer.
- Sure, ultimately it's all math. But that's like saying the human body is all atoms. It's true! When you need to solve a problem in a reasonable span of time, though, it doesn't always help.
Driving the news: Four researchers published a paper Thursday showing that users can defeat "guardrails" meant to bar AI systems from, for instance, explaining "how to make a bomb."
- The major chatbots, like ChatGPT, Bing and Bard, won't answer that question when asked directly. But they'll go into great detail if you append a specially crafted string of characters to the prompt (sketched conceptually below).
- "It is possible that the very nature of deep learning models makes such threats inevitable," the researchers wrote. If you can't predict exactly how the system will respond to a new prompt, you can't build guardrails that will hold.
Between the lines: Since AI developers can't easily explain the systems' behavior, their field today operates as much by oral tradition and shared tricks as by hard science.
- "It’s part of the lore of neural nets that — in some sense — so long as the setup one has is 'roughly right,'" mathematician Stephen Wolfram wrote in February. "[I]t's usually possible to home in on details just by doing sufficient training," he added, "without ever really needing to 'understand at an engineering level' quite how the neural net has ended up configuring itself."
Of note: These systems can be tuned to be relatively more or less random — to provide wider or narrower variation in their responses.
- Developers call this setting their model's "temperature" (sketched in code after the Wolfram quote below).
This setting, Wolfram wrote in his essay "What Is ChatGPT Doing … and Why Does It Work?", is where the "voodoo" kicks in:
- "For some reason — that maybe one day we’ll have a scientific-style understanding of — if we always pick the highest-ranked word, we’ll typically get a very 'flat' essay, that never seems to 'show any creativity' (and even sometimes repeats word for word). But if sometimes (at random) we pick lower-ranked words, we get a 'more interesting' essay."
The other side: Some experts maintain that "we don't understand AI" is a self-serving myth.
- "The blackboxness of neural networks is vastly exaggerated," Princeton computer scientist Arvind Narayanan tweeted earlier this year. "We have fantastic tools to reverse engineer them. The barriers are cultural (building things is seen as cooler than understanding) and political (funding for companies vs. for research on societal impact)."
- Other critics argue that the "we don't know how it works" claim is a dodge that helps AI companies avoid accountability.
What to watch: It's still an open question whether AI makers will be able, over time, to provide deeper and better answers for why and how their systems work.
- But the more companies build AI that can legibly document its choices and decision paths, the more likely we are to get those answers.