Dec 23, 2020 - Technology

Introducing an AI that can plan

Bryan Walsh

DeepMind's MuZero can master games without being told the rules. Credit: DeepMind

An AI company published research Wednesday that details a machine-learning agent that can figure out how to play and win multiple games with no prior instruction.

Why it matters: The new research shows that an AI can learn by observation, much as humans do, which will have real-world ramifications that go well beyond the chessboard.

Driving the news: In a paper published in the journal Nature, researchers at Google-owned DeepMind described the science behind MuZero, an AI agent that has shown the ability to "plan winning strategies in unknown environments," as the company described in a post.

MuZero mastered Go, chess, shogi and Atari games, performing as well as the company's earlier agent AlphaZero on the board games and beating all previous algorithms on video games like Pac-Man.

Background: DeepMind's earlier AlphaZero system relied on look-ahead search to rapidly select the best possible move at a given point in a game. But that approach works only when the system has been pre-programmed with the rules of the game.

Other algorithms try to learn an accurate model of a game's environment, which they can then use to plan strategy. But visually rich games like Pac-Man — let alone anything an AI might encounter beyond the console — are difficult for even highly capable algorithms to model.

How it works: MuZero combines these two approaches by learning a model that relies only on the most important aspects of an environment, which it can then exploit with look-ahead search.

As MuZero tries one action after another, it learns the rules of the game while noticing the rewards and penalties, refining its methods until it hits on a winning strategy.
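To make the idea concrete, here is a toy sketch of "learn a model by trying actions, then plan with look-ahead search." It is not DeepMind's code or algorithm: MuZero learns its model with neural networks and searches with a tree-search method, while this example assumes a tiny five-cell corridor, a lookup table for the model and a brute-force search. Every name in it (`hidden_step`, `learn_model`, `plan`, the corridor itself) is hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy game: a corridor of cells 0..4 with a reward at cell 4.
# The planner is never shown these rules; it only sees what its actions do.
GOAL = 4
ACTIONS = (-1, +1)  # step left, step right


def hidden_step(state, action):
    """The true rules of the game, hidden from the planner."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def learn_model(episodes=200, steps=10, seed=0):
    """Try random actions and record the outcome of each one.

    Returns a table mapping (state, action) -> (next_state, reward).
    """
    rng = random.Random(seed)
    model = {}
    for _ in range(episodes):
        state = rng.randint(0, GOAL)
        for _ in range(steps):
            action = rng.choice(ACTIONS)
            next_state, reward = hidden_step(state, action)
            model[(state, action)] = (next_state, reward)
            state = next_state
    return model


def lookahead_value(model, state, depth, discount=0.9):
    """Best discounted reward reachable within `depth` moves, per the learned model."""
    if depth == 0:
        return 0.0
    best = 0.0
    for action in ACTIONS:
        if (state, action) not in model:
            continue  # never tried this move, so the model cannot simulate it
        next_state, reward = model[(state, action)]
        future = lookahead_value(model, next_state, depth - 1, discount)
        best = max(best, reward + discount * future)
    return best


def plan(model, state, depth=5, discount=0.9):
    """Pick the action whose simulated future looks best."""
    def score(action):
        if (state, action) not in model:
            return float("-inf")
        next_state, reward = model[(state, action)]
        return reward + discount * lookahead_value(model, next_state, depth - 1, discount)

    return max(ACTIONS, key=score)
```

Calling `model = learn_model()` and then `plan(model, 0)` asks the planner for a move from the leftmost cell using only what it observed during random play. The search never calls `hidden_step`, which is the point of the sketch: planning runs entirely on the learned model.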

It's not unlike how a child — albeit one with infinite patience and Google's near-infinite computing budget — would learn to play a game.

What's next: MuZero's nimble efficiency means that it should have more success with real-world applications, and DeepMind says it's already using it to invent better video compression.

Other potential applications include protein design and automated vehicles.

The bottom line: Human beings' cognitive superpower is our ability to generalize from what we learn and use that to plan ahead. If AIs can fully master that skill, the possibilities could be endless.

