Oct 18, 2017

New AlphaGo AI learns without help from humans

A game of Go. Photo: Baona / iStock

DeepMind's latest iteration of AlphaGo, the artificial intelligence that beat world champion Go player Lee Sedol in 2016, can learn to play the ancient game without feedback from humans or data on their past plays, researchers report today in Nature. Instead, the new AlphaGo Zero started with only the rules of the game and learned from the outcomes of millions of games it played against itself, starting from random moves.

The score: After three days of training, the AI beat the original AlphaGo 100 to 0, and it came up with new moves in the process. The result showcases a decades-old idea called reinforcement learning and suggests that "AIs based on reinforcement learning can perform much better than those that rely on human expertise," writes computer scientist Satinder Singh in an accompanying article.

What it means: If AI can use reinforcement learning, that could be important in cases where large amounts of human expertise aren't available. But it isn't clear how well this strategy will generalize to other applications and problems, says the University of Washington's Pedro Domingos. Go, though more complex than chess, is a problem with defined rules, unlike the busy street, with its unpredictable pedestrians and ambiguous shadows, that a robot-controlled car might have to operate in.

What's new: AlphaGo's initial iteration was trained on a database of human Go games, whereas the newer AlphaGo Zero's artificial neural network takes only the current state of the game as input. Through trial and error, with feedback in the form of wins and losses, the AI learned how to play.

It then used that same network to choose its next move, whereas AlphaGo used a separate network. This reinforcement learning strategy, which was used extensively by AlphaGo as well, has its roots in psychology: the neural network learns from rewards, much as humans do.

The DeepMind researchers wrote: "the self-learned player performed much better overall, defeating the human-trained player within the first 24h of training. This suggests that AlphaGo Zero may be learning a strategy that is qualitatively different to human play."
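As a rough illustration of learning purely from self-play and win/loss feedback, here is a minimal sketch on a toy game, not on Go, and not DeepMind's method. It assumes a simple version of Nim (players alternate taking 1 to 3 stones, and whoever takes the last stone wins) and swaps the neural network for a plain lookup table. The function names, the table, and all parameter values are hypothetical choices made for this example.

```python
import random

MOVES = (1, 2, 3)  # stones a player may take on one turn (toy Nim rules, assumed)


def legal_moves(stones):
    return [m for m in MOVES if m <= stones]


def best_move(q, stones):
    # Greedy choice: the move with the highest learned value for this pile size.
    return max(legal_moves(stones), key=lambda m: q.get((stones, m), 0.0))


def train_self_play(games=20000, max_pile=10, epsilon=0.2, alpha=0.1, seed=0):
    """Learn move values from self-play, using only the final win/loss signal."""
    rng = random.Random(seed)
    q = {}  # (stones_left, stones_taken) -> estimated outcome for the player moving

    for _ in range(games):
        stones = rng.randint(1, max_pile)
        history = []  # every (state, move) played; the two sides alternate
        while stones > 0:
            if rng.random() < epsilon:
                move = rng.choice(legal_moves(stones))  # explore: try something random
            else:
                move = best_move(q, stones)  # exploit what has been learned so far
            history.append((stones, move))
            stones -= move

        # The player who made the last move took the last stone and won.
        # Walk backwards through the game, flipping the sign each turn so that
        # the winner's moves are rewarded (+1) and the loser's are penalized (-1).
        reward = 1.0
        for state_move in reversed(history):
            old = q.get(state_move, 0.0)
            q[state_move] = old + alpha * (reward - old)
            reward = -reward
    return q


if __name__ == "__main__":
    table = train_self_play()
    for pile in (5, 6, 7):
        print(pile, "stones -> take", best_move(table, pile))
```

Nobody tells the program which moves are good. With these settings the table should come to prefer moves that leave the opponent a multiple of four stones, which is the known winning strategy for this toy game.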

How they did it: AlphaGo Zero uses less computing power than earlier versions, but Google's immense computing power was still key. The sheer number of games the AI can play against itself is an advance, says Domingos, author of the book The Master Algorithm.

He points out, though, that the roughly 5 million games of self-play it took for AlphaGo Zero to beat AlphaGo are "vastly more" than the number of games Lee Sedol had played to become a champion.

Recent work suggests simpler forms of learning could achieve similar goals. A paper published earlier this year by OpenAI showed how a technique similar to hill-climbing, in which the AI starts with a candidate solution and then makes small tweaks to improve it, can solve Atari games, though those are simpler than Go.
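For readers curious what "start with a solution, then make small tweaks" looks like in practice, here is a minimal sketch of plain hill-climbing. It is not the OpenAI technique, which the article describes only as similar. The scoring function, target values, and parameter settings below are made up for illustration.

```python
import random


def score(params):
    # Hypothetical objective: higher is better, with a peak at (3.0, -1.0).
    target = (3.0, -1.0)
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def hill_climb(start, steps=5000, step_size=0.1, seed=0):
    """Keep a single candidate and accept a random tweak only if it scores higher."""
    rng = random.Random(seed)
    best = list(start)
    best_score = score(best)
    for _ in range(steps):
        # Small random change to every parameter.
        candidate = [p + rng.uniform(-step_size, step_size) for p in best]
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


if __name__ == "__main__":
    print(hill_climb([0.0, 0.0]))
```

In a game-playing setting, the parameters would be the settings of a policy and the score would be how well it plays. The loop itself stays the same.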
