Jun 14, 2018

An AI learns to predict a scene from just one image

The AI system operating in maze environments with partial information. Credit: DeepMind

A machine learning system from Google's DeepMind can collect snapshots of a 3D scene taken from different angles and then predict what that environment will look like from a viewpoint it hasn't seen before, according to research published today in Science.

The big picture: Researchers want to create AIs that can build models of the world from data they've seen and then use those models to function in new environments. That capability could take an AI from the realm of learning about a space to understanding it — much the same way humans do — and is key to developing machines that can move autonomously through the world. (Think: driverless cars.)

The context: Computer vision, spurred by the availability of data and increased computing power, has advanced rapidly in the past six years. Many of the underlying algorithms learn via supervision: an algorithm is given a large dataset that is labeled with information (for example, about the objects in a scene) and uses it to predict an output.

“Supervised learning has been super successful but it’s unsatisfying for two reasons. One, humans need to manually create the [training] datasets, which is expensive and they don’t capture everything. And two, it is not the way infants or higher mammals learn.”
— Ali Eslami, study author and researcher at DeepMind

Instead, researchers want to train machines to learn from unlabeled inputs, without any guidance from a human, and then to apply or transfer what they learn to new scenarios and tasks.

How it works: The system uses a pair of images of a virtual 3D scene taken from different angles to create a representation of the space. A separate “generation” network then predicts what the scene will look like from a different viewpoint it hasn’t seen before.

  • After being trained on millions of images, the generative query network (GQN) could use one image to determine the identity, position and color of objects, as well as shadows and other aspects of perspective, the authors wrote.
  • That ability to understand the scene's structure is the "most fascinating" part of the study, wrote the University of Maryland's Matthias Zwicker, who wasn't involved in the research.
  • The DeepMind researchers also tested the AI in a maze and reported the network can accurately predict a scene with only partial information.
  • The GQN could also control a virtual robotic arm, guiding it to reach a colored object in a scene.
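
A rough sketch of that two-stage data flow, for readers who think in code: one network turns observed images and their viewpoints into a scene representation, and a separate one predicts the view from a new angle. This is a toy illustration, not DeepMind's implementation. The class and function names, the vector sizes, the random untrained weights and the choice to combine observations by summing them are all assumptions made for the example.

```python
import math
import random

IMAGE_SIZE = 16  # toy "image": 16 pixel intensities in [0, 1] (assumed size)
VIEW_SIZE = 3    # toy viewpoint: (x, y, yaw) (assumed layout)
REPR_SIZE = 8    # length of the scene representation vector (assumed size)


def make_weights(rows, cols, rng):
    """Random, untrained weights; a real system would learn these from data."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vector):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


class ToySceneModel:
    """Hypothetical stand-in for a representation network plus a generation network."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.enc = make_weights(REPR_SIZE, IMAGE_SIZE + VIEW_SIZE, rng)
        self.gen = make_weights(IMAGE_SIZE, REPR_SIZE + VIEW_SIZE, rng)

    def represent(self, observations):
        """Encode each (image, viewpoint) pair, then sum the encodings.

        Summing is an assumption of this sketch; it makes the result
        independent of the order in which the snapshots arrive.
        """
        scene = [0.0] * REPR_SIZE
        for image, viewpoint in observations:
            inputs = list(image) + list(viewpoint)
            code = [math.tanh(x) for x in linear(self.enc, inputs)]
            scene = [s + c for s, c in zip(scene, code)]
        return scene

    def generate(self, scene, query_viewpoint):
        """Predict pixel intensities for a viewpoint that was not observed."""
        logits = linear(self.gen, list(scene) + list(query_viewpoint))
        return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Usage: two snapshots from different angles, then a query from a third angle.
rng = random.Random(1)
observations = [
    ([rng.random() for _ in range(IMAGE_SIZE)], (0.0, 0.0, 0.0)),
    ([rng.random() for _ in range(IMAGE_SIZE)], (1.0, 0.0, 1.57)),
]
model = ToySceneModel()
scene = model.represent(observations)
prediction = model.generate(scene, (0.5, 1.0, 3.14))
```

Because the weights are random, the predicted "image" is meaningless. The point is only the shape of the pipeline: snapshots go in, a compact description of the scene comes out, and that description plus a new viewpoint is enough to produce a prediction.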

Yes, but: These are relatively simple virtual environments and "it remains unclear how close [the researchers'] approach could come to understanding complex, real-world environments," Zwicker writes.

Harvard's Sam Gershman told MIT Technology Review the GQN still solves only the narrow problem of predicting what a scene looks like from a different angle. According to the article:

"Gershman says it’s unclear whether DeepMind’s approach could be adapted to answer more complex questions or whether some fundamentally different approach might be required."

The challenges: Eslami says it took a couple of months to train the network. “We really were pushing the hardware available to us to its limits. We need a step up in hardware capabilities and the techniques to build these deep neural networks and train them.”

Go deeper: Read more about the various ways researchers are trying to design AI to work like the human brain.
