Jun 14, 2018

An AI learns to predict a scene from just one image

The AI system operating in maze environments with partial information. Credit: DeepMind

A machine learning system from Google's DeepMind can collect snapshots of a 3D scene taken from different angles and then predict what that environment will look like from a viewpoint it hasn't seen before, according to research published today in Science.

The big picture: Researchers want to create AIs that can build models of the world from data they've seen and then use those models to function in new environments. That capability could take an AI from the realm of learning about a space to understanding it — much the same way humans do — and is key to developing machines that can move autonomously through the world. (Think: driverless cars.)

The context: Computer vision — spurred by the availability of data and increased computing power — has rapidly advanced in the past six years. Many of the underlying algorithms largely learn via supervision: an algorithm is given a large dataset that is labeled with information (for example, about the object in a scene) and uses it to predict an output.

“Supervised learning has been super successful but it’s unsatisfying for two reasons. One, humans need to manually create the [training] datasets, which is expensive and they don’t capture everything. And two, it is not the way infants or higher mammals learn.”
— Ali Eslami, study author and researcher at DeepMind

Instead, researchers want to train machines to learn from unlabeled inputs that they process without any guidance from a human, and then to be able to apply or transfer what they learn to other new scenarios and tasks.

How it works: The system uses a pair of images of a virtual 3D scene taken from different angles to create a representation of the space. A separate “generation” network then predicts what the scene will look like from a different viewpoint it hasn’t seen before.

  • After being trained on millions of images, the generative query network (GQN) could use one image to determine the identity, position and color of objects as well as shadows and other aspects of perspective, the authors wrote.
  • That ability to understand the scene's structure is the "most fascinating" part of the study, wrote the University of Maryland's Matthias Zwicker, who wasn't involved in the research.
  • The DeepMind researchers also tested the AI in a maze and reported the network can accurately predict a scene with only partial information.
  • The GQN could also control a virtual robotic arm to reach a colored object in a scene.
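
To make the two-network idea concrete, here is a toy sketch in Python. It is not DeepMind's code: the random linear maps stand in for trained neural networks, and every name and size (`represent`, `generate`, `IMG_SIZE` and so on) is hypothetical. Summing the per-image codes into one scene vector is an assumption chosen here because it makes the result independent of the order in which the images arrive.

```python
import random

# Toy sizes, chosen only for illustration.
IMG_SIZE = 4   # an "image" is a flat list of 4 pixel values
VIEW_SIZE = 2  # a "viewpoint" is an (x, y) camera position
REP_SIZE = 3   # length of the scene representation vector


def make_weights(rows, cols, rng):
    """Random matrix standing in for a trained network's parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vector):
    """Multiply a weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def represent(observations, rep_weights):
    """Encode each (image, viewpoint) pair and sum the codes into one scene vector."""
    scene = [0.0] * REP_SIZE
    for image, viewpoint in observations:
        code = linear(rep_weights, image + viewpoint)
        scene = [s + c for s, c in zip(scene, code)]
    return scene


def generate(scene, query_viewpoint, gen_weights):
    """Predict the image seen from a new viewpoint, given the scene vector."""
    return linear(gen_weights, scene + query_viewpoint)


rng = random.Random(0)
rep_weights = make_weights(REP_SIZE, IMG_SIZE + VIEW_SIZE, rng)
gen_weights = make_weights(IMG_SIZE, REP_SIZE + VIEW_SIZE, rng)

# Two snapshots of the same scene taken from different camera positions.
observations = [
    ([0.1, 0.2, 0.3, 0.4], [0.0, 1.0]),
    ([0.4, 0.3, 0.2, 0.1], [1.0, 0.0]),
]
scene = represent(observations, rep_weights)
predicted = generate(scene, [0.5, 0.5], gen_weights)  # viewpoint not seen before
print(len(scene), len(predicted))
```

With untrained weights the predicted pixels are meaningless. The point is only the data flow: several views go in, one compact scene description comes out, and that description plus a new camera position is all the generator sees.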

Yes, but: These are relatively simple virtual environments and "it remains unclear how close [the researchers'] approach could come to understanding complex, real-world environments," Zwicker writes.

Harvard's Sam Gershman told MIT Technology Review the GQN still solves only the narrow problem of predicting what a scene looks like from a different angle. According to the article:

"Gershman says it’s unclear whether DeepMind’s approach could be adapted to answer more complex questions or whether some fundamentally different approach might be required."

The challenges: Eslami says it took a couple of months to train the network. “We really were pushing the hardware available to us to its limits. We need a step up in hardware capabilities and the techniques to build these deep neural networks and train them.”

Go deeper: Read more about the various ways researchers are trying to design AI to work like the human brain.
