Oct 12, 2019

Training real AI with fake data

Illustration: Aïda Amer/Axios

AI systems have an endless appetite for data. For an autonomous car's camera to identify pedestrians every time — not just nearly every time — its software needs to have studied countless examples of people standing, walking and running near roads.

Yes, but: Gathering and labeling those images is expensive and time-consuming, and in some cases impossible. (Imagine staging a huge car crash.) So companies are teaching AI systems with fake photos and videos, sometimes also generated by AI, that stand in for the real thing.

The big picture: A few weeks ago, I wrote about the synthetic realities that surround us. Here, the machines that we rely on now, or may rely on soon, are also learning inside their own simulated worlds.

How it works: Software that has been fed tons of human-labeled photos and videos can deduce the shapes, colors and movements that correspond, say, to a pedestrian.

  • But there's an ever-present danger that the car will come across a person in a setting unlike any it's seen before and, disastrously, fail to recognize them.
  • That's where synthetic data can fill the gap. Computers can generate millions of scenes that an actual car might not experience, even after a million driving hours.

What's happening: Startups like Landing.ai, AI.Reverie, CVEDIA and ANYVERSE can create super-realistic scenes and objects for AI systems to learn from.

  • Nvidia and others make synthetic worlds for digital versions of robots to play in, where they can test changes or learn new tricks to help them navigate the real world.
  • And autonomous vehicle makers like Waymo build their own simulations to train or test their driving software.

Synthetic data is useful for any AI system that interacts with the world — not just cars.

  • In health care, made-up data can substitute for sensitive information about patients, mirroring characteristics of the population without revealing private details.
  • In manufacturing, "if you're doing visual inspection on smartphones, you don't have a million pictures of scratched smartphones," says Andrew Ng, founder of Landing.ai and former AI head of Google and Baidu. "If you can get something to work with just 100 or 10 images, it breaks open a lot of new applications."
  • In robotics, it's helpful to imitate hard-to-find conditions. "It's very expensive to go out and vary the lighting in the real world, and you can't vary the lighting in an outdoor scene," says Mike Skolones, director of simulation technology at Nvidia. But you can in a simulator.
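
To make the lighting point concrete, here is a toy sketch in Python. It is an illustration only, not how Nvidia's simulators or any product named here works: real simulators render full 3D scenes, while this just brightens or darkens a tiny grayscale image. The function names (`relight`, `lighting_variants`) and the gain and offset ranges are assumptions chosen for the example.

```python
import random


def relight(image, gain, offset=0):
    """Scale and shift every pixel's brightness, clamped to 0..255."""
    return [
        [max(0, min(255, round(p * gain + offset))) for p in row]
        for row in image
    ]


def lighting_variants(image, count, seed=0):
    """Return `count` copies of `image` under randomly chosen lighting."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    variants = []
    for _ in range(count):
        gain = rng.uniform(0.3, 1.7)   # assumed range: dim through very bright
        offset = rng.uniform(-20, 20)  # assumed range: overall shadow or haze
        variants.append(relight(image, gain, offset))
    return variants


# A tiny 2x3 grayscale "photo" standing in for one real labeled image.
photo = [[10, 120, 250], [40, 200, 90]]
variants = lighting_variants(photo, count=5)
```

Each variant shows the same object, so whatever label a person attached to the original photo still applies. That is how one labeled image can become several training examples.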

"We're still in the early days," says Evan Nisselson of LDV Capital, a venture firm that invests in visual technology.

  • But, he says, synthetic data keeps getting closer to reality.
  • Generative adversarial networks — the same AI technology that drives most deepfakes — have helped vault synthetic data to new heights of realism.
