Illustration: Lazaro Gamio/Axios
Self-driving technology is hard — so hard that even the industry front-runner is showing its cards to try to get more brainpower on the problem.
Driving the news: Waymo announced Wednesday it's sharing what is believed to be one of the largest troves of self-driving vehicle data ever released in the hope of accelerating the development of automated vehicle technology.
- "The more smart brains you can get working on the problem, whether inside or outside the company, the better," says Waymo principal scientist Drago Anguelov.
Why it matters: Data is a critical ingredient for machine learning, which is why, until recently, companies developing automated driving systems treated their testing data as a closely guarded asset.
- But there's now a growing consensus that sharing that information publicly could help get self-driving cars on the road faster.
What's happening: The idea is to eliminate what has been a major roadblock for academia — a lack of relevant research data.
- Aptiv, Argo and Lyft have released maps and images collected via cameras and lidar sensors.
- Now, even Waymo — the market leader, with more than 10 million autonomous test miles — is opening up its digital vault.
Context: On any given day, an AV can collect more than 4 terabytes of raw sensor data, but not all of that is useful, Navigant Research analyst Sam Abuelsamid writes in Forbes.
- During testing, a safety driver typically oversees the vehicle's operation, while an engineer with a laptop in the passenger seat notes interesting encounters or challenging scenarios.
- At the end of the day, all the sensor data from the vehicle is downloaded. The "good stuff," as Abuelsamid calls it — encounters with pedestrians, cyclists, animals, traffic signals and more — is analyzed and labeled.
- It's a labor-intensive process, as the New York Times described in a story this week.
- Humans (lots and lots of humans, NYT notes) must label and annotate all the data by hand so the AI system can understand what it's "seeing" before it can begin learning.
- People pore over images of street scenes, drawing digital boxes around and adding labels to things that are important to know, like: This is a pedestrian, a stroller, a double yellow line.
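What that looks like: A hand-drawn label can be pictured as a small record holding a category name plus the corners of a box. The sketch below is purely illustrative. The `BoxLabel` class, its field names and the pixel values are hypothetical and are not the format used by Waymo or any other company named here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxLabel:
    """One hand-drawn box on a camera image (hypothetical schema)."""
    category: str   # e.g. "pedestrian" or "stroller"
    x_min: float    # left edge, in pixels
    y_min: float    # top edge, in pixels
    x_max: float    # right edge, in pixels
    y_max: float    # bottom edge, in pixels

    def area(self) -> float:
        """Box area in square pixels; zero if the corners are inverted."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


# Made-up labels for a single street-scene image.
frame_labels = [
    BoxLabel("pedestrian", 410, 220, 460, 360),
    BoxLabel("stroller", 470, 290, 530, 365),
]

# Tally how many objects of each category the annotator marked.
counts = Counter(label.category for label in frame_labels)
print(counts)
```

Every object in every frame needs a record like this, which is why the work takes so many people.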
Between the lines: The data that Waymo is releasing is particularly rich, collected from 1,000 driving scenes in 25 cities including Phoenix, San Francisco, Mountain View and Kirkland, Washington. Even so, it still amounts to just 5.5 hours of driving time.
- Each segment captures 20 seconds of continuous driving in 360-degree footage from 5 lidar and 5 camera sensors, giving researchers the opportunity to develop algorithms to track and predict the behavior of other road users.
- Each of the scenes has been painstakingly labeled — 13 million labels in all.
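By the numbers: The 5.5-hour figure follows from the scene count and segment length. The quick check below assumes each of the 1,000 scenes is exactly one 20-second segment and treats the 13 million labels as a round number, so the per-segment and per-second averages are rough estimates, not figures published by Waymo.

```python
# Figures taken from the article.
NUM_SEGMENTS = 1_000          # driving scenes released
SECONDS_PER_SEGMENT = 20      # length of each segment
TOTAL_LABELS = 13_000_000     # rounded label count

# Assumption: one scene equals one 20-second segment.
total_seconds = NUM_SEGMENTS * SECONDS_PER_SEGMENT
total_hours = total_seconds / 3600

# Rough averages implied by the rounded label count.
labels_per_segment = TOTAL_LABELS / NUM_SEGMENTS
labels_per_second = TOTAL_LABELS / total_seconds

print(f"{total_hours:.2f} hours of driving")
print(f"{labels_per_segment:,.0f} labels per segment on average")
print(f"{labels_per_second:,.0f} labels per second of footage on average")
```

That works out to about 5.56 hours, in line with the roughly 5.5 hours cited, and about 13,000 labels for every 20-second clip.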
The intrigue: Waymo's move shows how high the stakes have grown, says Gartner analyst Mike Ramsey.
- "I don’t think they are worried about anyone catching them. They’re probably more worried about whether they can make this work, or if anyone can. Is this even doable after 10 years of working on this?"