
A driverless car's view of the road. Photo: David McNew/Getty
Once crowdsourced for pennies on platforms like Amazon Mechanical Turk, labeling data for AI is swiftly becoming a hugely lucrative market — with much of the work done in places with cheap labor like China, India and Malaysia.
Why it matters: Labeling is a necessary step for algorithms that learn from enormous troves of examples. A system that's seen a million cat photos, hand-labeled as such, will be able to identify the million-and-first.
Details: The global market for AI data labeling is predicted to explode from $150 million in 2018 to more than $1 billion by the end of 2023, according to research company Cognilytica.
- It's tedious work: Imagine spending all day at a screen just highlighting stop signs in images taken by autonomous vehicles.
- Workers abroad who label data for Alegion, a Texas-based crowdsourcing platform, earn between $3 and $6 an hour, Alegion CEO Nathaniel Gates told IEEE Spectrum.
- But it's pitched as an economic boost for rural areas, because the work can employ large numbers of people without much formal education.
- For a driverless car, "one hour of video data can lead up to 800 man-hours of work," Siddharth Mall, co-founder of data-labeling outfit Playment, told India's Factor Daily.