Aiming to close what it calls a "data divide," Microsoft on Tuesday announced a plan to make more data widely available so the benefits of artificial intelligence aren't confined to a few large companies.
Why it matters: Machine learning has the potential to make governments and countries far more efficient but often requires an enormous amount of data, in addition to the necessary computing power.
What they're saying: "Fully half of all of the data created, every day, on the internet, is flowing to only 100 companies," Microsoft President Brad Smith said in an interview.
Smith said that those same companies, left unchecked, would be the beneficiaries of AI, while most others would likely fall behind: "Fundamentally, it's going to accrue to a handful of companies on the West Coast of the United States and the East Coast of China."
As part of the push, Microsoft is:
- Publishing new principles that will guide how the company approaches sharing its data with others.
- Pledging to develop, by 2022, 20 new collaborations built around shared data. It's already working with the Open Data Institute and NYU's GovLab.
- Investing in tools and templates that make it easier for other companies to share data.
- In particular, Microsoft talks about making its data from various social good projects available.
Yes, but: While Microsoft is pledging to share data around issues like health and the environment, it and other companies are unlikely to share their most proprietary data sets, which will likely generate most of the profits in the AI era.
What's next: Open data efforts are getting the most interest right now from companies in Europe and the U.S., though Smith said eventually he'd like to see a truly global effort. However, he acknowledged it could take longer to get Chinese companies to join in.
- "Just as Microsoft was a late adopter of open source, I'm not sure I expect China to be an early endorser of this," Smith said.
Of note: Smith said the coronavirus had an effect on the open data project, as it has on everything else, delaying the announcement by about a week.
- "Like everybody, we asked ourselves, 'Are we going to stick with this or set it aside until after COVID-19?'" Smith said. "In the year 2030, we will almost certainly be spending a lot more time talking about the need for open data than we will be talking about COVID-19."