Apr 4, 2019

The data trash-pile blame game

Illustration: Aïda Amer/Axios

The tale of the latest Facebook data spill, announced Wednesday by security outfit Upguard, has a new twist: No one is shouldering responsibility for the half a billion user records that were exposed on a public server.

Driving the news: The story broke yesterday when Upguard reported it had found two troves of Facebook user data sitting on publicly accessible Amazon Web Services S3 "buckets" — cloud storage containers used mostly by backend programmers.

  • The first belonged to a Mexico-based media company, Cultura Colectiva, and contained 540 million user records.
  • The second contained a backup by the apparently defunct Facebook app "At the Pool," which, though much smaller, held more sensitive information, including plain-text passwords for 22,000 users.

The data originated with Facebook. But Facebook maintains that fault lies not with its own practices but rather with the developers of the apps that carelessly stored their data.

  • "Facebook's policies prohibit storing Facebook information in a public database," the company told Axios. "Once alerted to the issue, we worked with Amazon to take down the databases. We are committed to working with the developers on our platform to protect people's data."

The data lived on Amazon's cloud servers. But Amazon says that responsibility for securing data stored with it lies with the companies that put it there.

  • AWS's S3 is like the internet's data warehouse. The programmers behind many widely used apps and services rely on it as a cheap, flexible, on-demand source of storage. S3 buckets are private by default, but some are made publicly accessible so users can download data directly.
  • "AWS customers own and fully control their data," Amazon said. "When we receive an abuse report concerning content that is not clearly illegal or otherwise prohibited, we notify the customer in question and ask that they take appropriate action, which is what happened here."

The data was held by the app makers.

  • But one of them, "At the Pool," seems to be out of business.
  • The other, Cultura Colectiva, offered no apology in a statement that circulated among journalists on Twitter, but said that the data it was storing came from "the fanpages we manage" and was "public, not sensitive," information.

Our thought bubble: Everything these companies say may be correct, but none of it is satisfying.

  • App makers who are aggregating hundreds of millions of data points about their users owe it to them to protect the resulting databases from random downloaders, not leave them out like a stagnant data dump.
  • Amazon may reasonably let its customers decide whether data should be public or private. But it could also take more proactive measures to alert storage users about what information they've exposed to the whole internet. The world's biggest digital landlord has a role in cleaning up the dumps on its property.
  • Facebook rightly points out that it has tightened up its policies on sharing user data with app makers since last year's Cambridge Analytica debacle. But its lax privacy practices leaked user information to other companies for years. Incidents like this one will continue to erode public trust in Facebook until the company creates something like a digital Superfund to help clean up the messes it has made.

The bottom line: Facebook and Google have turned user data into advertising gold. But that data can also end up as garbage left out on the net in abandoned "buckets" for mischief-makers and criminals to pilfer. When that happens, "not our fault" won't reassure anyone.
