May 30, 2019

Medical AI has a big data problem

Illustration: Rebecca Zisser / Axios

Facing increasingly overworked doctors and labyrinthine insurance systems, hospitals are searching for a lifeline in AI systems that promise to ease hard diagnoses and treatment decisions.

Reality check: The data underpinning the very first systems is often spotty, volatile and completely lacking in critical context, leading to a poor early record in the field.

The big picture: Basic clinical decision support (CDS) systems have been around for decades, but a skepticism of technology leads many doctors to ignore or override them. Now, experts say a nascent generation of CDS — infused with AI in academic labs and startups — may reduce the estimated 40,000–80,000 deaths a year that result from medical errors.

  • The grand vision: Researchers hope AI programs can point doctors toward the best medications, lab tests or treatment plans based on minute patterns discovered in huge numbers of patients' past experiences.
  • Last week, we reported on the promise of combining pools of private data to strengthen AI systems, feeding them with ever more examples of past outcomes.
  • This helps solve the quantity issue. But data quality — a constant struggle in health care — remains an enormous threat to medical AI.

The big problem: Record keeping is so bad that doctors laugh when you ask about it.

  • Electronic medical records, central to CDS predictions, are notoriously error-ridden. Doctors fill them with generic diagnosis codes, and 82% of any given record is likely to have been copied or imported from elsewhere.
  • "Filling out documentation right is something that most physicians don't care about," says Jonathan Chen, a doctor and professor of biomedical informatics at Stanford.

Other quirks of health data create more problems for CDS systems:

  • Records reaching back even a year or two become useless for predicting future trends because of how quickly treatments shift. A study led by Chen found that medical data has a usefulness half-life of just 4 months.
  • Hospital readmission rates are a gold standard for measuring whether a particular treatment worked, but the statistic misses more important outcomes like patient happiness and long-term health.
  • Together, all this means that if doctors and researchers are not careful, "you'll end up learning patterns that are either obvious or you'll completely misinterpret what they mean in potentially dangerous ways," says Chen.
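
To put the 4-month half-life in perspective, here is a minimal sketch that assumes usefulness decays exponentially with a record's age. That decay model is an illustrative assumption, not a description of how Chen's study measured it, and the function name `relevance_remaining` is hypothetical.

```python
def relevance_remaining(age_months: float, half_life_months: float = 4.0) -> float:
    """Fraction of predictive usefulness left in a record of the given age.

    Assumes simple exponential decay (an illustrative assumption): the
    fraction halves every `half_life_months`.
    """
    if age_months < 0 or half_life_months <= 0:
        raise ValueError("age must be non-negative and half-life positive")
    return 0.5 ** (age_months / half_life_months)


# Compare records that are four months, one year and two years old.
for age in (4, 12, 24):
    print(f"{age:>2} months old: {relevance_remaining(age):.1%} remaining")
```

Under that assumption, a one-year-old record keeps 12.5% of its original usefulness and a two-year-old record keeps about 1.6%, which is consistent with the point that records even a year or two old lose most of their predictive value.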

It's the oldest problem in data science: garbage in, garbage out.

  • "We are at a decided disadvantage because our core electronic record is so pitiful," says Eric Topol, director of the Scripps Research Translational Institute.
  • The messy reality of medical records tripped up IBM's much-ballyhooed Watson AI system when it was deployed at a Texas cancer hospital. "[T]he acronyms, human errors, shorthand phrases, and different styles of writing" were too much to handle, Stat News reported in 2017. An IBM executive today blamed data quality for past AI failures.

What's next: The Food and Drug Administration, which currently doesn't review most CDS systems, is considering policy changes that could head off some data issues. Scientists are pushing the agency to impose strict benchmarks and audits to prevent mistakes.

Go deeper: What your hospital knows about you
