Oct 15, 2019

Machine learning can't flag false news, new studies show

Current machine learning models aren't yet up to the task of distinguishing false news reports from true ones, two new papers by MIT researchers show.

The big picture: After other researchers showed that computers can convincingly generate made-up news stories without much human oversight, some experts hoped that the same machine-learning-based systems could be trained to detect such stories. But MIT doctoral student Tal Schuster's studies show that, while machines are great at detecting machine-generated text, they can't identify whether stories are true or false.

Details: Many automated fact-checking systems are trained using a database of statements labeled true or false, called Fact Extraction and Verification (FEVER).

  • In one study, Schuster and his team showed that fact-checking systems trained with machine learning struggled to handle negative statements ("Greg never said his car wasn't blue") even when they could tell the corresponding positive statement was true ("Greg says his car is blue").
  • The problem, say the researchers, is that the database is filled with human bias. The people who created FEVER tended to write their false entries as negative statements and their true statements as positive statements, so the computers learned to rate sentences with negative statements as false (a toy illustration follows this list).
  • That means the systems were solving a much easier problem than detecting fake news. "If you create for yourself an easy target, you can win at that target," said MIT professor Regina Barzilay. "But it still doesn't bring you any closer to separating fake news from real news."
  • Both studies were headed by Schuster with teams of MIT collaborators.
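
To make that failure mode concrete, here is a minimal sketch, assuming scikit-learn and a handful of hypothetical training claims, of how a classifier can latch onto negation words as a spurious cue for falsehood. It illustrates the bias the researchers describe; it is not their experimental setup.

    # Toy illustration (not the researchers' setup): when negated claims
    # in the training data are mostly labeled false, a bag-of-words
    # classifier learns "negation => false" rather than anything factual.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical claims mimicking the skew described in the study:
    # negative phrasing correlates with the "false" label.
    claims = [
        "Greg says his car is blue",         # labeled true
        "The bridge opened in 1937",         # labeled true
        "The film won the award",            # labeled true
        "Greg never said his car was blue",  # labeled false
        "The bridge did not open in 1937",   # labeled false
        "The film never won the award",      # labeled false
    ]
    labels = [1, 1, 1, 0, 0, 0]  # 1 = true, 0 = false

    model = make_pipeline(CountVectorizer(), LogisticRegression())
    model.fit(claims, labels)

    # A true statement phrased negatively tends to be scored false,
    # because the strongest learned cue is the word "never", not facts.
    print(model.predict(["Greg never said his car wasn't blue"]))
    # likely output: [0] (i.e., rated false)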

The bottom line: The second study showed that machine-learning systems do a good job of detecting stories that were machine-written, but a poor job of separating the true ones from the false ones.

Yes, but: While automated text generation makes it more efficient to churn out bogus news stories, not every machine-written story is untrue.
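
That split between detectable authorship and undetectable truth can be made concrete with a common proxy: machine-generated text tends to score unusually low perplexity under a language model. The sketch below assumes the Hugging Face transformers library and GPT-2, not anything from the MIT papers; it scores how "model-like" a passage is, and nothing in the score speaks to accuracy.

    # Sketch of a perplexity-based detector for machine-generated text
    # (an illustrative proxy, not the method used in the MIT studies).
    # Assumes: pip install torch transformers
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        # Lower perplexity means the language model finds the text more
        # predictable, a hallmark of machine-generated prose.
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(enc.input_ids, labels=enc.input_ids)
        return torch.exp(out.loss).item()

    # The score reflects how machine-like the text reads, not whether
    # it is true: a true sentence and a false one can score the same.
    print(perplexity("The mayor announced a new bridge project today."))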

Go deeper

The hidden costs of AI


In the most prestigious AI conferences and journals, AI systems are judged largely on their accuracy: How well do they stack up against human-level translation, vision, or speech?

Yes, but: In the messy real world, even the most accurate programs can stumble and break. Considerations that matter little in the lab, like reliability or computing and environmental costs, are huge hurdles for businesses.

Go deeper (Oct 26, 2019)

In AI we trust — too much

AI systems intended to help people make tough choices — like prescribing the right drug or setting the length of a prison sentence — can instead end up effectively making those choices for them, thanks to human faith in machines.

How it works: These programs generally offer new information or a few options meant to help a human decision-maker choose more wisely. But an overworked or overly trusting person can fall into a rubber-stamping role, unquestioningly following algorithmic advice.

Go deeper (Oct 19, 2019)

Analytics company Databricks raises $400 million in Series F

Databricks, a San Francisco-based data analytics SaaS company, raised $400 million in Series F funding led by Andreessen Horowitz at a $6.2 billion valuation.

Why it matters: Because this deal can legitimately lay claim to all the hottest enterprise software buzzwords, from open source to machine learning to cloud.

Go deeper (Oct 23, 2019)