Aug 19, 2020 - Technology

How an AI grading system ignited a national controversy in the U.K.

Illustration: Eniola Odetunde/Axios

A huge controversy in the U.K. over an algorithm used to substitute for university-entrance exams highlights problems with the use of AI in the real world.

Why it matters: From bail decisions to hate speech moderation, invisible algorithms are increasingly making recommendations that have a major impact on human beings. If they're seen as unfair, what happened in the U.K. could be the start of an angry pushback.

What's happening: Every summer, hundreds of thousands of British students sit for advanced-level qualification exams, known as A-levels, which help determine which students go to which universities.

  • Because of the coronavirus pandemic, however, the British government canceled A-levels this year. Instead, the government had teachers give an estimate of how they thought their students would have performed on the exams.
  • Those predicted grades were then adjusted by Ofqual, England's regulatory agency for exams and qualifications, using an algorithm that weighted the scores based on the historic performance of individual secondary schools.
  • The idea was that the algorithm would compensate for teachers' tendency to inflate their students' expected performance and more accurately predict how test-takers would actually have performed. (A simplified sketch of this kind of adjustment follows below.)
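
Public reporting on Ofqual's method suggests it rank-ordered students within each school and mapped them onto the school's historical grade distribution. The snippet below is a minimal Python sketch of that general idea, not Ofqual's actual model; the function, the grade shares, and the student names are all hypothetical.

```python
# Toy illustration (NOT Ofqual's actual model): map teacher-ranked students
# onto a school's historical grade distribution. All names and numbers
# below are hypothetical.

from typing import List

GRADES = ["A*", "A", "B", "C", "D", "E", "U"]  # A-level grades, best to worst

def adjust_grades(teacher_rank_order: List[str],
                  historical_share: List[float]) -> List[str]:
    """Assign each student, ranked best to worst by teachers, the grade
    implied by the school's past distribution.

    teacher_rank_order: student names, best first.
    historical_share: fraction of past students at each grade in GRADES
                      (should sum to ~1.0).
    """
    n = len(teacher_rank_order)
    results = []
    cumulative = 0.0
    grade_idx = 0
    for i, _student in enumerate(teacher_rank_order):
        # Advance to the grade band this student's rank percentile falls in.
        percentile = (i + 1) / n
        while (grade_idx < len(GRADES) - 1
               and percentile > cumulative + historical_share[grade_idx]):
            cumulative += historical_share[grade_idx]
            grade_idx += 1
        results.append(GRADES[grade_idx])
    return results

# A strong student at a school with a weak grade history gets capped:
print(adjust_grades(
    ["Asha", "Ben", "Cleo", "Dan"],          # hypothetical students, best first
    [0.0, 0.0, 0.25, 0.50, 0.25, 0.0, 0.0],  # school history: mostly B-D
))
# ['B', 'C', 'C', 'D']
```

Note how even the top-ranked student's result is bounded by the best grade the school has historically produced, which is exactly the dynamic that disadvantaged high achievers complained about.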

The catch: It didn't quite work out that way.

  • When students received their predicted A-level results last week, many were shocked to discover that they had "scored" lower than they had expected based on their previous grades and performance on earlier mock exams.
  • For some, the algorithm-weighted results meant that they were now ineligible for the university programs they had expected to attend, potentially altering the course of their future careers.
  • Around 40% of the predicted performances were downgraded, while only 2% of marks increased. The biggest victims were students with high grades from less-advantaged schools, who were more likely to have their scores downgraded, while students from richer schools were more likely to have their scores raised.

Of note: The BBC reported that among the students hurt by the algorithmic scoring was an 18-year-old who last year won an award for writing a dystopian short story about an algorithm that sorts students into bands based on economic class.

Be smart: The testing algorithm essentially reinforced the economic and societal bias built into the U.K.'s schooling system, leading to results that a trio of AI experts called in The Guardian "unethical and harmful to education."

  • Accusations of algorithmic bias have also been aimed at the International Baccalaureate (IB), which canceled its exams as well and used a similar model to predict results, one that likewise seemed to penalize students from historically lower-performing schools.

What's new: After days of front-page controversies, the British government on Monday abandoned the algorithmic results, instead deciding to accept teachers' initial predictions.

Yes, but: British students may be relieved, but the A-level debacle showcases major problems with using algorithms to predict human outcomes. "It's not just a grading crisis," says Anton Ovchinnikov, a professor at Queen's University's Smith School of Business who has written about the situation. "It's a crisis of data abuse."

  • To students on the outside, the algorithms used to adjust their grades appeared to be an unexplained black box — a frequent concern with AI systems. It wasn't clear how students could appeal predicted scores that often made little sense.
  • Putting what Ovchinnikov calls a "disproportionately large weight" on schools' past performance meant that students, especially those from disadvantaged backgrounds, lost the chance to be treated as individuals. That matters because scoring high on A-levels and attending an elite university arguably represents one of the best opportunities a person has to improve their lot in life.
  • To avoid such disasters in the future, authorities need to "be more inclusive and diverse in the process of creating such models and algorithms," says Ed Finn, an associate professor at Arizona State University and the author of "What Algorithms Want."

"AI systems are easy to scale. But if there's a problem, it's also easier to duplicate that problem."
— Anton Ovchinnikov

The bottom line: Bias, positive and negative, is a fact of human life — a fact that AI systems are often meant to counter. But poorly designed algorithms risk entrenching a new form of bias that could have impacts that go well beyond university placement.
