Oct 26, 2017

Computers are learning to recognize letters like we do

Illustration: Lazaro Gamio / Axios

One of the ways computers distinguish humans from robots is with CAPTCHAs — that little box with a weird letter combination at the bottom of your online ticket or other transaction. Researchers report they've now trained a computer to solve CAPTCHAs using less data than other AIs by borrowing the human brain's approach to the problem.

The big picture: This isn't about cracking CAPTCHAs. It is part of a much larger effort to create AIs that use principles of the human brain to solve visual tasks, such as telling a cat from a dog with only a few examples to go on. AI pioneer Geoffrey Hinton recently told Axios he suspects the current approach of giving computers lots of rules and labeled data is limited, and that researchers should look again to the brain to make material advances in AI. This is a step in that direction.

What's new: Computers can solve these tests if given enough examples of the images. (Google and others have retired text-based CAPTCHAs for that reason.) But if the letters are crowded together or the CAPTCHA changes in another way, the AI may need to be retrained to spot the variations, whereas humans can easily recognize a letter in many forms.

"There is a wide variety of things that a human would call an 'A' and understand as an 'A' that is very hard to get into a computer," says Dileep George, who led the research at San Francisco Bay Area AI company Vicarious.

Instead of feeding the algorithm a large dataset containing images of the many variations of CAPTCHA letters, George and his colleagues gave the computer examples of letters that it then broke into parts — the intersections of lines and different contours that make up the shapes of A and N, for example — like the human brain does. Knowing the parts and how they together construct letters, the researchers' model could use those features to identify letter variations it hadn't seen before.

The results:

  • Their model could solve reCAPTCHAs with 66.6% accuracy using five training examples per character. Humans do it with 87.4% accuracy. Deep learning approaches achieve higher accuracy than the new model, but the researchers say theirs is more robust because it models the actual shape of the letter and can therefore generalize to recognize letters it hasn't seen before.
  • They tested the model's ability to recognize text in real-world images against a deep learning approach. For about the same or higher accuracy, the deep learning algorithm used roughly 300 times more data.
  • The new algorithm could take a character and create plausible variations of it. "I'm not sure that those examples would be indistinguishable from examples produced by people but it is definitely grasping some important structure in what makes up those concepts of those letters," says NYU's Brenden Lake, who works on similar problems but wasn't involved in this research.

What it means: The researchers hope it sparks a broader return to neuroscience for some AI inspiration.

"Your brain is not an unstructured neural network. Genetics wires some amount of structure into it," says George. "Biology has put a scaffolding in our brain that is suitable for working with this world. It makes the brain learn quickly in our world. So we copy those insights from nature and put it in our model. Similar things can be done in neural networks."

CAPTCHAs are just one benchmark for AI. "As a problem in itself if you have an algorithm that can break CAPTCHAs that's great but it's an application that not everybody needs. Whereas object recognition is something that our minds do every second of every day and it is also a core application domain for technology companies and in products that we use and so on," says Lake.

The big question: Cognitive scientist and philosopher Douglas Hofstadter has said that the toughest challenge for AI researchers is to answer the question: What are the letters 'A' and 'I'?

"Building systems modeled after the brain is a long-term process and we have done only part of the work. Much more work remains to be done if we are to scale these systems to do things that deep learning is good at doing now," says George.
