The food scientist who manipulated data to make headlines
Brian Wansink, a food scientist at Cornell University who gained fame for his headline-ready research, manipulated his poor data to produce false positives and attention-grabbing results, writes Stephanie M. Lee for Buzzfeed. Wansink is under investigation by Cornell for academic misconduct, and several of his papers have been retracted.
Why you should care: Wansink isn’t the only scientist to use sneaky statistics — they're not uncommon in food science, genetics and other fields. But Buzzfeed’s reporting shows that Wansink, one of the most well-known offenders, is also possibly one of the worst.
“He hadn’t really looked at the results critically and he was trying to make the paper say something that wasn’t true,” Krissika Kaipanen, a former graduate student in Wansink’s lab, told Lee. “That’s when I started feeling like, this is not the kind of research I want to do.”
What they did:
- Gathering the data first, and then developing a hypothesis that fit that data. In science, you should have a hypothesis first and then design an experiment to test it.
- “That was weird also,” Kaipanen told Lee, “to come up with some questions not based on any theory, just ‘What would be cool to ask?’, ‘What cool headlines could we get if we got some associations?’”
- Slicing large datasets and ‘p-hacking’ to get statistically significant results.
It’s unclear if Wansink did this with malicious intent. His former lab members, even those who criticize his science, speak of him fondly. In fact, although Wansink’s science has been scrutinized for years, it wasn’t until he described his own research practices in a blog post that the criticism rolled in.
Sound smart: P-hacking is the practice of running many analyses on a dataset and reporting only the ones that come out statistically significant. It works because a lot of statistics in science comes down to the p-value, which is the probability of seeing results at least as extreme as yours purely by chance, if there were no real effect. It is not the probability that the finding itself is false. Many studies need a p-value below 0.05 to get published, meaning a result that chance alone would produce less than 5% of the time. But that means studies like Wansink’s that sometimes test thousands of variables will inevitably find associations that aren’t true.
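To see why testing many variables almost guarantees a "finding," here is a minimal Python sketch. It is an illustration only, not Wansink's analysis or anything from Buzzfeed's reporting. It assumes every variable is pure random noise with a known variance of 1 and uses a simple z-test; the function names are made up for this example.

```python
import math
import random

ALPHA = 0.05  # the conventional significance threshold


def two_sided_p_value(group_a, group_b):
    """Two-sided z-test p-value for a difference in means.

    Assumption for this sketch: both groups have a known variance of 1.
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    standard_error = math.sqrt(1 / len(group_a) + 1 / len(group_b))
    z = (mean_a - mean_b) / standard_error
    # For a standard normal, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def count_false_positives(n_variables, n_per_group, rng):
    """Test many variables that are pure noise; count how many look 'significant'."""
    hits = 0
    for _ in range(n_variables):
        # Both groups come from the same distribution, so any "effect" is chance.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p_value(group_a, group_b) < ALPHA:
            hits += 1
    return hits


def chance_of_at_least_one(n_tests, alpha=ALPHA):
    """Probability that at least one of n independent null tests passes the threshold."""
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    rng = random.Random(0)
    hits = count_false_positives(n_variables=1000, n_per_group=50, rng=rng)
    print(f"'Significant' results among 1000 noise variables: {hits}")
    print(f"Chance of at least one false positive in 20 tests: {chance_of_at_least_one(20):.2f}")
```

With 1,000 noise variables, roughly 5% should clear the 0.05 bar by chance alone, and even 20 independent tests give better-than-even odds of at least one false positive. Report only the hits, and noise starts to look like a discovery.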
Go deeper: Read the entire story at Buzzfeed, and their prior reporting on Wansink. FiveThirtyEight’s Christie Aschwanden has a great round-up of what p-hacking is, and why it’s so easy to find spurious correlations in food science (and certain types of genetics studies and social science studies).