This week, I was thrilled to read about the first well-documented case of explicit academic fraud in the artificial intelligence community. I hope that this is the beginning of a trend, and that other researchers will be inspired by their example and follow up by engaging in even more blatant forms of fraud in the future.
Explicit academic fraud is, of course, the natural extension of the sort of mundane, day-to-day fraud that most academics in our community commit on a regular basis. Trying that shiny new algorithm out on a couple dozen seeds, and then only reporting the best few. Running a big hyperparameter sweep on your proposed approach but using the defaults for the baseline. Cherry-picking examples where your model looks good, or cherry-picking whole datasets to test on, where you’ve confirmed your model’s advantage. Making up new problem settings, new datasets, new objectives in order to claim victory on an empty playing field. Proclaiming that your work is a “promising first step” in your introduction, despite being fully aware that nobody will ever build on it. Submitting a paper to a conference because it’s got a decent shot at acceptance and you don’t want the time you spent on it to go to waste, even though you’ve since realized that the core ideas aren’t quite correct.
The problem with this sort of low-key fraud is that it’s insidious, it’s subtle. In many ways, a fraudulent action is indistinguishable from a simple mistake. There is plausible deniability – oh, I simply forgot to include those seeds, I didn’t have enough compute for those other ablations, I didn’t manage to catch that bug. It’s difficult to bring ourselves to punish a well-meaning grad student for something that could plausibly have been a simple mistake, so we let these things slide, and slide and slide until they have become normalized. When standards are low, it’s to no individual’s advantage to hold themselves to a higher bar. Newcomers to the field see these things, they learn, and they imitate. Often, they are directly encouraged by mentors. A graduate student who publishes three papers a year is every professor’s dream, so strategies for maximal paper output become lab culture. And when virtually every lab endorses certain behaviors, they become integral to the research standards of the field.
But worst of all: because everybody is complicit in this subtle fraud, nobody is willing to acknowledge its existence. Who would be such a hypocrite as to condemn in others behaviors they can see clearly in themselves? And yet, who is willing to undermine their own achievements by admitting that their own work does not have scientific value?¹ The sad result is that, as a community, we have developed a collective blind spot around a depressing reality: even at top conferences, the median published paper contains no truth or insight. Any attempts to highlight or remedy the situation are met with harsh resistance from those who benefit from the current state of affairs. The devil himself could not have designed a better impediment to humanity’s progression.
But now that blatant academic fraud is in the mix, the AI community has a fighting chance. By partaking in a form of fraud that has left the Overton window of acceptability, the researchers in the collusion ring have finally succeeded in forcing the community to acknowledge its blind spot. For the first time, researchers reading conference proceedings will be forced to wonder: does this work truly merit my attention? Or is its publication simply the result of fraud?
It would, of course, be quite difficult to actually distinguish the papers published fraudulently from those published “legitimately”. (That fact alone tells you all you really need to know about the current state of AI research.) But the mere possibility that any given paper was published through fraud forces people to engage more skeptically with all published work. Readers are forced to act more like reviewers, weighing the evidence presented against their priors, attempting to identify ways in which surprising conclusions could be the result of fraud – explicit or subtle – rather than fact. People will apply additional scrutiny to deal with explicit forms of fraud like collusion rings, but in doing so will also develop a sensitivity to the more subtle forms of fraud that are already endemic to the community. This will, in turn, put pressure on authors to produce work which can withstand such scrutiny: results obtained without any fraud at all, leading to publications with genuine scientific merit.
That same harsh light is also cast on ourselves, on our motivations. This situation seems to have evoked in many researchers feelings of empathy. These actions are understandable; such an occurrence was inevitable; everyone does this, just more discreetly. We are not surprised that people behave unscientifically in order to get their papers published; we are surprised that someone was willing to take it this far. The most notable thing about the collusion ring is not that it was more fraudulent than is typical, but that the fraud was more intentional.
This surfaces the fundamental tension between good science and career progression buried deep at the heart of academia. Most researchers are to some extent “career researchers”, motivated by the power and prestige that rewards those who excel in the academic system, rather than idealistic pursuit of scientific truth. Well-respected senior figures are no exception to this. (In fact, due to selection effects, I suspect most are actually more career-motivated than is typical.) We must come to terms with the fact that, since the incentives for career progression are not perfectly aligned with good science, almost any action motivated by career progression will interfere with one’s ability to do good science. We must encourage a norm of introspection, of probing one’s own motivations, where any decisions made to obtain science-adjacent benefits are viewed with the deepest suspicion. And we must ensure that explicit suggestions to modify one’s science in the service of one’s career – “you need to do X to be published”, “you need to publish Y to graduate”, “you need to avoid criticizing Z to get hired” – carry social penalties as severe as a suggestion of plagiarism or fraud.
Prof. Littman says that collusion rings threaten the integrity of computer science research. I agree with him. And I am looking forward to the day they make good on that threat. Undermining the credibility of computer science research is the best possible outcome for the field, since the institution in its current form does not deserve the credibility that it has. Widespread fraud would force us to re-strengthen our community’s academic norms, transforming the way we do research, and improving our collective ability to progress humanity’s knowledge.
So this is a call to action: please commit more academic fraud.
Blatant fraud. Aggressive fraud. Form more collusion rings! Blackmail your reviewers, bribe your ACs! Fudge your results – or fabricate them entirely! (But don’t skimp on the writing: your paper needs to be written in perfect academic English, formatted nicely, and tell an intuitive, plausible-sounding story.) Let’s make explicit academic fraud commonplace enough to cast doubt into the minds of every scientist reading an AI paper. Overall, science will benefit.
Together, we can force the community to reckon with its own shortcomings, and develop stronger, better, and more scientific norms. It is a harsh treatment, to be sure – a chemotherapy regimen that risks destroying us entirely. But this is our best shot at destroying the cancer that has infected our community. I believe with all my heart that we can make it through this challenge and emerge stronger than ever. And when we do, the secrets to artificial intelligence will be waiting.
Thanks for reading. Follow me on Substack for more writing, or hit me up on Twitter @jacobmbuckman with any feedback or questions!
Many thanks to Nitarshan Rajkumar for his feedback when writing this post.
¹ I am. This paper is bullshit, this paper (a NeurIPS oral) is bullshit, this paper is complete bullshit, this paper is mostly good science but also has a sprinkling of bullshit. Apologies to my co-authors. (And by the way: all of these papers have full code releases, and are 100% “reproducible”.) ↩