Fernandez-Mulligan on Clayton, 'Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science'
Aubrey Clayton. Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science. New York: Columbia University Press, 2021. 368 pp. $23.99 (e-book), ISBN 978-0-231-55335-3; $34.95 (cloth), ISBN 978-0-231-19994-0.
Reviewed by Sebastian Fernandez-Mulligan (Yale University) Published on H-Sci-Med-Tech (June, 2022) Commissioned by Penelope K. Hardy (University of Wisconsin-La Crosse)
Printable Version: https://www.h-net.org/reviews/showpdf.php?id=57708
“Science and statistics are overdue for a reckoning,” prophesies Aubrey Clayton in his new book, Bernoulli’s Fallacy: Statistical Illogic and the Crisis of Modern Science (p. 241). The reckoning Clayton foresees rides toward us on the back of the reproducibility crisis embroiling many scientific disciplines. As a 2016 survey by the scientific journal Nature showed, only 24 percent of the surveyed scientists had ever published a successfully reproduced result in their careers.[1] In the pages of journals large and small, much ink has been spilled debating this challenge. Clayton joins the growing rumble of voices asking, “What do we do?”
Bernoulli’s Fallacy enters the conversation with a clear plan: orthodox statistics is founded on a logical fallacy, and to move beyond its errors and the crisis it has caused, we must employ Bayesian methods. “Consider this ... a piece of wartime propaganda,” Clayton writes (p. xv). For those readers who have little to no experience with Bayesian inference, this book is not only a successful introduction but also a call to battle, and it arrives armed to the teeth. To make his argument, Clayton canvasses a series of math problems dating from the eighteenth to the twenty-first centuries and traces the development of the modern tools used in null hypothesis statistical testing. In each case, he attempts to unpack the quantitative and logical errors in the orthodox calculations, and he works through the Bayesian solutions to the same problems.
The first two chapters set the stage for the nonmathematical reader with a clear and approachable introduction to the technical terminology. In chapter 1, Clayton asks, “What is probability?” and surveys four responses: classical, frequentist, subjective, and axiomatic. He describes how the frequentist answer, that probability is inferred from the set of observed samples, forms the foundation for modern orthodox statistics. Moving on to chapter 2, Clayton defines the book’s eponymous fallacy. He outlines Jakob Bernoulli’s classic theorem, the Law of Large Numbers, which says that the frequency of a result observed in a sample approaches the true probability of that result as the number of samples taken approaches infinity. Constructing a hypothetical case of a factory producing multicolored candies, Clayton deftly shows how inference from sampling fails to reproduce the true distribution of candy colors in certain cases.
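For readers who want the theorem Clayton paraphrases in symbols, Bernoulli’s weak Law of Large Numbers can be glossed in modern notation (this rendering is my own, not the review’s or necessarily the book’s): if S_n counts the occurrences of a result in n independent trials, each with true probability p, then for any tolerance ε > 0,

```latex
\lim_{n \to \infty} P\!\left( \left| \frac{S_n}{n} - p \right| < \varepsilon \right) = 1 .
```

The fallacy Clayton names lies in running this statement backward: the theorem guarantees that sample frequencies converge to a known probability, not that an observed frequency by itself licenses a conclusion about the unknown probability without a prior.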
If the approach fails to produce accurate results, how did it come to dominate the field? The next two chapters explain just that, tracing the ascent of the frequentist interpretation of probability and the development of modern statistical tools. Chapter 3 shows how methods in astronomy and geodesy were exported to social research by Adolphe Quetelet. Clayton argues that during Quetelet’s fervent application of normal distributions to observables in society, the idea of probability shifted to favor a frequentist interpretation, though he does not make clear exactly why or when. The fourth chapter, distastefully titled “The Frequentist Jihad,” has little to do with the stereotype Clayton invokes. In it, he covers the complete victory of the frequentist interpretation over alternatives in the late nineteenth and early twentieth centuries. He studies the work of Karl Pearson, Francis Galton, and Ronald Fisher and shows how the tools they invented, such as the chi-squared test and Student’s t-test, were developed to support their eugenicist aims. Clayton argues that the frequentist interpretation cloaked eugenicist conclusions in an aura of objectivity, simultaneously enhancing the professional positions of these researchers and the epistemic position of frequentist statistics.
The concluding third of the book brings the reader to the reproducibility problems of the modern day. The fifth chapter takes the reader through nine examples, ranging from real-world legal cases to common statistical paradoxes, in which frequentist analysis fails to reproduce commonsense conclusions. Clayton then solves the same problems with the Bayesian procedure, obtaining much more reasonable results. He highlights that Bayesianism always follows the same logical process and argues that this procedure is a form of probabilistic reasoning. Following this definition, he moves on in chapter 6 to take on the problem of reproducibility in the sciences. Clayton succinctly outlines the problems that arose when the p-value became a blind standard for journal publication. Focusing on a few publications that met p-value thresholds but were eventually disproven, Clayton shows how Bayesian priors can be applied to the same data to reach different conclusions from the outset. In the final chapter, Clayton argues that the replication crisis is rooted in the mathematical methods scientists use. If the frequentist mission is rotten to the core, as Clayton believes, it must be replaced by Bayesian tools. Of Bayesian probability, he writes, “Probability is ultimately nothing more than a codification of our ability to reason with less than perfect information” (p. 303).
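The core of the argument about p-values and priors can be made concrete with a small calculation. The sketch below is my own illustration, not an example from the book: it applies Bayes’ theorem to ask how likely a hypothesis is to be true given a “statistically significant” result, once one accounts for a prior on how often tested hypotheses are true. The specific numbers (a 10 percent prior, 80 percent power, the conventional 0.05 threshold) are hypothetical.

```python
# Hypothetical illustration of the Bayesian critique of p-value thresholds:
# a significant result does not by itself make the hypothesis probable.
def posterior_true(prior, power, alpha):
    """P(hypothesis true | result significant), via Bayes' theorem.

    prior: fraction of tested hypotheses that are actually true
    power: P(significant result | hypothesis true)
    alpha: P(significant result | hypothesis false), the false-positive rate
    """
    true_positive = power * prior
    false_positive = alpha * (1 - prior)
    return true_positive / (true_positive + false_positive)

# With only 10% of tested hypotheses true, 80% power, and alpha = 0.05,
# a significant result still leaves roughly a 1-in-3 chance of a false finding.
print(round(posterior_true(prior=0.10, power=0.80, alpha=0.05), 2))  # 0.64
```

The point of the exercise, in Clayton’s spirit, is that the same data (a result clearing the 0.05 bar) supports very different conclusions depending on the prior, which frequentist reporting leaves out.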
Ultimately, Bernoulli’s Fallacy is more of a wake-up call to the scientists of the present than an exhaustive musing on the mathematicians of the past. Clayton outlines a fascinating proposal—to look at the residue left on mathematical equations by their historical context. However, the discussion of eighteenth-century mathematics is largely isolated from the broader historiography of the period; Clayton sticks to biographical information on the mathematicians before briskly moving on to paraphrase the Law of Large Numbers in the language and symbols of the modern day. Furthermore, the historical connection Clayton draws between the eugenicists' desire for objectivity and the rise of frequentist statistics in chapter 4 leaves several questions pertaining to historical causality unanswered and would benefit from engaging with scholarship on the concept of objectivity at that time. Overall, the historical information tends to serve as an introduction for the statistical case studies Clayton unpacks and does not carry much analytical weight itself. I also worry that by using modern-day notation to describe eighteenth-century mathematics Clayton is ignoring the impact of notation on mathematical meaning and is projecting a contemporary viewpoint onto his cases.
This should not deter the reader who wishes to understand the current state of statistics in the sciences. The jargon of scientific statistics is a boundary that normally keeps nontechnical readers out of the conversation. Clayton expertly describes statistical concepts normally locked behind the terminology of p-values, t-tests, priors, and posteriors. Through well-chosen examples, Clayton demystifies these terms and invites the reader to participate in the critical conversation raging in the sciences. Bernoulli’s Fallacy will be of use to readers of any mathematical background who wish to understand not only the math but also the motivations behind the rising Bayesian wave. It is a vivid, nontechnical look at the bees in the contemporary statistician’s bonnet.
Note

1. Monya Baker, "1,500 Scientists Lift the Lid on Reproducibility," Nature 533 (2016): 452-54.
Citation: Sebastian Fernandez-Mulligan. Review of Clayton, Aubrey, Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science. H-Sci-Med-Tech, H-Net Reviews. June, 2022. URL: https://www.h-net.org/reviews/showrev.php?id=57708

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.