H-Diplo/ISSF Forum on "What We Talk About When We Talk About Nuclear Weapons."

H-Diplo | ISSF Forum, No. 2 (2014): "What We Talk About When We Talk About Nuclear Weapons."


H-Diplo/ISSF Editors:  James McAllister and Diane Labrosse
Introduction by Scott D. Sagan

Published by H-Diplo/ISSF on 15 June 2014
“Two Renaissances in Nuclear Security Studies,” Introduction by Scott D. Sagan, Stanford University

“What We Talk About When We Talk About Nuclear Weapons:  A Review Essay,” by Francis J. Gavin, Frank Stanton Chair in Nuclear Security Policy Studies, MIT

Response:  “The Case for Using Statistics to Study Nuclear Security,” by Matthew Fuhrmann, Texas A&M University, Matthew Kroenig, Georgetown University, and Todd S. Sechser, University of Virginia

Response: “Nuclear Weapons Are (Still) Poor Instruments of Blackmail: A Reply to Francis J. Gavin’s Critique” by Todd S. Sechser, University of Virginia and Matthew Fuhrmann, Texas A&M University

“A Superior Theory of Superiority,” Response by Matthew Kroenig, Georgetown University

“Archives and the Study of Nuclear Politics” by Hal Brands, Duke University

“An Apology for Numbers in the Study of National Security . . . if an apology is really necessary” by Erik Gartzke, University of California, San Diego

“The Use and Abuse of Large-n Methods in Nuclear Studies” An Essay by Vipin Narang, MIT

“Two Renaissances in Nuclear Security Studies,” Introduction by Scott D. Sagan, Stanford University

“Quality is better than quantity, especially in large numbers.”  Anon.

Over the past decade, two intellectual renaissances have emerged in the field of nuclear security studies. The first is in political science, where exciting new research has been published about such important subjects as the causes of nuclear weapons proliferation, the linkages between the growth of civilian nuclear power and the spread of nuclear weapons, deterrence and compellence theory and practice, and the consequences of new states acquiring atomic arsenals. A second renaissance is occurring in history, as new archives have opened up and scholars are studying such important subjects as Cold War crises, the evolution of international institutions such as the Treaty on the Non-Proliferation of Nuclear Weapons (NPT) and the International Atomic Energy Agency (IAEA), and the history of medium powers and smaller states that decided to pursue or decided to stop pursuing nuclear weapons.

These two scholarly renaissances, however, have largely developed in completely separate spheres, or on parallel tracks at best, with little interchange between historians and political scientists. This is deeply unfortunate, for creative multidisciplinary research can significantly improve our understanding of complex technical, historical, and political phenomena such as the causes and consequences of nuclear weapons proliferation. During the golden age of nuclear strategy in the 1950s and 1960s, for example, when many of our theories about nuclear weapons were first developed, the breadth and diversity of scholars engaged in the field were stunning. Political scientist Bernard Brodie, economist Thomas Schelling, mathematician Albert Wohlstetter, physicist Herman Kahn, and historian Roberta Wohlstetter each produced seminal contributions about nuclear weapons and strategic stability, the danger of surprise attacks, and the possibility of arms control that created both important public policy debates in Washington and personal debates in the hallways of RAND.[1] These debates on key security issues significantly improved both the quality of each individual’s scholarship and the collective policy relevance of academic research for the U.S. government.

In contrast, today, the vigorous debates and intellectual cross-fertilization that enhanced earlier nuclear scholarship are missing. Both political scientists and historians too often publish only in their own disciplinary journals, attend only their own professional conferences, care only about policy implications of their narrow findings, and only engage in debates with members of their own academic tribes. Robert Jervis, James McAllister, and Francis Gavin are therefore to be thanked for putting together this H-Diplo forum and for encouraging dialogue across the disciplinary divide.

This H-Diplo forum is a most welcome exchange of views about the strengths and weaknesses of different approaches to nuclear scholarship. My introduction to the forum has three sections. First, I will briefly describe some of the trends I see emerging in the new political science and history scholarship on the effects of nuclear weapons on international politics. Second, I will briefly outline the major points made by the contributors to this lively forum. Third, I will discuss how historians and political scientists can interact and contribute to, rather than simply critique, each other’s work more effectively.

In the decades after the end of the Cold War, many political science students and scholars turned their attention to studying civil wars, insurgency, and terrorism, with far less research conducted on nuclear issues. Over the past decade, however – sparked in part by real-world policy concerns about North Korea, Iran, nuclear terrorism, and global disarmament – much new research has been published on nuclear weapons issues. This H-Diplo Forum focuses mostly on new nuclear security literature using large-N statistical methods,[2] but the renaissance in political science work on nuclear issues is much broader in focus and more diverse in terms of methodology than this admittedly important emerging strand of the literature. New nuclear weapons research in political science includes important case-study work examining the domestic political and psychological determinants of proliferation,[3] normative and constructivist analyses of states’ and individuals’ nuclear identity and ethical taboos,[4] new game theoretic models of proliferation and preventive war decisions,[5] and the use of public opinion survey experiments.[6] (Some of the contributions to this H-Diplo forum label the large-N statistical work as “the quantitative approach” to nuclear studies, but this is misleading since other commonly used methods – such as game theory and survey experiments – are also clearly quantitative in nature.) Some nuclear scholars – Vipin Narang and Matthew Kroenig, for example – use multiple methods, combining detailed case studies with quantitative tests to determine both broad correlations between variables and the importance of causal mechanisms.[7]

The renaissance in nuclear security studies among historians also displays considerable diversity regarding methods and approach. Important books by Marc Trachtenberg and Francis Gavin, for example, primarily use American archives to illuminate the evolution of U.S. nuclear strategy and high-level international diplomacy during the Cold War.[8] Other scholars have focused on social factors and domestic politics to examine nuclear decision making in different countries, such as Matthew Jones’ book about race and nuclear weapons threats in Asia, Sasha Polakow-Suransky’s work on Israeli-South African nuclear cooperation, and Tsuyoshi Hasegawa and his collaborators’ work on the influence of national perceptions and misperceptions on Cold War crises.[9] Some scholars – Timothy Naftali and Aleksandr Fursenko are a prominent example – formed effective partnerships to conduct joint research in Soviet and American archives.[10] A broad and strong group of diplomatic and international historians have conducted multi-archival research to explore the creation of international treaties such as the NPT and international disputes over the spread of civilian nuclear power technology.[11] Matthew Connelly creatively organized a large consortium of younger scholars to pool intellectual perspectives and archival data to understand the history of U.S. intelligence estimates regarding nuclear weapons use and proliferation.[12] In a related development, historians, anthropologists, and sociologists using approaches developed in science, technology, and society (STS) studies have produced stunningly innovative work on missile accuracy, the Indian nuclear power and weapons programs, the fire and blast effects of nuclear weapons use, and the links between nuclear technology and development in Africa.[13]

These are exciting developments in political science and history, but it is surprising to see how rarely work in one discipline influences work in the other. This H-Diplo forum is therefore an important and pioneering effort of cross-fertilization. Historian Francis Gavin provides a detailed and critical assessment of some of the recent large-N statistical studies on deterrence and compellence. He argues that misunderstandings about specific historical cases, especially the 1961 Berlin Crisis, have created deep flaws in the specific articles he reviews here, and are also illustrative of fundamental weaknesses in the large-N social science approach to nuclear studies more generally.

The political scientists who are the targets of Gavin’s critical review – specifically Matthew Fuhrmann, Matthew Kroenig, and Todd S. Sechser – respond to Gavin’s critiques and defend their use of statistical methods both individually and in a joint ‘united front’ essay. (That Gavin succeeds in creating a unified response from these three scholars is impressive in and of itself since Kroenig, Fuhrmann, and Sechser previously engaged in a lively debate among themselves about the issues in dispute.[14] Once again, the diplomatic adage ‘the enemy of my enemy is my friend’ appears to ring true.) 

Historian Hal Brands then discusses some of the new archival materials that have been made available on nuclear issues, outlines some of his concerns about how political scientists code cases, and identifies areas in which historians and political scientists can usefully collaborate. Brands importantly calls for diplomatic and international historians to become more “theory-literate” and to be more willing to do comparative work in order to make accurate generalizations and to contribute more effectively to policy debates.  Finally, demonstrating that political scientists themselves have diverse views on the appropriate uses of large-N statistical methods to study nuclear weapons issues, Erik Gartzke provides a spirited defense of his work and that of other scholars using statistical regressions to test theories about the causes and consequences of nuclear proliferation, while Vipin Narang provides a sophisticated criticism of this scholarship for not being transparent about its own limitations and for too often overselling the robustness and magnitude of the results.

I will leave it to the readers to decide for themselves who ‘wins’ the various debates presented in this forum about what kinds of methods and evidence are best utilized to improve our understanding of nuclear politics. This forum should, however, be considered merely the start of a dialogue across disciplines. For historians and political scientists have much to offer each other. One way to gain the benefits of multidisciplinary research is to form joint research collaborations. Works produced by historians and political scientists working as co-authors on nuclear-related projects include, for example, my article with Jeremi Suri about Richard Nixon’s Madman nuclear alert in 1969 and Hal Brands and David Palkki’s co-authored work on Saddam Hussein’s nuclear ambitions.[15] But even without co-authoring, historians and political scientists could usefully improve each other’s scholarship.

Having been the recipient of one of Francis Gavin’s critical reviews in the past, I know how valuable it can be for political scientists and historians to debate carefully the accuracy of interpretations, the facts of individual cases, and the influence of hidden assumptions on our findings.[16] Yet I also see an unfortunate tendency, as exemplified by Gavin’s approach in this forum, for historians to focus their criticism on interpretations of the specific cases that they know well and accuse political scientists of getting crucial facts wrong and thus miscoding particular cases. This is a useful and fair criticism, of course, but it too easily enables political scientists to wiggle away from broader criticisms by claiming that even if one accepted that a single case was wrong, there are so many other cases represented in their data that one small change in coding does not hurt the robustness of their general findings. The nature of this kind of intellectual critique and the resulting defense is unfortunate, however, for historians can and should play even more important roles in helping to improve the research and findings of political science scholars.

Historians (and political scientists themselves) can most usefully critique political science scholarship in one of three ways. First, they can assess whether the theory or theories developed are logically consistent. Second, they can examine the accuracy of the empirical evidence to determine whether the correlations presented by authors are valid.  Third, they can dig into the details of cases – using what Alexander George calls “process tracing”[17] – to determine whether the factors and causal mechanisms posited as being influential in the theory are actually performing that function in the historical record.  

It is perhaps understandable that historians like Gavin mostly challenge interpretations of individual historical cases, in this instance the 1961 Berlin Crisis, in their critiques of political science. But because Gavin does not make an effort to demonstrate that Berlin is, as he claims, “the most important and most representative case” (see Gavin, 6, emphasis added) of the effects of the nuclear balance between adversaries on crisis outcomes, political scientists can easily claim, as they do repeatedly in this forum, that their findings are robust. This means that even if one accepts Gavin’s critique about this one case, the generalizations or findings based on a much larger set of cases in the database remain valid.

It is important that historians be much more than mere ‘fact checkers’ for political scientists, for they can usefully perform critical roles in contributing to all three of the kinds of critiques outlined above. Let me illustrate the point by focusing briefly on one of the studies that Gavin criticizes in his review: Matthew Kroenig’s 2013 International Organization article, “Nuclear Superiority and the Balance of Resolve: Explaining International Crisis Outcomes.” Kroenig theorizes that states with a larger number of nuclear warheads in their arsenal are more likely to win – that is, to achieve their central objectives – in crises with other nuclear weapons states. He clearly explains his argument – that leaders of superior states will have more resolve because they would suffer less in a nuclear exchange – and provides a formal model demonstrating the logical consistency of the argument. Kroenig even provides a folksy analogy to drive home the argument: “In more colloquial terms, the logic of the argument is that in a game of chicken between two cars on a collision course, one might expect the smaller car to swerve first, even if a crash would be disastrous for both.”[18]

At first glance, this does seem logical. But a historical sensibility would encourage a theorist to think differently about the logic in two ways. First, one would want to know if there is any reason to believe that statesmen in many of the cases knew whether their nuclear arsenal was larger or smaller than their competitors’ nuclear arsenals during crises. In some Cold War cases, such as the 1961 Berlin crisis or the 1962 Cuban Missile Crisis, the assumption that leaders were aware of the relative sizes of their arsenals seems reasonable. But I see no reason to believe that Indian or Pakistani leaders knew the size of their adversary’s nuclear arsenal when they tested nuclear weapons in 1998, during the 1999 Kargil war, or during the 2001 crisis after the terrorist attack on the Indian parliament. (Indeed, it is not even clear that Indian and Pakistani civilian leaders knew the size of their own nuclear arsenals during this period.) Second, one would want to examine whether the variable being measured is logically connected to the factor that the theory claims is of causal importance. Kroenig operationalizes nuclear superiority in a binary manner, coding the state that has the overall larger number of warheads as the superior state regardless of whether the difference in arsenal size is massively large or minutely small and regardless of whether the weapons are strategic (that is, could be used against the enemy’s homeland) or tactical battlefield weapons. But by what logic would the leaders of a state that has, for example, 4,000 nuclear weapons have more resolve than the leaders of a state that has 3,999 warheads? To extend the game of chicken analogy, by what logic would one expect the driver of a Mercedes E-500 with a larger hood ornament to have more resolve in a game of chicken than the driver of an otherwise identical E-500 with a smaller hood ornament?

This is not simply an abstract point.  Kroenig, for example, notes that the Soviet Union “won” the 1979 “Invasion of Afghanistan” crisis with the United States, and claims that this case supports his theory that the Soviet leadership had more resolve than U.S. leadership because the USSR had nuclear superiority.  In 1979, however, the database he uses shows that the Soviet arsenal had 27,935 nuclear warheads (strategic and tactical, deployed and not deployed) while the U.S. arsenal had “only” 24,107 warheads.  I leave it to the reader to determine whether there is a logical reason to think that Soviet leaders with such a “superior” nuclear arsenal in 1979 would find such an “advantage” in overall nuclear warhead numbers to be an important source of resolve in the Afghanistan crisis.[19]  
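The point about binary coding can be made concrete with a minimal sketch. It assumes only the rule described above (the state with more warheads is coded as superior) and uses the 1979 totals quoted above alongside the hypothetical 4,000 versus 3,999 comparison. The function names and the ratio measure are illustrative assumptions, not part of Kroenig's data set or replication materials.

```python
def binary_superiority(own_warheads: int, rival_warheads: int) -> int:
    """Return 1 if own arsenal is larger than the rival's, else 0."""
    return 1 if own_warheads > rival_warheads else 0


def warhead_ratio(own_warheads: int, rival_warheads: int) -> float:
    """Return own arsenal size as a multiple of the rival's (illustrative measure)."""
    return own_warheads / rival_warheads


# (label, larger arsenal, smaller arsenal)
cases = [
    ("USSR vs. U.S., 1979 (totals quoted in the text)", 27_935, 24_107),
    ("Hypothetical 4,000 vs. 3,999", 4_000, 3_999),
]

for label, larger, smaller in cases:
    coded = binary_superiority(larger, smaller)
    ratio = warhead_ratio(larger, smaller)
    print(f"{label}: binary code = {coded}, ratio = {ratio:.3f}")
```

Both pairs receive the same binary code, although the first reflects an advantage of roughly 16 percent in total warheads and the second a difference of a single warhead. A continuous measure such as the ratio at least preserves that distinction, though it still says nothing about whether the weapons are strategic or tactical.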

Another avenue for critiquing political science is to ask if the evidence is accurate, or in the case of large-N work, if the coding of the variables is done in a manner that is consistent with the known facts. Here, for example, rather than criticizing the coding of one or two cases, a stronger historical critique would examine whether there are systematic biases in the data that produce not one error, but a pattern of errors that push in one direction and would therefore make the findings less robust.[20] This requires that scholars jump into the details of the data, studying it the way a good historian studies the materials in an archive. (Here I should note a point of disagreement with Erik Gartzke’s claim that quantitative political science studies are far more transparent than historically-oriented “qualitative” work. This claim both exaggerates the extent to which large-N scholars put all relevant data in appendices, which would make replication easier, and underappreciates the extent to which historically-oriented political scientists seek to use “active citations” to enable others to replicate and assess their findings.[21])

Gavin nicely questions the coding used in political science databases that claims that the Soviet Union “won” the 1961 Berlin crisis. But again, that is just one case. A much larger and systematic bias in the widely-used coding contained in political science databases has been suggested by Yevgeniy Kirpichevsky and Phillip Lipscy in their research on regime type and crisis outcomes. Kirpichevsky and Lipscy present a case study of the Cuban Missile Crisis and argue that the secret deal over the Jupiter missiles suggests that the outcome (always coded as a U.S. victory) was not as one-sided as is widely believed.[22] They theorize (and provide some initial statistical tests of the theory) that democratic leaders are less willing than autocratic leaders to accept secret deals that mask hidden compromises as public defeats. If Kirpichevsky and Lipscy’s hypothesis is correct, our confidence in the traditional coding of crisis outcomes will be reduced and many findings about crisis management, democratic advantages, and coercive diplomacy will need to be reexamined due to systematic biases in data sets.
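The difference between one disputed case and a pattern of errors pushing in one direction can be sketched with simple arithmetic. Every number below is hypothetical and chosen only for illustration; none is drawn from any of the data sets discussed in this forum, and the function names are likewise invented for the sketch.

```python
def win_share(outcomes):
    """Share of crises coded as a win (1) for the nuclear-superior state."""
    return sum(outcomes) / len(outcomes)


def recode(outcomes, indices, new_value=0):
    """Return a copy of the outcomes with the listed cases recoded to new_value."""
    flagged = set(indices)
    return [new_value if i in flagged else o for i, o in enumerate(outcomes)]


# Hypothetical data set: 20 crises, 14 coded as wins for the superior state.
baseline = [1] * 14 + [0] * 6

# A single disputed case is recoded from win to non-win.
one_case = recode(baseline, [0])

# Five cases that share the same hypothetical bias are all recoded the same way.
pattern = recode(baseline, range(5))

for name, data in [("baseline", baseline), ("one case", one_case), ("pattern", pattern)]:
    print(f"{name}: win share = {win_share(data):.2f}")
```

In this toy setting, recoding one case moves the share of wins only slightly, which is why a defense based on robustness to a single correction is easy to mount. Recoding several cases in the same direction moves it substantially, which is the kind of effect a systematic bias in crisis outcome coding would produce.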

A third way to critique political science findings – conducting detailed case studies to assess whether the causal mechanisms deemed to be important are actually producing the outcomes – is what historians excel at. Predictably, Gavin’s commentary on Berlin and other crises is at its strongest in this dimension. But I hope that political scientists do not leave the crucial task of ‘process tracing’ to the historians, for social scientists should always, in my view, be concerned about causation, not just correlations. Here I share Gavin’s skepticism about whether Kroenig, Fuhrmann, and Sechser have “proven” their cases, for until they (or others) have studied the details of at least carefully selected cases, many of their “statistically robust” findings could still be utterly spurious.  

In addition, historians, with their emphasis on explaining both continuity and change, should be even better than political scientists at identifying discontinuities that can influence the ability to make accurate generalizations across space and time. The degree to which the development of nuclear weapons in 1945 changed the dynamics of international politics, of course, is one common subject of scholarly debate in this regard.  But there are other potential nuclear discontinuities that need to be studied with the historian’s sensitivity for seeing change and the political scientist’s penchant for discerning patterns.  To what degree, for example, did the creation of the NPT in 1968, by instituting international monitoring of nuclear facilities, produce a different dynamic in weapons proliferation behavior? In addition, the number of nuclear weapons states has also changed over time, but did the existence of more potential nuclear rivals alter the dynamics of deterrence or the difficulties of arms control agreements?  It is often said that each child is born into a different family.  In a similar way, every new nuclear state is born into a different nuclear system, which might alter both the complexity of balancing behavior in deterrent relationships and possible patterns of cooperation.  These kinds of questions will best be addressed in the future through a mixture of political science and historical methodologies.

This forum is a lively start to what promises to be an ongoing effort to improve the quality and the policy relevance of nuclear scholarship in the future. Indeed, there is enough heat generated in this exchange of opinions that one can confidently predict that the forum will spark further debates. This is all to the benefit of the nuclear security studies field, provided that historians and political scientists maintain tolerance of intellectual diversity and focus on the shared goals of providing a more accurate understanding of nuclear issues and more policy-relevant scholarship. Certainly, the field of nuclear security studies has sufficient complexity and importance to deserve a big tent.


Response to H-Diplo/ISSF Forum on “What We Talk About When We Talk About Nuclear Weapons.” http://issforum.org/ISSF/PDF/ISSF-Forum-2.pdf

Jayita Sarkar, Harvard University

Questioning the Sensitive: When Historians Study Nuclear Assistance

The dialogue initiated on this H-Diplo/ISSF Forum between political scientists and historians studying nuclear issues is a very welcome one. In many ways, it builds upon the thriving debates and discussions that we witnessed last October in Lost Pines, TX at the inaugural workshop of the Nuclear Studies Research Initiative, which brought together renowned scholars in history and political science as well as select graduate students engaging in interdisciplinary research. The debate could not have been timelier, not just for the two renaissances in nuclear studies that Scott Sagan mentions in his introduction but also owing to the dire need for a proliferation of historical research on issues of international security. What Frank Gavin identifies as an American problem with respect to history as a discipline (31) is probably a globally pervasive one because history departments worldwide tend to dedicate themselves more to the study of social and cultural phenomena than to security, strategy, and diplomacy. Unless there is an expanded effort by historians, especially younger ones, the nuclear studies renaissances risk becoming unsustainable in the long run.

A key challenge facing historians in this debate is probably rooted in the nature and character of historical research, which involves detailed causal and correlational analyses of a single or a small number of past phenomena. As a result, a historian is able to demonstrate possible faults with only one or a few cases of a large-N study in political science. This prevents the historian from successfully dismantling the problematic theoretical conclusions that emerge from that political science research. In the final analysis, not only does it prevent political scientists from revisiting their own work, but it also allows the propagation of mistaken theories that could mislead policymakers. Hence, the higher the number of historical studies on nuclear issues, the greater the possibility of meaningful engagement between historians and political scientists on this subject matter, and the greater the possibility of bridging the gulf between academia and policy.

As a young historian of nuclear proliferation, my concern involves Matthew Kroenig’s 2010 book Exporting the Bomb: Technology Transfer and the Spread of Nuclear Weapons (Cornell University Press), where a historical sensibility à la Gavin could have been useful.[1] Kroenig uses the term “sensitive nuclear assistance” to mean nuclear technology that aids the development of nuclear weapons, and tends to consider the definition of “sensitive” as constant.[2] Yet sensitive nuclear assistance as a concept evolved throughout much of the Cold War. Its dynamic definition itself sometimes constituted a nonproliferation tool for Washington, as in the latter half of the 1970s.[3]

Just as technology never remained static in the postindustrial world, neither did the constituents of nuclear technological assistance. What was celebrated in the 1950s and 1960s as technological innovation in nuclear research was controlled or phased out in later decades owing to its proliferation risks. This was the case, for instance, with gas-centrifuge technology in the latter part of the 1960s, and with breeder reactor technology in the late 1970s.[4] In the years preceding the finalization of the Nuclear Nonproliferation Treaty (NPT), the United States began to classify information concerning gas centrifuges.[5] During the Carter administration, when controlling the global stockpile of plutonium became a core U.S. nonproliferation strategy, plutonium-producing breeder reactors were actively discouraged by the United States.[6]

Moreover, resource controls in atomic energy – a key nonproliferation tactic especially in the early Cold War (before the term nonproliferation became prevalent) – also underwent substantial alterations. With the discovery of uranium in several parts of the world, like South Africa, Madagascar, Niger, Australia, and Kazakhstan, it was realized that this atomic energy-producing material was not as scarce as it had initially been thought to be.[7] As a result, the importance of thorium as the second-best choice for radioactive source material, after uranium, dwindled (except in India owing to its persistent uranium-scarcity), as did U.S. efforts to control access to it. This also contributed to a loss of industrial interest in the thorium nuclear fuel cycle in most countries, India being an exception.

In other words, the definition of sensitive nuclear assistance evolved owing to at least four main factors: first, the discovery of atomic energy-producing resource materials, notably uranium, in various parts of the world; second, innovation in nuclear technology and the subsequent discovery of its proliferation risks, e.g. gas centrifuges and breeders; third, the subversive tactics of second-tier proliferating states that demonstrated various exploitable loopholes in the prevailing nonproliferation instruments, for example, the ‘peaceful nuclear explosion’ and plutonium-producing breeder reactors in the case of the 1974 Indian nuclear test; and fourth, the use of the changing definition of ‘sensitive’ itself as an important tool of U.S. nonproliferation strategy, for example, the inclusion of several industrial components like compressors in the Zangger Committee’s ‘trigger list’ in the late 1970s in order to control their unsafeguarded exports.

By considering the definition of sensitive nuclear assistance as static, Kroenig ignores vital elements that could have better informed his analysis. For instance, when is technological assistance an instance of industrial cooperation, and when does it become “sensitive nuclear assistance”? Perception of the dynamic nature of what constitutes sensitive nuclear assistance at a given time is indispensable to understanding both the proliferation tendencies of supplier and recipient states and U.S. nonproliferation strategies. Without deep historical studies on nuclear assistance, such awareness remains unrealizable. In the absence of such awareness, our understanding of nuclear proliferation and conflict risks remaining both skewed and incomplete.

