Examining pathways through which narrow AI systems might increase the likelihood of nuclear war

The goal of this post is to sketch out potential research projects about the possible pathways through which narrow AI systems might increase the likelihood of nuclear war.1

The post is mostly based on reading that I did in February-March 2022. I wrote the basic ideas in August 2022, and made quick edits in May 2023, largely in response to comments that I received on an unpublished draft. It is now very unlikely that I will do more research based on the ideas here. So, I am sharing the sketch with the hope that it might be helpful to other people doing research to reduce risks from nuclear weapons and/or AI.2

Since my research is mostly a year old, I unfortunately do not engage with the newest work on the topic.3 Indeed, I expect that some or many of the points made here have already been made in recent published work by others. I’m sharing my notes anyway, since doing so is quick and they may still add some value to the existing literature.

This post is intended to map out potential future research rather than to present existing findings. That said, to give a sense of my current views, I think the likelihood of nuclear war over the next 30 years is around 30% higher than it would be in a world where none of the pathways described here have any effect. This forecast is conditional on there not being transformative artificial intelligence. Out of this increased risk, I assign (very approximately) around 45% to Pathway 1, 30% to Pathway 2, 20% to Pathway 3, and 5% to Pathway 4.4
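To make the arithmetic behind these subjective numbers concrete, here is a minimal sketch of how they combine. The 10% baseline risk is a purely hypothetical assumption chosen for illustration, not a figure from this post; only the ~30% relative increase and the approximate pathway shares come from the paragraph above.

```python
# Minimal illustrative sketch of how the subjective numbers above combine.
# The baseline risk is a hypothetical placeholder, not a figure from this post.

baseline_risk = 0.10        # hypothetical 30-year risk with none of the pathways operating
relative_increase = 0.30    # "around 30% higher" with the pathways (from the paragraph above)

extra_risk = baseline_risk * relative_increase   # additional absolute risk attributed to AI pathways
total_risk = baseline_risk + extra_risk

# Very approximate split of the extra risk across the four pathways (from the paragraph above)
pathway_shares = {
    "Pathway 1: reduced second-strike capability": 0.45,
    "Pathway 2: delegated launch authority": 0.30,
    "Pathway 3: degraded information environment": 0.20,
    "Pathway 4: lethal autonomous weapons": 0.05,
}

print(f"Total 30-year risk with the pathways: {total_risk:.1%}")
for name, share in pathway_shares.items():
    print(f"  {name}: +{extra_risk * share:.2%} absolute risk")
```

Under these illustrative assumptions, the pathways together would add three percentage points of absolute risk, of which Pathway 1 alone accounts for roughly 1.35 percentage points.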

Please note that this is a blog post, not a research report, meaning that it was produced quickly and is not to Rethink Priorities’ typical standards of substantiveness, careful checking for accuracy, and reasoning transparency.

Please also note that many topics in nuclear security are sensitive, and so people working on this area should be cautious about what they publish. Before publishing, I ran this post by a couple of people whom I expect to be knowledgeable and careful about infohazards related to nuclear security. If you do research based on this post and are not sure whether your work would be fine to share, please feel free to reach out to me and I’d be happy to give some thoughts.

Summary

There seem to be four main possible pathways discussed in existing literature by which narrow AI systems might increase nuclear risk. I consider the amount of additional risk from each of them in turn:

  1. How much additional nuclear risk is there from reduced second-strike capability? This refers to narrow AI systems reducing the ability of a country to launch a “second” nuclear strike in retaliation against a “first” nuclear strike launched against it by someone else. This might increase the likelihood of nuclear war by undermining nuclear deterrence. I focus on two specific threat models: AI-enabled cyberattacks that make it easier to disable nuclear weapons, and AI-enabled drones and image recognition that make it easier to destroy nuclear weapons.
  2. How much risk from delegating nuclear launch authority? In this pathway, decisions about launching nuclear weapons are (partly) delegated to AI. Intrinsic weaknesses with current AI approaches — in particular brittleness — mean that the AI may launch when we would not want it to.5
  3. How much risk from a degraded information environment? In this pathway, narrow AI technologies reduce the ability for leaders to make informed decisions about nuclear launches. This may increase the likelihood that leaders launch nuclear weapons when they would not do so if they had a full picture of what’s going on.
  4. How much risk from lethal autonomous weapons (LAWs)? In this pathway, there is an incident with conventionally-armed lethal autonomous weapons that escalates to nuclear war. Alternatively, LAWs change the balance of conventional military power, making the countries that are now much weaker more tempted to resort to using nuclear weapons.

For each pathway, I list some questions that would need to be answered to understand how plausible/big the effect is. I sometimes also give brief arguments for why I think the expected effect is overrated (at least by the people who propose it).

1. How much risk from reduced second-strike capability?

Researchers have suggested several mechanisms for how narrow AI systems may reduce countries’ ability to launch “second-strikes”, i.e. to launch nuclear strikes that retaliate against a “first strike” nuclear attack from a different country.6 This threat model can also be framed in terms of countries’ nuclear weapons having reduced “survivability”, i.e. those weapons being less able to “survive” a first-strike launched by an adversary.7

If countries have reduced second-strike capability, then nuclear deterrence becomes weaker. This is because the nuclear weapons of an attacked country might not survive the attack, making the threat of retaliation less credible. Indeed, this loss of deterrence might incentivise a first strike as each country tries to wipe out its enemies before they do the same.

There does not genuinely have to be reduced second-strike capability for this pathway to increase the likelihood of nuclear war — merely the perception of reduced second-strike capability. If national leaders believe that their nuclear weapons are vulnerable — or that the weapons of their enemies are vulnerable — then they will be more likely to launch a first strike, even if this belief is incorrect. Because the perception of reduced second-strike capability is itself what is dangerous, research that promotes this perception could be harmful.8

Two premises of this pathway seem most contestable to me:

Premise 1: Nuclear deterrence plays a significant role in preventing the use of nuclear weapons.

  • There is a lot of academic literature on whether the non-use of nuclear weapons results from deterrence rather than e.g. norms / ethics / identities (or some mix of these).
  • Given the crowdedness of this debate, I would guess that it would be difficult to make a major research contribution here!

Premise 2: There are AI-related mechanisms that significantly reduce the (perceived) survivability of nuclear weapons. I discuss below two possible mechanisms by which this could happen. Perhaps other mechanisms could also reduce second-strike capability, but I don’t remember seeing any others discussed in detail.

1.1 Impact of AI-enabled cyberattacks on the survivability of nuclear weapons

In this threat model, states could try to disable each other’s nuclear weapons through cyberattacks. This possibility (or the perception of it) would create incentives to launch a nuclear strike before an adversary’s cyberattack succeeds. Cyberattacks do not depend on AI, but AI may make them easier to carry out.

Key questions for assessing this threat model:

  • To what extent do nuclear weapons (or will nuclear weapons) run on technology that is modern enough to be hacked?
  • To what extent are AI-enabled cyberattacks more capable than non-AI-enabled cyberattacks?
  • Does AI-enabled hacking favor attackers over defenders? (Presumably countries could use AI cyber tools to find weaknesses in their own systems and fix them. If this makes weapons more secure — or leads to an impression that they are more secure — then this would actually decrease the likelihood of nuclear war.)

If researching these questions, please be careful about infohazards. For example, if in your research you find specific ways in which nuclear weapons could be vulnerable to cyberattack, be careful about sharing that.9 Similarly, because the perception of low survivability seems dangerous, be careful about sharing research that could contribute to this perception — even if that perception is incorrect.

Reasons to be skeptical of this threat model:

  • Some nuclear weapons systems require very little technology. For example, UK submarine captains are given sealed letters (“letters of last resort”) with orders about what to do after a nuclear strike.

1.2 Impact of AI-enabled drones and image recognition on the survivability of nuclear weapons

Two nuclear weapons platforms — submarines and road- or rail-mobile launchers — depend for their survivability largely on the fact that they can hide (“obscurity”).

In this threat model, AI could undermine the obscurity of these nuclear weapons platforms, and therefore their survivability. Two main mechanisms seem to be discussed. First, autonomous drones might make it possible to collect far more sensor data. Second, computer vision and other AI-enabled data processing may make it possible to analyze massive amounts of data to find nuclear weapons platforms.10 Lieber and Press (2017) made essentially this argument in the case of the US trying to destroy North Korea’s nuclear weapons.11

Key questions for assessing this threat model:

  • How feasible is it for one country to undermine the survivability of another country’s nuclear submarines by using AI to analyze sensor data? For example, would an unfeasibly large number of sensors be needed?
  • The same question as above, but for land-mobile nuclear weapons platforms.

As before, if researching these questions, please be careful regarding infohazards. Sharing ways in which the survivability of nuclear weapons could be reduced — or creating a perception that survivability is low — is plausibly dangerous.

Reasons to be skeptical of this threat model (ordered roughly by how convincing I find them):

  • Some nuclear weapons — silo-based missiles — do not depend on hiding at all for their survivability. So even if this pathway exists, its impact may be limited.
    • Although it would be bad to have more reliance on silo-based nuclear weapons because they push strongly towards “launch on warning” policies, which create the possibility of launching because of a false alarm.

  • Even if you can find submarines or road-mobile missiles, destroying them is another matter — especially since you might have to destroy all of them simultaneously, or risk the surviving ones launching an attack.

  • The US was supposedly able to find Russian submarines for much of the Cold War by tracking them as they left their bases. This didn’t cause nuclear strikes.

2. How much risk from delegating nuclear launch authority?

In this pathway, decisions about launching nuclear weapons are (partly) delegated to narrow AI systems. Technical weaknesses with current AI approaches, such as brittleness, mean that the AI may launch nuclear weapons in cases when we would not want it to.12

2.1 Central nuclear command & control

In the most extreme version of this pathway, a country’s entire nuclear command and control system is placed in the hands of an AI. Militaries might do this to increase the credibility of nuclear deterrence, reasoning that even if an attacker can kill everyone, there would still be retaliation.

Key questions for assessing this threat model:

  • Do decision makers believe that delegating to automated systems is a good idea? What might cause them to think it is a good idea? When have decision makers previously delegated launch authority to other people?
    • For example, Eisenhower reportedly delegated nuclear launch authority to theater military commanders because poor communications meant that he could not reliably stay in contact with them.

  • The Soviet Dead Hand system is often discussed as a historical example of this threat model. What did that system actually look like? The most detailed source that I am aware of is the book The Dead Hand — but only a small part of the book is actually about the system.13 Additionally, other sources seem to contradict each other, e.g. about the extent to which the system was actually automated, or the extent to which it is still in operation.

    • I guess, however, that making progress on this research question would be difficult for most people. For example, I assume that most of the relevant documents are in Russian, and that many are not in the public domain.

Reasons to be skeptical of this threat model:

  • Countries would presumably struggle to verify whether other countries had delegated nuclear launch authority to an AI. If countries want to have a more credible second strike capability, therefore, maybe the optimal strategy would just be to bluff about delegating to an AI.
  • Existing research often uses the historical “Dead Hand” case as supporting evidence for the plausibility of this scenario. But the description of the system in The Dead Hand is actually quite different from the scenario described in the threat model above.14 The Dead Hand seems to me to be the most reliable source about the system — though this is hard for me to evaluate.
  • SIPRI held workshops about AI and nuclear issues, inviting various senior people from militaries, governments, etc. The notes from these workshops make it sound like these decision makers think it would be crazy to build these kinds of systems.15

Even if the specific scenario here seems implausible, maybe we should be worried about variants. For example, maybe nuclear launch authority remains nominally in human control, but with automated systems increasingly advising (e.g. by reporting whether a missile attack is incoming). People may defer excessively to these systems,16 causing them to launch because the machine incorrectly tells them to do so.17

2.2 Individual nuclear weapons platforms

In a milder delegation scenario, humans retain central nuclear command & control, but there are autonomous nuclear weapons platforms (such as submarines or bombers) that can also launch their nuclear weapons autonomously — such as if a nuclear war appears to have started.18 This creates the possibility for these systems to make mistakes, e.g. because of the brittleness associated with current AI techniques. The launch of nuclear weapons from individual platforms might additionally escalate into a larger nuclear war.

The Russian Poseidon uncrewed underwater vehicle and the US B-21 Raider bomber have been discussed as potential examples of this kind of nuclear weapons platform (although the arguments I have seen for this are very speculative).

Key questions for assessing this threat model:

  • Would national decision makers think these kinds of systems are a good idea?
  • Is the survivability of nuclear weapons likely to decrease to such an extent that it would be worth taking this kind of extreme measure to increase survivability?

3. How much risk from a degraded information environment?

In this pathway, narrow AI technologies reduce the ability for leaders to make informed decisions about nuclear launches. This may increase the likelihood that leaders launch nuclear weapons when they would not do so if they had better information about what’s going on.

3.1 Hacking early warning systems

AI-enabled hacking may make it easier to interfere with early warning systems (or create a perception that this is the case). Possible effects:

  • A state or terror group could try to create catalytic nuclear war (i.e., a nuclear war between other states) by making an early warning system falsely show an incoming attack. This could cause leaders to order a real attack “in response” to the fake one.
  • Leaders who are paranoid about an incoming attack might be less reassured by the absence of any early warning. They might think the early warning system had been hacked, and that an attack is incoming even though it cannot be detected. This might make them more likely to order their own attack.

This part of the pathway raises the same questions as pathway 1 about how much AI-enabled hacking differs from the status quo.

3.2 Deepfakes / AI-enabled influence campaigns undermining decision making

Similarly, states or terror groups may try to start a nuclear war by convincing leaders that one has already started, or that it is the right decision to launch nuclear weapons, such as with AI-enabled influence campaigns on social media or deepfakes.

Questions:

  • Are the information sources that leaders would use to make nuclear launch decisions vulnerable to this kind of disinformation?

4. How much risk from LAWs?

In this pathway, there is an incident with conventionally-armed lethal autonomous weapons (LAWs) that escalates to nuclear war. Examples:

  • Flash escalation between LAWs from different countries. LAWs are connected to AI systems in nuclear decision making, meaning that this flash escalation goes all the way up to the nuclear level.
  • A glitch in a LAW causes it to do something horrible (like killing lots of civilians). This makes people decide to launch nuclear weapons.

Alternatively, LAWs change the balance of conventional military power, making the countries that are now much weaker more tempted to resort to nuclear weapons. See discussion of this and other “destabilization” risks in a post from Anthony Aguirre / Future of Life Institute.

Questions:

  • How plausible is the idea of flash escalation involving LAWs?
  • Would LAWs be connected to nuclear systems in any way?
  • What might LAWs do that would cause people to decide to escalate to nuclear war?

Reasons to be skeptical:

  • Various historical examples show that non-nuclear violence between nuclear powers does not necessarily escalate to nuclear war.
  • For the scenario where LAWs are connected to AI systems in nuclear decision making: LAWs are likely to be on battlefields where communications are difficult. (This is part of the point of LAWs over piloted drones!) This introduces a time delay and makes it harder to imagine a flash escalation that can reach as far as nuclear systems.

Recommended reading

If you found this post interesting, or plan to research some of the ideas that I discussed, I encourage you to read more recent works on this topic (such as those listed in footnote 3), which I unfortunately have not yet had much chance to engage with. I expect that these provide more nuance on many of the points in this post.

Additionally, if you are planning to research the intersection of narrow AI and nuclear risk, I may be able to share my reading notes from pre-2022 public sources on this topic; these are the notes that informed the thinking going into this post. I am avoiding sharing the notes fully publicly, since they are even rougher than this post! If the notes would be helpful to you, please make an access request here with a brief explanation of why they would be helpful.

Acknowledgments

This blog post is a project of Rethink Priorities. It was written by Oliver Guest.

Thank you to Matthijs Maas for originally pointing me towards a lot of helpful existing work, and to Patrick Levermore for feedback on an earlier draft of this post. They do not necessarily endorse the points that I make here.

If you are interested in RP’s work, please visit our research database and subscribe to our newsletter.

Notes


  1. The project could be expanded to include ways in which narrow AI systems might decrease the likelihood of nuclear war. These are not sketched out here, but there is this journal article, which argues that AI could strengthen nuclear stability. I did not find the article particularly convincing, but one co-author is the NATO director for nuclear policy, so it may reflect the beliefs of relevant decision makers and be relevant for that reason.

  2. Because I am sharing quickly, this post unfortunately has more jargon and unreferenced empirical claims than I would like. 

  3. Recent research that I am aware of (but have unfortunately not had much opportunity to engage with) includes:

    Maas, M. M., Matteucci, K., & Cooke, D. (2022). Military Artificial Intelligence as Contributor to Global Catastrophic Risk. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4115010 (Forum post here)

    Nadibaidze, A., & Miotto, N. (2023). The Impact of AI on Strategic Stability is What States Make of It: Comparing US and Russian Discourses. Journal for Peace and Nuclear Disarmament, 1–21. https://doi.org/10.1080/25751654.2023.2205552

    Rautenbach, P. (2022, November 18). Artificial Intelligence and Nuclear Command, Control, & Communications: The Risks of Integration. EA Forum. https://forum.effectivealtruism.org/posts/BGFk3fZF36i7kpwWM/artificial-intelligence-and-nuclear-command-control-and-1

    Ruhl, C. (2022). Autonomous weapons systems & military AI. Founders Pledge. https://founderspledge.com/stories/autonomous-weapon-systems-and-military-artificial-intelligence-ai (Forum post here)

  4. These numbers are just a way of conveying my subjective views and are not based on any formal modeling. The numbers might change significantly with more rigorous research. 

  5. I use “brittleness” to mean an AI system's inability to adapt to novel situations or handle edge cases it has not encountered during training. 

  6. Related ways of framing this are a “counterforce revolution” (p. 31) and reduced “strategic stability” (p. 150). This threat model has also been discussed with AI governance as an example of a “structural risk” from AI (Zwetsloot & Dafoe, 2019). Note that “first strike” generally means “first strike with nuclear weapons”, but that it could also include first strikes with certain conventional weapons — see e.g. Acton (2013)

  7. See “survivability” being used this way in e.g. Lieber and Press (2017) — though this use of the term is common in International Relations literature. Note that (perhaps upsettingly) “survivability” is used here to mean that nuclear weapons, rather than people, “survive” a nuclear attack. 

  8. Cf. infohazards. That said, the term “infohazard” generally implies that the given information is true, whereas I tentatively claim that it is not true that nuclear deterrence is likely to be undermined by the mechanisms here. In any case, I claim that this blog post is not particularly infohazardous because the key ideas are already commonly discussed in the academic literature, meaning that any marginal effect from this post is small.

  9. Reduced survivability of nuclear weapons seems bad for nuclear risk, so sharing ways in which survivability could be reduced seems harmful!

  10. For example, with sufficiently good algorithms to analyze sensor data, it may be possible to find previously undetectable submarines with sensors in the ocean. 

  11. If I remember correctly, Lieber and Press claimed that it would be much harder to destroy the nuclear weapons of other nuclear powers. (This is, in any case, my view.) 

  12. I use “brittleness” to mean an AI system's inability to adapt to novel situations or handle edge cases it has not encountered during training. A related concept is distributional shift — see Amodei et al. (2016, p. 16)

  13. From memory, less than 10% of the book focuses on the Dead Hand system (but that is just an estimate, and based on reading the book several months ago). 

  14. When the system was activated, soldiers sat in a bunker, deciding whether or not to press the button based on sensor data that came in. The only delegation here is that it would normally be Soviet leaders who decide whether to push the button. The Dead Hand system was turned on when Soviet leaders feared a nuclear attack and were therefore worried that they would not be able to order a retaliation. This was actually a safety measure — the soldiers in a bunker would survive a nuclear strike, meaning that there was no need for leaders to rely on (potentially unreliable) early warning systems or to make very quick decisions. 

  15. Although everything is under Chatham House rules and maybe there is currently an incentive to lie, so it’s hard to know how seriously to take this. 

  16. Such as for psychological reasons, like automation bias.

  17. This is reminiscent of the Stanislav Petrov incident — though Petrov ignored the false alarm. 

  18. Countries might build autonomous nuclear weapons platforms because the autonomy would increase the likelihood that the platforms would be able to launch the weapons that they are carrying. For example, it would be harder to track a submarine if it could remain at sea for longer (as would presumably be the case if there were no people on board).

Oliver Guest

Oliver Guest is an Associate Researcher on the AI Governance and Strategy team. He has just completed an MA in Security Studies from Hamburg University, focusing on nuclear security and its intersection with AI. Oliver previously obtained a BA in Modern Languages at Cambridge, where he also contributed to EA Cambridge’s fellowship program.
