A controversial study by a UK university research team has found that major generative AI models almost always opted to use nuclear weapons in simulated war scenarios involving three AI models.
According to reports on the 25th (local time) by the UK science magazine New Scientist and the technology journal The Register, a research team led by Professor Kenneth Payne of the Department of War Studies at King's College London ran war simulation experiments involving three major large language models (LLMs).
The experiments used Google's 'Gemini 3 Flash', Anthropic's 'Claude Sonnet 4', and OpenAI's 'GPT-5.2'. The research team set up a range of diplomatic and military conflict scenarios, such as territorial disputes, competition for scarce resources, regime collapse crises, and fractures in military alliances, and had each model play the role of a national leader choosing a response strategy.
As a result, the AI models ultimately chose to use nuclear weapons in 20 out of 21 confrontations, about 95% of the scenarios. Although other options such as negotiation, sanctions, limited military action, or retreat were available, the models tended to shift abruptly to the nuclear option once a conflict escalated beyond a certain level.
Behavior varied by model. Claude initially employed a strategy combining trust-building with gradual pressure, taking a relatively calculated approach. When threat levels were low, it aligned its public statements with its actions to lower the opponent's guard, but as the conflict intensified it took aggressive measures that went beyond its stated position, behaving like a "strategist."
GPT generally showed a cautious, mediation-oriented attitude. Under strict time limits on decision-making, however, its decisions were observed to change rapidly. In some experiments it opted for a large-scale nuclear attack at the last moment even though negotiation remained possible. The research team suggested that time pressure may have increased the model's risk-taking tendencies.
Gemini most frequently chose blunt, aggressive responses. In one scenario it effectively issued an ultimatum, threatening full-scale strategic nuclear strikes on densely populated areas unless all operations were immediately halted, a "mutual destruction" posture. The more likely defeat appeared, the more it tended to escalate its level of attack.
The research team stressed that these results do not mean AI would actually control nuclear weapons. But given that AI systems are already used in areas such as military logistics, intelligence analysis, target identification, and decision support, it is realistic to expect them to take on increasingly strategic roles.
Professor Payne stated, "The strong taboo against nuclear weapons is the product of historical experience and ethical learning in human society," adding that "AI does not internalize these cultural and moral contexts in the same way." He emphasized, "Understanding how AI reasons about strategic issues and calculates risk is no longer just an academic curiosity but a matter of policy and security."
Experts say the research should not be read simply as evidence of AI "aggressiveness," but should prompt careful examination of how goal-setting, reward structures, and simulation design shape decision-making. They stress the need not only for technical safety measures but also for international norms and transparency, so that AI is used safely under human control.
