How Well Can AI do Strategy? Empirical Benchmarking Using Strategy Simulations
32 Pages
Posted: 7 May 2025
Date Written: May 01, 2025
Abstract
AI research has introduced several benchmarks tracking how large language models (LLMs) have rapidly advanced in lower-level tasks such as math, science, reading comprehension, and coding. Yet no systematic evaluation criteria currently exist to assess LLMs' unaided performance in strategic decision-making. The absence of a reliable benchmark limits strategy scholars' ability to answer fundamental questions about AI's capacity to augment or automate core strategic management decisions. We propose that AI's performance on established strategy teaching simulations offers a promising benchmark, as these exercises replicate the complexity and uncertainty of strategic decision-making in a controlled, validated, and replicable environment. In this paper, we benchmark the performance of OpenAI's models on the Back Bay Battery simulation, a widely used exercise in courses on strategy and innovation. Designed to test decision-making under uncertainty, the simulation requires participants to balance trade-offs between short-term profitability and long-term competitive positioning, while integrating diverse information about customer preferences, competitive moves, and evolving technologies over extended time horizons. We created an interface that allows AI to interact with the simulation without any fine-tuning or prompting beyond the information available within the simulation itself. We find that OpenAI's latest o3-mini model performs on par with MBA students from a top school. Other recent models (GPT-4o, o1-mini), while not as strong as o3-mini, significantly outperform earlier versions (GPT-4, GPT-3.5), although the pace of progress appears to have slowed. Beyond showing that AI can make effective strategic decisions, our simulation-based approach offers a useful empirical benchmark for tracking its future development.
Keywords: Strategy, Artificial Intelligence, Large Language Models, Strategic Decision-Making
JEL Classification: L26, O33, D83, C63