The Hard Problem of Prediction for Conflict Prevention

50 Pages. Posted: 30 May 2019


Hannes Felix Mueller

Autonomous University of Barcelona

Christopher Rauh

University of Cambridge - Cambridge-INET Institute

Date Written: May 2019


Growing interest in conflict prevention provides a strong motivation for better conflict forecasting. A key problem of forecasting for prevention is that predicting the start of conflict in previously peaceful countries is extremely hard. To make progress on this hard problem, this project exploits both supervised and unsupervised machine learning. Specifically, a latent Dirichlet allocation (LDA) model extracts features from 3.8 million newspaper articles, and these features are then used in a random forest model to predict conflict. We find that forecasting hard cases is possible and benefits from supervised learning despite the small sample size. Several topics are negatively associated with the outbreak of conflict, and these gain importance when predicting hard onsets. The trees in the random forest use the topics in lower nodes, where they are evaluated conditionally on conflict history; this allows the random forest to adapt to the hard problem and provide useful forecasts for prevention.
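
The two-stage approach described in the abstract (topic shares from LDA fed into a random forest together with conflict history) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, and the toy documents, the `onset` labels, the `years_since_conflict` feature, and all hyperparameters are invented purely for illustration.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-in for country-period news text (invented for illustration).
docs = [
    "troops rebels attack border violence",
    "trade agreement investment growth market",
    "militia attack village violence troops",
    "election court parliament reform vote",
    "rebels ceasefire violence troops militia",
    "tourism investment market growth trade",
    "court reform parliament election trade",
    "border attack rebels militia violence",
]
# Hypothetical labels: 1 = conflict in the following period, 0 = none.
onset = np.array([1, 0, 1, 0, 1, 0, 0, 1])
# Hypothetical conflict-history feature: years since the last conflict.
years_since_conflict = np.array([2, 30, 1, 25, 3, 40, 35, 50])

# Unsupervised step: word counts -> per-document topic shares via LDA.
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
topic_shares = lda.fit_transform(counts)  # each row sums to 1

# Supervised step: random forest on topic shares plus conflict history.
features = np.column_stack([topic_shares, years_since_conflict])
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(features, onset)

# Predicted onset probability per observation (in-sample, toy data only).
onset_prob = forest.predict_proba(features)[:, 1]
```

Because each tree can split on conflict history before splitting on a topic, topic shares can end up being evaluated only within, say, the subset of long-peaceful observations, which is the conditional use of topics the abstract describes. A real application would evaluate forecasts out of sample rather than on the training data as done here.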

Keywords: Armed Conflict, Forecasting, Machine Learning, Newspaper Text, Random Forest, Topic Models

Suggested Citation

Mueller, Hannes Felix and Rauh, Christopher, The Hard Problem of Prediction for Conflict Prevention (May 2019). CEPR Discussion Paper No. DP13748, Available at SSRN:

Hannes Felix Mueller (Contact Author)

Autonomous University of Barcelona

Plaça Cívica
Cerdañola del Valles
Barcelona, Barcelona 08193

Christopher Rauh

University of Cambridge - Cambridge-INET Institute

Sidgwick Avenue
Cambridge, CB3 9DD
United Kingdom
