A Fast Ecological Inference Algorithm for the R×C case

42 Pages. Posted: 20 May 2024

Charles Thraves

University of Chile - Industrial Engineering; Instituto Sistemas Complejos de Ingenieria (ISCI)

Pablo Ubilla

University College London

Date Written: May 18, 2024


The R×C ecological inference problem is known to be difficult. Researchers have approached it from various angles, including parametric probability models, entropy maximization, and mathematical programming. In this work, we set the ecological inference problem in an election context where, at each ballot box, we observe the votes for each candidate and the number of voters from each demographic group. Employing a non-parametric model, we use the EM algorithm to maximize the likelihood given the observed data. We show that the M-Step can be solved in closed form, while the E-Step requires an exponential number of steps to be solved exactly. To address this, we evaluate several approximation methods that compute the E-Step in polynomial time. On simulated instances, we observe that the probability estimates obtained with these methods are very close to the real values. Furthermore, some of these methods exhibit running times of less than a thousandth of a second. We then introduce a methodology to perform group aggregation in cases where there are insufficient samples (here, ballot boxes) to accurately estimate voting probabilities. We apply this technique to the Chilean Presidential election of 2021, obtaining estimates of voting probabilities with bounded errors for each resulting group aggregation within each district. We note that, in general, the number of aggregated groups obtained increases with the number of ballot boxes. Finally, we show how these methods can also be used to detect outlier ballot boxes.
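
As an illustration of the EM structure described above, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each vote in a ballot box is attributed to a group independently, with weight proportional to the group's size in that box times the current probability estimate. This is one simple polynomial-time stand-in for the exact E-Step and is not necessarily among the approximations the paper evaluates. The function name `em_rxc` and its parameters are hypothetical.

```python
import numpy as np

def em_rxc(votes, groups, n_iter=500, tol=1e-10):
    """Approximate EM for R x C ecological inference (illustrative sketch).

    votes  : (B, C) array, votes per ballot box and candidate
    groups : (B, G) array, voters per ballot box and demographic group
    returns: (G, C) array, estimated probability that a voter of
             group g votes for candidate c
    """
    votes = np.asarray(votes, dtype=float)
    groups = np.asarray(groups, dtype=float)
    n_candidates = votes.shape[1]
    n_groups = groups.shape[1]

    # Start from uniform voting probabilities for every group.
    p = np.full((n_groups, n_candidates), 1.0 / n_candidates)

    for _ in range(n_iter):
        # Approximate E-Step (assumption: votes are attributed independently).
        # resp[b, g, c] is the share of candidate c's votes in box b
        # attributed to group g, proportional to groups[b, g] * p[g, c].
        joint = groups[:, :, None] * p[None, :, :]        # (B, G, C)
        denom = joint.sum(axis=1, keepdims=True)          # (B, 1, C)
        resp = np.divide(joint, denom,
                         out=np.zeros_like(joint), where=denom > 0)
        expected = resp * votes[:, None, :]               # (B, G, C)

        # M-Step in closed form: normalized expected counts per group.
        totals = expected.sum(axis=0)                     # (G, C)
        p_new = totals / totals.sum(axis=1, keepdims=True)

        converged = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if converged:
            break
    return p
```

Calling `em_rxc(votes, groups)` with a `(B, C)` vote matrix and a `(B, G)` group matrix returns a `(G, C)` matrix whose rows sum to one. The sketch assumes every group receives some expected votes; a group with no voters in any ballot box would need separate handling.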

Keywords: Ecological Inference, R×C case, EM Algorithm, Multinomial distribution, Group aggregation

JEL Classification: C14, C44

Suggested Citation

Thraves, Charles and Ubilla, Pablo, A Fast Ecological Inference Algorithm for the R×C case (May 18, 2024). Available at SSRN: https://ssrn.com/abstract=4832834 or http://dx.doi.org/10.2139/ssrn.4832834

Charles Thraves (Contact Author)

University of Chile - Industrial Engineering

República 701, Santiago

Instituto Sistemas Complejos de Ingenieria (ISCI)

Republica 695

Pablo Ubilla

University College London

Gower Street
London, WC1E 6BT
United Kingdom
