Estimating Average Treatment Effects With Propensity Scores Estimated With Four Machine Learning Procedures: Simulation Results in High Dimensional Settings and With Time to Event Outcomes

24 Pages. Posted: 16 Nov 2018

Kip Brown

CHUM Research Center

Phil Merrigan

University of Quebec at Montreal (UQAM)

Jimmy Royer

Analysis Group, Inc.; Université de Sherbrooke - Department of Economics

Date Written: September 21, 2018

Abstract

Background: The increased availability of claims data allows one to build high dimensional datasets, rich in covariates, for accurately estimating treatment effects in medical and epidemiological cohort studies. This paper shows the potential of machine learning for estimating average treatment effects with propensity score methods in the context of rich, high dimensional datasets.

Methods: Four machine learning methods are used to estimate the propensity scores from which average treatment effects are computed, in the context of time to event outcomes: LASSO, Random Forest, Gradient Boosting and Artificial Neural Networks. Simulations based on an actual medical claims dataset are used to assess the efficiency of these methods. The simulations are performed with over 100,000 observations and 1,100 explanatory variables. Each method is tested on 500 datasets created from the original dataset, which allows us to report the mean and standard deviation of the estimated average treatment effects.
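The simulation design described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the abstract does not say how the propensity score enters the effect estimate, so normalized inverse probability weighting is used here; only a LASSO-type (L1-penalized logistic) propensity model is shown; the data are fully synthetic and far smaller than the claims data; a binary event indicator stands in for the time to event outcome; and all function names and parameter values are hypothetical.

```python
import numpy as np

TRUE_EFFECT = 0.10  # assumed risk difference built into the synthetic data


def simulate_dataset(rng, n, p):
    """Draw one synthetic cohort: covariates X, treatment t, binary event y."""
    X = rng.standard_normal((n, p))
    # Only the first two covariates drive treatment; the rest are noise.
    logit = 1.0 * X[:, 0] - 0.5 * X[:, 1]
    t = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    # X[:, 0] also raises the event risk, so it confounds the naive comparison.
    risk = 0.10 + TRUE_EFFECT * t + 0.20 * (X[:, 0] > 0)
    y = rng.binomial(1, risk)
    return X, t, y


def fit_l1_logistic(X, t, lam=0.01, n_iter=300):
    """L1-penalized logistic regression fitted by proximal gradient descent."""
    n, p = X.shape
    beta = np.zeros(p)
    intercept = 0.0
    # Step size 1/L, with L an upper bound on the curvature of the mean loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(intercept + X @ beta)))
        resid = prob - t
        intercept -= step * resid.mean()
        beta -= step * (X.T @ resid) / n
        # Soft-thresholding is the proximal step for the L1 penalty.
        beta = np.sign(beta) * np.maximum(np.abs(beta) - step * lam, 0.0)
    return intercept, beta


def ipw_ate(y, t, e):
    """Normalized inverse probability weighting estimate of the ATE."""
    e = np.clip(e, 0.01, 0.99)  # guard against extreme weights
    w1 = t / e
    w0 = (1 - t) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


def run_simulation(n_reps=30, n=2000, p=20, seed=0):
    """Repeat the estimate over simulated datasets, as in a Monte Carlo study."""
    rng = np.random.default_rng(seed)
    ipw, naive = [], []
    for _ in range(n_reps):
        X, t, y = simulate_dataset(rng, n, p)
        intercept, beta = fit_l1_logistic(X, t)
        e = 1.0 / (1.0 + np.exp(-(intercept + X @ beta)))
        ipw.append(ipw_ate(y, t, e))
        # Unadjusted difference in event rates, for comparison.
        naive.append(y[t == 1].mean() - y[t == 0].mean())
    return np.array(ipw), np.array(naive)


ipw, naive = run_simulation()
print(f"IPW:   mean={ipw.mean():.3f}, sd={ipw.std(ddof=1):.3f}")
print(f"Naive: mean={naive.mean():.3f}, sd={naive.std(ddof=1):.3f}")
```

Reporting the mean and standard deviation across replications mirrors the summary described in the Methods. Swapping `fit_l1_logistic` for another propensity model would allow the same loop to compare learners.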

Results: The results are very promising for all four methods; however, LASSO, Random Forest and Gradient Boosting seem to be performing better than Random Forest.

Conclusion: Machine learning methods can be helpful for observational studies that use the propensity score when a very large number of covariates is available, the total number of observations is large, and the dependent event is rare. This is an important result given the availability of big data related to Health Economics and Outcomes Research (HEOR) around the world.

Keywords: machine learning, propensity score, claims data, impact of treatment on treated

JEL Classification: C01, C13, C31, C34, C45, C53, I11

Suggested Citation

Brown, Kip and Merrigan, Phil and Royer, Jimmy, Estimating Average Treatment Effects With Propensity Scores Estimated With Four Machine Learning Procedures: Simulation Results in High Dimensional Settings and With Time to Event Outcomes (September 21, 2018). Available at SSRN: https://ssrn.com/abstract=3272396 or http://dx.doi.org/10.2139/ssrn.3272396

Kip Brown

CHUM Research Center

Phil Merrigan

University of Quebec at Montreal (UQAM)

P.O. Box 8888, Downtown Station
Succursale Centre Ville
Montreal, Quebec H3C 3P8
Canada

Jimmy Royer (Contact Author)

Analysis Group, Inc.

1000 De La Gauchetiere
Suite 1200
Montreal, Quebec H3B4W5
Canada

Université de Sherbrooke - Department of Economics
