Using Machine Learning to Find Environmentally At-Risk Communities

10 Pages. Posted: 22 Feb 2019

Date Written: December 1, 2018


Environmental health remains a serious concern in many US localities, yet public agencies often lack the capacity and resources to collect comprehensive environmental health data. Inspired by CalEnviroScreen, an environmental health assessment tool used to identify environmentally at-risk communities in California, I calculate pollution burden scores at the census tract level for the entire contiguous United States. Pollution burden is a composite score that encompasses 12 environmental (air, water, waste) indicators. I combine observed pollution burden indicator data with statistics predicted using machine learning, and I present the results in an interactive, publicly accessible National Pollution Burden Map built with ArcGIS Online. Although applied here to the United States, the same approach can be applied to other regions of the world.
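
To make the idea of a composite score concrete, the following is a minimal sketch, not the paper's actual method. It assumes that each indicator is converted to a percentile rank across census tracts and that a tract's score is the plain average of its available percentiles. The function names, the two indicator names, and the toy values are all hypothetical, and the sketch simply skips missing values rather than predicting them with machine learning.

```python
from bisect import bisect_left
from statistics import mean


def percentile_ranks(values):
    """Map each non-missing value to a 0-100 percentile rank; None stays None."""
    present = sorted(v for v in values if v is not None)
    n = len(present)
    ranks = []
    for v in values:
        if v is None:
            ranks.append(None)
        else:
            # Share of tracts with a strictly lower value for this indicator.
            ranks.append(100.0 * bisect_left(present, v) / n)
    return ranks


def pollution_burden(indicators):
    """Average the per-indicator percentiles for each tract (assumed weighting).

    indicators: dict mapping an indicator name to a list of values,
    one per census tract, with None marking a missing observation.
    """
    ranked = [percentile_ranks(column) for column in indicators.values()]
    scores = []
    for tract in zip(*ranked):
        available = [r for r in tract if r is not None]
        scores.append(mean(available) if available else None)
    return scores


# Made-up values for three hypothetical tracts and two hypothetical indicators.
toy = {
    "pm25": [8.0, 12.0, 10.0],
    "diesel": [0.1, None, 0.3],
}
scores = pollution_burden(toy)
print(scores)
```

In this sketch a tract with a missing indicator is scored on the indicators it does have. The paper instead describes combining observed data with machine-learning predictions, which would replace the `None` entries before scoring.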

Keywords: environmental health, environmental justice, pollution, machine learning, United States

Suggested Citation

Shen, Shiran Victoria, Using Machine Learning to Find Environmentally At-Risk Communities (December 1, 2018). Available at SSRN.

Shiran Victoria Shen (Contact Author)

Stanford University

Stanford, CA 94305
United States

