Scope 3 Emissions: Data Quality and Machine Learning Prediction Accuracy

46 Pages. Posted: 17 Aug 2022. Last revised: 18 Aug 2022

Quyen Nguyen

CEFGroup & Department of Accountancy and Finance, University of Otago

Ivan Diaz-Rainey

CEFGroup & Department of Accountancy and Finance, University of Otago

Adam Kitto

EMMI

Ben McNeil

University of New South Wales (UNSW) - Climate Change Research Centre; EMMI

Nicholas Pittman

EMMI

Renzhu Zhang

CEFGroup & Department of Accountancy and Finance, University of Otago

Date Written: August 16, 2022

Abstract

This paper explores the quality of Scope 3 emission data in terms of divergence and composition, and the performance of machine learning models in predicting Scope 3 emissions. We do so using the Scope 3 emission datasets of three of the largest data providers (Bloomberg, Refinitiv Eikon, and ISS). We find considerable divergence between third-party providers, making it difficult for investors to know their exposure to Scope 3 emissions. Surprisingly, divergence exists between the datasets for emissions values that have been reported by firms (identical data points between Bloomberg and Refinitiv Eikon). The divergence is even larger for ISS when it adjusts reported values using its proprietary models (identical data points). With respect to the composition of Scope 3 emissions, firms generally report incomplete compositions, yet they are reporting more categories over time. There is a persistent contrast between relevance and completeness in the composition of Scope 3 emissions across sectors, as irrelevant categories such as travel emissions are reported more frequently than relevant ones, such as the use of products and processing of sold products. We also find that the application of machine learning algorithms can improve the prediction accuracy of the aggregated Scope 3 emissions (up to 6%) and its components, especially when each category is estimated individually and aggregated into the total Scope 3 emissions values (up to 25%). It is easier to predict upstream emissions than downstream emissions. Prediction performance is primarily limited by the small number of observations in particular categories, and predictor importance varies by category. We conclude that users of the Scope 3 emission datasets should consider data source, quality and prediction errors when using data from third-party providers in their risk analyses.
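As an illustration of the kind of divergence measure the abstract refers to, the share of identical data points between two providers can be computed over the firm-years that both cover. The sketch below is a minimal example under stated assumptions, not the authors' procedure: the function name `share_identical`, the `(firm, year)` keying, the matching tolerance, and all values are hypothetical.

```python
from math import isclose

def share_identical(a, b, rel_tol=1e-9):
    """Share of observations present in both datasets whose values match.

    a, b: dicts mapping a (firm, year) key to a Scope 3 value.
    rel_tol is an assumed matching tolerance, not taken from the paper.
    """
    common = a.keys() & b.keys()
    if not common:
        return float("nan")  # no overlap, so the share is undefined
    same = sum(isclose(a[k], b[k], rel_tol=rel_tol) for k in common)
    return same / len(common)

# Made-up values (tonnes CO2e) for two hypothetical providers.
provider_a = {
    ("FirmA", 2019): 1200.0,
    ("FirmB", 2019): 530.0,
    ("FirmC", 2019): 88.0,
    ("FirmD", 2019): 10.0,  # covered by provider_a only
}
provider_b = {
    ("FirmA", 2019): 1200.0,
    ("FirmB", 2019): 545.0,  # differs from provider_a
    ("FirmC", 2019): 88.0,
    ("FirmE", 2019): 7.0,    # covered by provider_b only
}

print(share_identical(provider_a, provider_b))
```

Only firm-years present in both dictionaries enter the denominator, so differences in coverage are not counted as divergence in this sketch.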

Keywords: Scope 3 emissions, Carbon footprint, Climate finance, Machine learning, transition risk, Errors in variables

JEL Classification: C89, G17, Q51, Q54

Suggested Citation

Nguyen, Quyen and Diaz-Rainey, Ivan and Kitto, Adam and McNeil, Ben and Pittman, Nicholas and Zhang, Renzhu, Scope 3 Emissions: Data Quality and Machine Learning Prediction Accuracy (August 16, 2022). USAEE Working Paper No. 22-562, Available at SSRN: https://ssrn.com/abstract=4191648

Quyen Nguyen

CEFGroup & Department of Accountancy and Finance, University of Otago

P.O. Box 56
Dunedin, Otago 9010
New Zealand

Ivan Diaz-Rainey (Contact Author)

CEFGroup & Department of Accountancy and Finance, University of Otago

Dunedin
New Zealand

Adam Kitto

EMMI

Docklands, Victoria
Australia

Ben McNeil

University of New South Wales (UNSW) - Climate Change Research Centre

Sydney, 2052
Australia

EMMI

Docklands, Victoria
Australia

Nicholas Pittman

EMMI

Docklands, Victoria
Australia

Renzhu Zhang

CEFGroup & Department of Accountancy and Finance, University of Otago

Dunedin
New Zealand
+64274917441 (Phone)
