CLIMATEBERT: A Pretrained Language Model for Climate-Related Text

9 Pages. Posted: 19 Oct 2022

Nicolas Webersinke

Friedrich-Alexander-Universität Erlangen-Nürnberg

Mathias Kraus

University of Erlangen-Nuremberg-Friedrich Alexander Universität Erlangen Nürnberg

Julia Bingler

Council on Economic Policies

Markus Leippold

University of Zurich; Swiss Finance Institute

Date Written: September 26, 2022

Abstract

In recent years, large pretrained language models (LMs) have revolutionized the field of natural language processing (NLP). However, while pretraining on general language has been shown to work very well for common language, niche language has been observed to pose problems. In particular, climate-related texts include specific language that common LMs cannot represent accurately. We argue that this shortcoming of today's LMs limits the applicability of modern NLP to the broad field of processing climate-related texts. As a remedy, we propose ClimateBert, a transformer-based language model that is further pretrained on over 1.6 million paragraphs of climate-related texts, crawled from various sources such as common news, research articles, and climate reporting of companies. We find that ClimateBert leads to a 46% improvement on a masked language model objective, which in turn lowers error rates by 3.57% to 35.71% on various climate-related downstream tasks such as text classification, sentiment analysis, and fact-checking.
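The percentages quoted in the abstract read most naturally as relative reductions, although the abstract itself does not spell out the formula. The following is a minimal sketch of that arithmetic under this assumption. The function name `relative_reduction` and the numeric values are hypothetical, chosen only for illustration, and are not results from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction from `baseline` to `improved`, in percent.

    Assumes "lowering error rates by X%" means (baseline - improved) / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical error rates, NOT taken from the paper.
    baseline_error = 0.20  # error rate of a general-purpose LM (illustrative)
    improved_error = 0.15  # error rate after domain-specific pretraining (illustrative)
    reduction = relative_reduction(baseline_error, improved_error)
    print(f"{reduction:.2f}% relative reduction in error rate")
```

Under this reading, a relative reduction says nothing about the absolute error level: the same percentage can correspond to a large or a small absolute change depending on the baseline.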

Keywords: Climate Finance, Language Model, Fact-Checking, Classification

JEL Classification: G2, G38, C8, M48

Suggested Citation

Webersinke, Nicolas and Kraus, Mathias and Bingler, Julia and Leippold, Markus, CLIMATEBERT: A Pretrained Language Model for Climate-Related Text (September 26, 2022). Available at SSRN: https://ssrn.com/abstract=4229146 or http://dx.doi.org/10.2139/ssrn.4229146

Nicolas Webersinke

Friedrich-Alexander-Universität Erlangen-Nürnberg

Lange Gasse 20
Nürnberg, 90403
Germany

Mathias Kraus

University of Erlangen-Nuremberg-Friedrich Alexander Universität Erlangen Nürnberg

Schloßplatz 4
Erlangen, DE Bavaria 91054
Germany

Julia Bingler

Council on Economic Policies

Zurich
Switzerland

Markus Leippold (Contact Author)

University of Zurich

Rämistrasse 71
Zürich, CH-8006
Switzerland

Swiss Finance Institute

c/o University of Geneva
40, Bd du Pont-d'Arve
CH-1211 Geneva 4
Switzerland
