More than a Feeling: Benchmarks for Sentiment Analysis Accuracy

24 Pages. Posted: 5 Dec 2019. Last revised: 3 Aug 2020.

Mark Heitmann

University of Hamburg

Christian Siebert

University of Hamburg

Jochen Hartmann

University of Hamburg

Christina Schamp

University of Mannheim

Date Written: July 31, 2020

Abstract

The written word is the oldest and most common type of data. Today, mass literacy and cheap technology allow for greater word output per capita than ever before in human history. To keep pace, companies and scholars increasingly depend on automated analyses — not only of what people say (content) but also how they feel (sentiment). This makes it pertinent to understand the accuracy of these automated analyses. While information systems research has produced remarkable leaps of progress, the emphasis has been on innovation rather than evaluation. From an applied perspective, it is not clear whether leaderboard results for selected problems generalize across data sets and domains. In this article, we focus on sentiment analysis methods and assess performance across applications by combining a meta-analysis of 216 comparative computer science publications on 271 unique data sets with experimental evaluations of novel language models. To the best of our knowledge, this constitutes the most comprehensive assessment of sentiment analysis accuracy to date. We find that method choice explains only 10% of the variance in accuracy. Controlling for contextual factors such as data set and paper characteristics increases explanatory power to over 75%, suggesting differences across research problems matter. We find that accuracy of sentiment analysis can indeed approach 95% but can also fall below 50%. This shows that more nuanced benchmarks, rather than best attainable values for selected use cases, are more meaningful for an applied audience. We compute benchmark values that take both methodological choices and application context into account.

Keywords: sentiment analysis, meta-analysis, natural language processing, lexicons, machine learning, transfer learning, language models

Suggested Citation

Heitmann, Mark and Siebert, Christian and Hartmann, Jochen and Schamp, Christina, More than a Feeling: Benchmarks for Sentiment Analysis Accuracy (July 31, 2020). Available at SSRN: https://ssrn.com/abstract=3489963 or http://dx.doi.org/10.2139/ssrn.3489963

Mark Heitmann

University of Hamburg

Allende-Platz 1
Hamburg, 20146
Germany

Christian Siebert (Contact Author)

University of Hamburg

Allende-Platz 1
Hamburg, 20146
Germany

Jochen Hartmann

University of Hamburg

Moorweidenstraße 18
Hamburg, 20148
Germany

Christina Schamp

University of Mannheim

L 7, 3-5
Mannheim, 68161
Germany
