The R Package sentometrics to Compute, Aggregate and Predict with Textual Sentiment
Journal of Statistical Software, Vol. 99, Issue 2, pp. 1-40, 2021
40 Pages
Posted: 11 Nov 2017
Last revised: 19 Aug 2021
Date Written: January 16, 2020
Abstract
We provide a hands-on introduction to optimized textual sentiment indexation using the R package sentometrics. Textual sentiment analysis is increasingly used to unlock the potential information value of textual data. The sentometrics package implements an intuitive framework to efficiently compute sentiment scores of numerous texts, to aggregate the scores into multiple time series, and to use these time series to predict other variables. The workflow of the package is illustrated with a built-in corpus of news articles from two major U.S. journals to forecast the CBOE Volatility Index.
Keywords: aggregation, penalized regression, prediction, R, sentometrics, textual sentiment, time series
JEL Classification: C10, C32, C49, C52, C87, E37