Evons: A Dataset for Fake and Real News Virality Analysis and Prediction

The 29th International Conference on Computational Linguistics (COLING 2022)

8 Pages
Posted: 12 Nov 2022

Kriste Krstovski

Columbia University

Angela Ryu

Columbia University in the City of New York

Bruce Kogut

Columbia University - Sociology/Columbia Business School

Date Written: September 16, 2022

Abstract

We present a novel collection of news articles originating from fake and real news media sources for the analysis and prediction of news virality. Unlike existing fake news datasets, which contain either claims or news article headlines and bodies, each article in this collection is accompanied by a Facebook engagement count, which we treat as an indicator of the article's virality. We also provide the article description and thumbnail image with which the article was shared on Facebook. These images were automatically annotated with object tags and color attributes. Using cloud-based vision analysis tools, the thumbnail images were also analyzed for faces, and detected faces were annotated with facial attributes. We empirically investigate the use of this collection on an example task of article virality prediction.

Keywords: Fake news, real news, virality

Suggested Citation

Krstovski, Kriste and Ryu, Angela and Kogut, Bruce, Evons: A Dataset for Fake and Real News Virality Analysis and Prediction (September 16, 2022). The 29th International Conference on Computational Linguistics (COLING 2022), Available at SSRN: https://ssrn.com/abstract=4221440 or http://dx.doi.org/10.2139/ssrn.4221440

Kriste Krstovski (Contact Author)

Columbia University

New York
United States

Angela Ryu

Columbia University in the City of New York

Bruce Kogut

Columbia University - Sociology/Columbia Business School

3022 Broadway
New York, NY 10027
United States

Paper statistics

Downloads: 10
Abstract Views: 219