Real-Time Prediction of Online False Information Purveyors and their Characteristics
19 Pages. Posted: 12 Jan 2021
Date Written: October 30, 2020
Abstract
Disinformation, misinformation, and other 'fake news' (collectively, false information) are quick and inexpensive to create and distribute in our increasingly digital and connected world. Identifying false information early and cost-effectively can offset some of those operational advantages. In this paper, we develop lightweight machine learning models that use (1) a novel data set tracking browsing behavior and (2) domain registration data, which are available for every website from the moment it is established. Using only the domain registration data, we develop and demonstrate a machine learning classifier that identifies, at the time of registration, domains that will go on to produce false information. We then combine the registration data with our browsing data and develop a machine learning classifier that identifies the false information domains whose content is associated with the highest levels of consumption. Finally, we use our data to identify false information domains that will cease operations after an event of interest, in our case the 2016 U.S. presidential election. We theorize that this last category involves actors seeking primarily to manipulate perceptions and outcomes of that event.
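To make the registration-time classifier concrete, the following is a minimal sketch in Python, not the authors' model. The abstract does not state which features or learning algorithm the paper uses, so everything here is an assumption for illustration: the record fields (`domain`, `private_registrant`, `term_years`), the four derived features, the choice of plain logistic regression trained by stochastic gradient descent, and the toy records, which are made up and are not data from the paper.

```python
import math

def registration_features(record):
    """Map a registration record (a plain dict) to numeric features.

    The field names and features are hypothetical, chosen only to
    illustrate signals that exist at registration time.
    """
    name = record["domain"].split(".")[0]
    return [
        len(name) / 20.0,                        # scaled length of the name
        sum(ch.isdigit() for ch in name) / 5.0,  # scaled count of digits
        1.0 if record["private_registrant"] else 0.0,
        record["term_years"] / 10.0,             # scaled registration term
    ]

def predict_proba(weights, bias, x):
    """Logistic model: probability that the domain is a false information site."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, labels, epochs=2000, lr=0.5):
    """Fit the weights by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

# Made-up toy records (label 1 = false information domain); NOT paper data.
TOY = [
    ({"domain": "dailynews24now.com", "private_registrant": True, "term_years": 1}, 1),
    ({"domain": "realtruth2016.info", "private_registrant": True, "term_years": 1}, 1),
    ({"domain": "citygazette.com", "private_registrant": False, "term_years": 5}, 0),
    ({"domain": "harbortimes.org", "private_registrant": False, "term_years": 10}, 0),
]

rows = [registration_features(record) for record, _ in TOY]
labels = [label for _, label in TOY]
weights, bias = train(rows, labels)

# Score a newly registered (hypothetical) domain at registration time.
new_record = {"domain": "breaking4u.net", "private_registrant": True, "term_years": 1}
score = predict_proba(weights, bias, registration_features(new_record))
print(f"estimated probability: {score:.3f}")
```

The point of the sketch is only that every input is known on the day a domain is registered, which is what allows scoring before any content is published. A realistic model would need far more data, held-out evaluation, and features drawn from actual registration records.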