A Model of Fake Data in Data-driven Analysis
22 Pages. Posted: 14 Nov 2018
Date Written: October 23, 2018
Data-driven analysis is increasingly used in decision-making processes. As more sources, including reviews, news, and photos, become available for data analysis, the authenticity of those sources is increasingly in doubt. While previous literature attempted to detect fake data piece by piece, in the current work we try to capture the fake data sender's strategic behavior in order to detect the fake data source. Specifically, we model the tension between a data receiver who makes data-driven decisions and a fake data sender who benefits from misleading the receiver. We propose a potentially infinite-horizon, continuous-time game-theoretic model with asymmetric information to capture the fact that the receiver does not initially know that fake data exists and learns about it during the course of the game. We use point processes to model the data traffic, so that each piece of data arrives at a discrete instant in continuous time. We fully solve the model and use numerical examples to illustrate the players' strategies and payoffs. In particular, our results show that maintaining some suspicion about the data sources can be very helpful to the data receiver.
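As an illustration of the point-process view of data traffic described above, the following minimal Python sketch simulates arrival times in continuous time. The homogeneous Poisson process, the function name simulate_arrivals, and the rates are assumptions made for illustration only; the abstract does not state which point process the model uses.

```python
import random


def simulate_arrivals(rate, horizon, seed=None):
    """Return sorted arrival times of a homogeneous Poisson process on [0, horizon).

    Hypothetical helper: the Poisson assumption is ours, not the paper's.
    """
    rng = random.Random(seed)
    times = []
    # Inter-arrival gaps of a Poisson process are exponential with mean 1/rate.
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


# Illustrative rates (assumed): genuine data arrives more often than fake data.
genuine = simulate_arrivals(rate=2.0, horizon=10.0, seed=1)
fake = simulate_arrivals(rate=0.5, horizon=10.0, seed=2)

# The receiver observes one merged stream; the labels are hidden in the game
# and are kept here only to inspect the simulated traffic.
merged = sorted([(t, "genuine") for t in genuine] + [(t, "fake") for t in fake])
```

Each entry of merged is a single piece of data arriving at one instant, which is the sense in which data items are discrete events on a continuous time line.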
Keywords: fake data, data-driven analysis, game theory