The Information Paradox
4 Pages Posted: 10 Jul 2019
Date Written: July 8, 2019
The following paradox rests on the observation that the value of a statistical datum is not, in itself, useful information; it becomes useful only when it is possible to prove that it was not obtained by chance. In practice, the probability of obtaining the same result randomly must be very low for the result to be considered useful. It follows that the value of a statistical datum is absolute, whereas the evaluation of its usefulness is relative, because it depends on the actions that were performed to obtain it. Two people who analyze the same event under the same conditions, but follow two different procedures, will find the same value for a statistical parameter, yet they will judge its importance differently, because that judgment depends on the procedure used. This can produce a situation like the one described in this paradox, where in one case it is practically certain that the statistical datum is useful, while in the other the same datum turns out to be completely devoid of value. The paradox draws attention to the importance of the procedure used to extract statistical information: the way in which we act affects the probability of obtaining the same result randomly and, consequently, the evaluation of the statistical parameter.
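The dependence on procedure can be illustrated with a minimal sketch. It is not the construction used in the paper; the scenario (18 correct forecasts out of 20, a chance level of 0.5, and 50,000 independent rules tried by the second analyst) and the function names `tail_probability` and `best_of_m_probability` are assumptions chosen only for illustration. The first analyst tests a single rule fixed in advance; the second reports the best rule found after a large search. Both report the same value, but the probability of reaching it by chance differs greatly.

```python
from math import comb


def tail_probability(n: int, k: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more hits by luck alone."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def best_of_m_probability(single: float, m: int) -> float:
    """Chance that at least one of m independent random attempts does as well."""
    return 1.0 - (1.0 - single) ** m


# Hypothetical numbers, assumed for illustration only.
n_trials, n_hits, n_rules_tried = 20, 18, 50_000

# Analyst A: one rule, chosen before looking at the data.
p_single = tail_probability(n_trials, n_hits)

# Analyst B: same observed value, but it is the best of many rules tried.
p_search = best_of_m_probability(p_single, n_rules_tried)

print(f"one pre-specified rule:   {p_single:.6f}")
print(f"best of {n_rules_tried} tried rules: {p_search:.6f}")
```

Under these assumptions the first probability is far below 1%, so the datum is informative, while the second is close to 1, so the same datum carries almost no evidence. The independence of the tried rules is a simplification; correlated rules would lower the second probability without removing the effect.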
Keywords: big data, overfitting, data analytics
JEL Classification: C1