On Information Quality
Journal of the Royal Statistical Society, Series A, Forthcoming
34 Pages. Posted: 13 Aug 2012
Date Written: August 13, 2012
We define the concept of Information Quality (InfoQ) as the potential of a dataset to achieve a specific (scientific or practical) goal using a given empirical analysis method. InfoQ is different from data quality and analysis quality, but is dependent on these components and on the relationship between them. We survey statistical methods for increasing InfoQ at the study-design and post-data-collection stages, and consider them relative to what we define as InfoQ. We propose eight dimensions that help assess InfoQ: Data Resolution, Data Structure, Data Integration, Temporal Relevance, Generalizability, Chronology of Data and Goal, Construct Operationalization, and Communication. We demonstrate the concept of InfoQ, its components (what it is) and assessment (how it is achieved) through three case studies in online auctions research. We suggest that formalizing the concept of InfoQ can help increase the value of statistical analysis and data mining, both methodologically and practically, thus contributing to a general theory of applied statistics.
Keywords: data, statistical modeling, data analytics, data mining, study design, study goal, data quality