Date Written: April 23, 2012


We define the concept of Information Quality (InfoQ) as the potential of a dataset to achieve a specific (scientific or practical) goal using a given empirical analysis method. InfoQ is different from data quality and analysis quality, but depends on these components and on the relationship between them. We survey statistical methods for increasing InfoQ at the study-design and post-data-collection stages, and consider them relative to our definition of InfoQ. We propose eight dimensions that help assess InfoQ: Data Resolution, Data Structure, Data Integration, Temporal Relevance, Generalizability, Chronology of Data and Goal, Construct Operationalization, and Communication. We demonstrate the concept of InfoQ, its components (what it is) and assessment (how it is achieved) through three case studies in online auctions research. We suggest that formalizing the concept of InfoQ can help increase the value of statistical analysis and data mining, both methodologically and practically, thus contributing to a general theory of applied statistics.

Keywords: data, statistical modeling, data analytics, data mining, study design, study goal, data quality

Kenett, Ron S. and Shmueli, Galit, On Information Quality (April 23, 2012). Robert H. Smith School Research Paper No. RHS 06-100. Available at SSRN: https://ssrn.com/abstract=1464444 or http://dx.doi.org/10.2139/ssrn.1464444

