Measuring Data Believability: A Provenance Approach

12 Pages Posted: 19 Dec 2007

Nicolas Prat

ESSEC Business School

Stuart Madnick

Massachusetts Institute of Technology (MIT) - Sloan School of Management

Date Written: December 2007

Abstract

Data quality is crucial for operational efficiency and sound decision making. This paper focuses on believability, a major aspect of quality, measured along three dimensions: trustworthiness, reasonableness, and temporality. We ground our approach in provenance, i.e., the origin and subsequent processing history of data. We present our provenance model and our approach for computing believability based on provenance metadata. The approach is structured into three increasingly complex building blocks: (1) definition of metrics for assessing the believability of data sources, (2) definition of metrics for assessing the believability of data resulting from one process run, and (3) assessment of believability based on all the sources and processing history of data. We illustrate our approach with a scenario based on Internet data. To our knowledge, this is the first work to develop a precise approach to measuring data believability and to make explicit use of provenance-based measurements.

Keywords: data quality, provenance metadata

Suggested Citation

Prat, Nicolas and Madnick, Stuart E., Measuring Data Believability: A Provenance Approach (December 2007). MIT Sloan Research Paper No. 4672-07. Available at SSRN: https://ssrn.com/abstract=1075723 or http://dx.doi.org/10.2139/ssrn.1075723

Nicolas Prat (Contact Author)

ESSEC Business School

3 Avenue Bernard Hirsch
CS 50105 CERGY
CERGY, CERGY PONTOISE CEDEX 95021
France

Stuart E. Madnick

Massachusetts Institute of Technology (MIT) - Sloan School of Management

E53-321
Cambridge, MA 02142
United States
617-253-6671 (Phone)
617-253-3321 (Fax)
