Evaluating and Aggregating Data Believability across Quality Sub-Dimensions and Data Lineage

8 Pages Posted: 19 Dec 2007 Last revised: 6 Jan 2009

Nicolas Prat

ESSEC Business School

Stuart Madnick

Massachusetts Institute of Technology (MIT) - Sloan School of Management

Date Written: December 1, 2007

Abstract

Data quality is crucial for operational efficiency and sound decision making. This paper focuses on believability, a major aspect of data quality. The issue of believability is particularly relevant in the context of Web 2.0, where mashups facilitate the combination of data from different sources. Our approach to assessing data believability is based on provenance and lineage, i.e., the origin and subsequent processing history of data. We present the main concepts of our model for representing and storing data provenance, and an ontology of the sub-dimensions of data believability. We then use aggregation operators to compute believability across these sub-dimensions and across the provenance of data. We illustrate our approach with a scenario based on Internet data. Our contribution lies in three main design artifacts: (1) the provenance model, (2) the ontology of believability sub-dimensions, and (3) the method for computing and aggregating data believability. To our knowledge, this is the first work to operationalize provenance-based assessment of data believability.

Keywords: Quality Sub-Dimensions, Data Lineage

Suggested Citation

Prat, Nicolas and Madnick, Stuart E., Evaluating and Aggregating Data Believability across Quality Sub-Dimensions and Data Lineage (December 1, 2007). MIT Sloan Research Paper No. 4670-07, Available at SSRN: https://ssrn.com/abstract=1075722 or http://dx.doi.org/10.2139/ssrn.1075722

Nicolas Prat

ESSEC Business School

3 Avenue Bernard Hirsch
CS 50105 CERGY
CERGY, CERGY PONTOISE CEDEX 95021
France

Stuart E. Madnick (Contact Author)

Massachusetts Institute of Technology (MIT) - Sloan School of Management

E53-321
Cambridge, MA 02142
United States
617-253-6671 (Phone)
617-253-3321 (Fax)
