CISL Working Paper No. 2005-08
20 Pages
Posted: 20 Oct 2005
Date Written: October 2005
Data quality issues have taken on increasing importance in recent years. In our research, we have discovered that many data quality problems are actually data misinterpretation problems, that is, problems caused by heterogeneous data semantics. In this paper, we first identify semantic heterogeneities that, when not resolved, often cause data quality problems. We discuss the especially challenging problem of aggregational ontological heterogeneity, which concerns how complex entities and their relationships are aggregated. Then we illustrate how COntext INterchange (COIN) technology can be used to capture data semantics and reconcile semantic heterogeneities, thereby improving data quality.
Keywords: Data Quality, Data Semantics, Semantic Heterogeneity, Ontology, Context
Suggested Citation:
Madnick, Stuart and Zhu, Hongwei (Harry), Improving Data Quality Through Effective Use of Data Semantics (DKE) (October 2005). MIT Sloan Working Paper No. 4558-05. Available at SSRN: https://ssrn.com/abstract=825650 or http://dx.doi.org/10.2139/ssrn.825650