DIRECT: A System for Mining Data Value Conversion Rules from Disparate Data Sources

51 Pages Posted: 6 Feb 2003  

Weiguo Fan

Virginia Polytechnic Institute & State University - Department of Accounting and Information Systems

Hongjun Lu

National University of Singapore (NUS) - School of Computing

Stuart Madnick

Massachusetts Institute of Technology (MIT) - Sloan School of Management

David W. Cheung

The University of Hong Kong - Department of Computer Science and Information Systems

Date Written: November 2001

Abstract

The successful integration of data from autonomous and heterogeneous systems calls for the resolution of semantic conflicts that may be present. Such conflicts are often reflected by discrepancies in attribute values of the same data object. In this paper, we describe a recently developed prototype system, DIRECT (DIscovering and REconciling ConflicTs). The system mines data value conversion rules in the process of integrating business data from multiple sources. The system architecture and functional modules are described. The process of discovering conversion rules from sales data of a trading company is presented as an illustrative example.

Keywords: Data Integration, Data Mining, Semantic Conflicts, Data Visualization, Statistical Analysis, Data Value Conversion

Suggested Citation

Fan, Weiguo and Lu, Hongjun and Madnick, Stuart and Cheung, David W., DIRECT: A System for Mining Data Value Conversion Rules from Disparate Data Sources (November 2001). MIT Sloan Working Paper No. 4411-01. Available at SSRN: https://ssrn.com/abstract=377900 or http://dx.doi.org/10.2139/ssrn.377900

Weiguo Fan

Virginia Polytechnic Institute & State University - Department of Accounting and Information Systems

Pamplin College of Business
3007 Pamplin Hall
Blacksburg, VA 24061
United States
540-231-6588 (Phone)

HOME PAGE: http://www.cob.vt.edu/acis/faculty/wfan/

Hongjun Lu

National University of Singapore (NUS) - School of Computing

3 Science Drive 2
Singapore 117543
Singapore

Stuart E. Madnick (Contact Author)

Massachusetts Institute of Technology (MIT) - Sloan School of Management

E53-321
Cambridge, MA 02142
United States
617-253-6671 (Phone)
617-253-3321 (Fax)

David Wai-lok Cheung

The University of Hong Kong - Department of Computer Science and Information Systems

Room 301, Chow Yei Ching Building
Pokfulam Road
Hong Kong

Paper statistics

Downloads: 191
Rank: 129,292
Abstract Views: 2,557