Stuart Madnick's
Scholarly Papers
Aggregate Statistics
Total Downloads: 11,677
Total Citations: 109
1.
Mark D. Hansen, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Michael Siegel, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 06 Feb 03
Last Revised: 06 Jan 06
Downloads: 710 (8,648)
Citations: 2
Abstract:
In this paper we examine the opportunities for data integration in the context of the emerging Web Services systems development paradigm. The paper introduces the programming standards associated with Web Services and provides an example of how Web Services can be used to unlock heterogeneous business systems to extract and integrate business data. We provide an introduction to the problems and research issues encountered when applying Web Services to data integration. We provide a formal definition of aggregation (as a type of data integration) and discuss the impact of Web Services on aggregation. We show that Web Services will make the development of systems for aggregation both faster and less expensive. A system architecture for Web Services-based aggregation is presented that is representative of products available from software vendors today. Finally, we highlight some of the challenges facing Web Services that are not currently being addressed by standards bodies or software vendors. These include context mediation, trusted intermediaries, quality and source selection, licensing and payment mechanisms, and systems development tools. We suggest some research directions for each of these challenges.
Keywords: Data Integration, Web Services Systems Development
2.
Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 14 Nov 01
Last Revised: 07 Jan 06
Downloads: 519 (13,553)
Citations: 8
Abstract:
The eXtensible Markup Language (XML) offers many important benefits and improvements over its predecessor, HTML. But articles have appeared about XML with exaggerated claims of it being a "Rosetta Stone" with "miraculous ways" to almost automatically provide information integration. These claims are actually being believed by some executives. It is almost surprising that no one has claimed that XML can cure cancer and provide world peace! In reality, XML must face many of the same challenges that plagued Electronic Data Interchange (EDI) and database integration efforts of the past. To a large extent, there are both managerial and technical challenges, much related to the difficulties of attaining universally accepted, semantically rich standards. In this paper, these challenges will be discussed with specific emphasis on the issue of dealing with a real world with multiple "contexts." Some promising research directions, some overlapping with the "semantic web" effort, will be presented.
3.
Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Nazli Choucri, Massachusetts Institute of Technology (MIT); Michael Siegel, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Farnaz Haghseta, Massachusetts Institute of Technology (MIT); Allen Moulton, Massachusetts Institute of Technology (MIT); Hongwei (Harry) Zhu, Massachusetts Institute of Technology (MIT)
Posted: 28 Apr 02
Last Revised: 07 May 02
Downloads: 417 (18,237)
Abstract:
The convergence of three distinct but interconnected trends - unrelenting globalization, growing worldwide electronic connectivity, and increasing knowledge intensity of economic activity - is creating powerful new opportunities and challenges for global politics. This rapidly changing environment has information demands that surpass existing capabilities for information access, interpretation, and overall use, thus hindering our abilities to address emergent and complex global challenges, such as terrorism and other security threats. This reality has serious implications for two diverse domains of scholarship: international relations (IR) in political science and information technology (IT). Unless IT advances remain "one step ahead" of emergent realities and complexities, strategies for better understanding and responding to critical global challenges will be severely impeded. For example, more so now than ever, the U.S. Office of Counter-Terrorism and the newly-created Office of Homeland Security rely on intelligence information from all over the world to develop strategic responses to security threats. However, relevant information is stored in various regions throughout the world and by diverse agencies in different media, formats, and contexts. Intelligent integration of information is fundamental to developing policies to anticipate and strengthen protection against terrorist threats or attacks in the United States. This Project's activities, and relationships with its collaborators, will be coordinated through a newly formed joint Laboratory for Information Globalization and Harmonization Technologies (LIGHT). LIGHT will address information needs in the IR domain, focusing on the conflict realm, which deals with emergent risks, threats, and uncertainties of potentially global scale and scope related to: (a) crises, (b) conflicts and war; and (c) anticipation, monitoring and early warning. 
The goals of this initiative are to: (1) improve understanding of the types of IR information needs for decision making and institutional performance under varying degrees of risk and uncertainty; (2) design and implement the System for Harmonized Information Processing, to facilitate access to and correct interpretation of essential information that is critical to policy and research in the IR realm, as well as to other similarly complex domains, and (3) advance developments in the use of information technologies to facilitate such interdisciplinary research and to contribute to new education approaches, tools, and methods. Increasingly, addressing problems central to national and global interests in complex domains such as IR requires the use of technologies that easily combine observations from disparate sources, using different interpretations, for different purposes, and by a wide range of users. Critical advances in IT capabilities must span multiple domains (e.g., economic, political, geographic, commercial, and demographic), diverse contexts (i.e., meanings, languages, assumptions), and a multiplicity of contending agents (i.e., states, governments, corporations, international institutions). The technology-related research will focus on acquiring and enhancing information to serve user requirements both over individual domains (i.e., a single shared ontology) and across multiple domains, which are necessary for addressing complex challenges. The core innovation is reflected in the notion of a Collaborative Domain Space (CDS), within which applications in a common domain can share, analyze, modify, and develop information. For applications that span multiple domains we provide for a Collection of CDSs to link shared concepts in distinct domains. 
Moreover, we will develop the System for Harmonized Information Processing that incorporates CDSs as a basis for knowledge representation and includes all the necessary reasoning algorithms required to support information processing over a range of heterogeneous sources and applications. The development of the system described above builds upon prior work. The political science IR work will draw on an earlier Internet-based experimental "platform" for exploring forms of information generation, provision, and integration across multiple domains, regions, languages, and epistemologies which are relevant to complex but domain-specific applications, the Global System for Sustainable Development (GSSD). The IT component builds on work on the Context Interchange project (COIN) focused on the integration of a range of distributed heterogeneous information sources (e.g., financial, supply chain, disaster relief) using ontologies, databases, context mediation algorithms, and wrapper technologies. Both groups have considerable experience with the organization and management of large scale, international, distributed, and diverse research projects, including cross-national (e.g., China, Middle East, Europe) and institutional (private, public, national and international) agencies. The anticipated results will apply to any complex domain with multiple entities that rely on heterogeneous distributed data to address and resolve compelling problems. This initiative is supported by a network of international collaborators from (a) scientific and research institutions, (b) business and industry, and (c) national and international agencies. Expected research products include: a software platform, IR-based knowledge repository, and diverse applications in policy, research, and education which are anticipated to significantly impact the way complex organizations, and society in general, understand and manage critical global challenges.
4.
Richard Y. Wang, Massachusetts Institute of Technology (MIT); Thomas J. Allen, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Wesley Harris, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 06 Feb 03
Last Revised: 07 Jan 06
Downloads: 354 (22,438)
Citations: 1
Abstract:
To fight terrorism successfully, the quality of data must be considered to avoid garbage-in-garbage-out. Research has shown that data quality (DQ) goes beyond accuracy to include dimensions such as believability, timeliness, and accessibility. In collecting, processing, and analyzing a much broader array of data than we do currently, therefore, a comprehensive approach must be developed to ensure that DQ is incorporated in determining the most probable current or future scenario for preemption, national security warning, and decision making. Additional data, such as who the data source was, when the data were made available, how, where, and why, also need to be included to judge the quality of the information assembled from these data. We propose such an approach for Total Information Awareness with Quality (TIAQ), which includes concepts, models, and tools. Central to our approach is managing information as a product according to four principles. We have applied the information product approach to research sites where opportunities arise. For example, the Air Force Materiel Command uses requirements definition and forecasting processes to perform a number of functions. However, the Air Force experienced several complex problems due to DQ problems; as a result, fuel pumps were unavailable. Each engine needs a fuel pump; when a pump is not available, a military aircraft is grounded. We traced the fuel pump throughout the process of remanufacture and identified root causes such as delays by pump contractors and ordering problems. To a certain extent, detecting foreign terrorists and deciphering their plots is analogous to tracing fuel pumps. Our research provides an interdisciplinary approach to facilitating Total Information Awareness.
Keywords: Total Information Awareness (TIA), Total Information Awareness with Quality (TIAQ), Data Quality (DQ), Information Product Map (IPMap), Quality Entity Relationship (QER)
5.
Hiroshi Fujii, Massachusetts Institute of Technology (MIT); Taeko Okano, Massachusetts Institute of Technology (MIT); Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Michael Siegel, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 06 Feb 03
Last Revised: 31 Mar 03
Downloads: 347 (23,004)
Abstract:
Many financial institutions have built websites to inform and attract customers. Financial aggregation presents an opportunity by which they can build stronger relationships with customers. For example, financial account aggregation services began in the United States but are now widely used in other countries. In this paper, we first classify aggregator types and their methods for implementing their services. Second, we explain the differences between financial account relationship aggregation services in the U.S. and in Asia-Pacific countries. We then discuss the status of financial comparison aggregation services and related issues. Owing to the popularity of WAP phones and mobile phone service in Asia-Pacific, we also look into the development of mobile aggregation services. Finally, we examine future directions for aggregators in conjunction with universal and global banking concepts.
Keywords: Financial Institution, Aggregation Service, Universal Banking, Global Banking
6.
Vincent Maugis, Massachusetts Institute of Technology (MIT) - Department of Political Science; Nazli Choucri, Massachusetts Institute of Technology (MIT); Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Michael Siegel, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Sharon E. Gillett, Massachusetts Institute of Technology (MIT); Farnaz Haghseta, Massachusetts Institute of Technology (MIT); Hongwei (Harry) Zhu, Massachusetts Institute of Technology (MIT); Mike NMI Best, Massachusetts Institute of Technology (MIT) - Center for Technology, Policy, and Industrial Development (CTPID)
Posted: 26 Apr 04
Last Revised: 10 Apr 05
Downloads: 340 (23,619)
Abstract:
With the rapid diffusion of the Internet worldwide, there has been considerable interest in the e-potentials of developing countries, giving rise to a first generation of e-Readiness studies. Moreover, e-Readiness means different things to different people, in different contexts, and for different purposes. Despite strong merits, this first generation of e-Readiness studies assumed a fixed, one-size-fits-all set of requirements, regardless of the characteristics of individual countries, the investment context, or the demands of specific applications. This feature obscures critical information for investors or policy analysts seeking to reduce uncertainties and/or make more educated decisions. But there is very little known about e-Readiness for e-Banking. In particular, based on lessons learnt to date and their implications for emerging realities of the 21st century, we designed and executed a research project with theoretical as well as practical dimensions to answer the question of e-Readiness for What, focusing specifically on e-Banking, based on the very assumption that one size can seldom, if ever, fit all. We propose and develop a conceptual framework for the "next generation" e-Readiness, focusing on different e-Business applications in different economic contexts with potentially different pathways, as well as a data model, to explore e-Readiness for e-Banking in ten countries.
Keywords: e-readiness assessment, value-creation opportunities, e-Banking, banking, pathways, profiles, leapfrogging
7.
Mark D. Hansen, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Michael Siegel, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 06 Jan 03
Last Revised: 06 Jan 06
Downloads: 336 (24,057)
Abstract:
This paper examines the opportunities and challenges related to data and process integration architectures in the context of Web Services. A primary goal of most enterprises in today's economic environment is to improve productivity by streamlining and aggregating business processes. This paper illustrates how integration architectures based on Web Services offer new opportunities to improve productivity that are expedient and economical. First, the paper introduces the technical standards associated with Web Services and provides a business example for illustration. Abstracting from this example, we introduce a concept we call Process Aggregation that incorporates data aggregation and workflow to improve productivity. We show that Web Services will have a major impact on Process Aggregation, making it both faster and less expensive to implement. Finally, we suggest some research directions relating to the Process Aggregation challenges facing Web Services that are not currently being addressed by standards bodies or software vendors. These include context mediation, trusted intermediaries, quality and source selection, licensing and payment mechanisms, and systems development tools.
Keywords: Process Aggregation, Web Services, Data Aggregation, Streamlining, Business Process
8.
Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Michael Siegel, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 28 Apr 02
Last Revised: 28 Apr 02
Downloads: 332 (24,304)
Citations: 7
Abstract:
This paper examines the development of web aggregators, entities that collect information from a wide range of sources, with or without prior arrangements, and add value through post-aggregation services. New Web-page extraction tools, context sensitive mediators, and agent technologies have greatly reduced the barriers to constructing aggregators. We predict that aggregators will soon emerge in industries where they were not formerly present. Through studying over a hundred existing and emerging aggregators, we present a model for understanding the aggregator's strategic interaction with existing organizations. We also suggest different ways that businesses can take advantage of the new opportunities presented. Finally, we provide valuable insights to all organizations concerning the issues, impacts, and actions required as aggregators become increasingly present.
9.
Chander K. Velu, University of Cambridge - Judge Business School; Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Marshall W. Van Alstyne, Boston University - Department of Management Information Systems
Posted: 09 Nov 05
Last Revised: 28 Oct 08
Downloads: 294 (28,082)
Abstract:
This paper addresses the tension between the benefits of centralized data control and the benefits of decentralized control at the level of the business unit. Centralized data control provides the benefit of uniform standards, whereas business unit data control grants flexibility to react to rapidly changing environments. Many data standardization efforts fail because they do not fully take into account the value of flexibility and ownership incentives. We use a real options based framework and the theory of incomplete contracts to derive propositions about the optimal level of data standardization across the enterprise. Applications of the propositions are illustrated with case vignettes. The paper makes two main contributions. First, the approach defines formally how incentive structures influence ownership of the option value or value of flexibility, which is an intangible information asset. Second, the derived propositions would help senior management to consider more precisely how to align incentives in data standardization exercises.
Keywords: economics of IS, outsourcing, enterprise systems, real options, incomplete contracts, standardization, information asset, flexibility and information systems decentralization
10.
Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Richard Y. Wang, Massachusetts Institute of Technology (MIT); Frank Dravis, Firstlogic Inc.; Xinping Chen, Bose Corporation
Posted: 04 Jan 03
Last Revised: 04 Jan 03
Downloads: 272 (30,714)
Citations: 9
Abstract:
Corporate household data refers not only to the strict hierarchical structure about and within the corporation, but also to the variety of inter-organizational relationships. It is becoming increasingly important for many purposes, ranging from CRM and ERP applications to risk management, supply chain management, and marketing. We propose conceptual definitions for corporate household, corporate household knowledge, and corporate household knowledge processor. After describing research challenges and conceptual definitions, we summarize current practices and approaches. We then present a two-part plan: (1) continue our qualitative research to describe the various sources, views, and purposes for corporate household data, including the rules used in each case; (2) apply the context interchange theory to represent the corporate household data and underlying knowledge and enable the context mediation technology to correctly understand and reason about both the context of the sources and the context of the user's query about corporate household data.
11.
Aykut Firat, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Benjamin Grosof, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 25 Oct 02
Last Revised: 07 Jan 06
Downloads: 268 (31,213)
Citations: 4
Abstract:
The shift towards global networking brings with it many opportunities and challenges. In this paper, we discuss key technologies in achieving global semantic interoperability among heterogeneous information systems, including both traditional and web data sources. In particular, we focus on the importance of this capability and technologies we have designed to overcome ontological heterogeneity, a common type of disparity in financial information systems. Our approach to representing and reasoning with ontological heterogeneities in data sources is an extension of the Context Interchange (COIN) framework, a mediator-based approach for achieving semantic interoperability among heterogeneous sources and receivers. We also analyze the issue of ontological heterogeneity in the context of source-selection, and offer a declarative solution that combines symbolic solvers and mixed integer programming techniques in a constraint logic-programming framework. Finally, we discuss how these techniques can be coupled with emerging Semantic Web related technologies and standards such as Web-Services, DAML+OIL, and RuleML, to offer scalable solutions for global semantic interoperability. We believe that the synergy of database integration and Semantic Web research can make significant contributions to the financial knowledge integration problem, which has implications in financial services, and many other e-business tasks.
Keywords: Database Integration, Semantic Web, Ontologies, Source Selection
12.
Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Richard Y. Wang, Massachusetts Institute of Technology (MIT); Krishna Chettayar, Independent; Frank Dravis, Firstlogic Inc.; James Funk, Independent; Raïssa Katz-Haas, Independent; Cindy Lee, Massachusetts Institute of Technology (MIT); Yang Lee, Massachusetts Institute of Technology (MIT); Xiang Xian, Massachusetts Institute of Technology (MIT) - Department of Electrical Engineering; Sumit Bhansali, Massachusetts Institute of Technology (MIT)
Posted: 27 Apr 04
Last Revised: 13 Jul 08
Downloads: 264 (31,725)
Citations: 6
Abstract:
Corporate household (CHH) refers to the organizational information about the structure within the corporation and a variety of inter-organizational relationships. Knowledge derived from this data is becoming increasingly important for improving data quality in applications, such as Customer Relationship Management (CRM), Enterprise Resource Planning (ERP), Supply Chain Management (SCM), risk management, and sales and market promotion. Extending the concepts from our previous CHH research, we exemplify in this paper the importance of improved corporate household knowledge and processing in various business application areas. Additionally, we provide examples of CHH business rules that are often implicit and fragmented - understood and practiced by different domain experts across functional areas of the firm. This paper is intended to form a foundation for further research to systematically investigate, capture, and build a body of corporate householding knowledge across diverse business applications.
Keywords: Corporate Householding, Data Quality, Organizational Structures, Interdependence, Name Matching, Entity Aggregation, Information Quality, Account Consolidation, Conflict of Interest, Risk Management, Customer Relationship Management (CRM), Supply Chain Management (SCM), Regulation and Disclosure
13.
Allen Moulton, Massachusetts Institute of Technology (MIT); Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Michael Siegel, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 14 Nov 01
Last Revised: 07 Jan 06
Downloads: 239 (35,416)
Abstract:
We use a context interchange mediation approach for detecting and resolving data quality and semantic integrity conflicts in information exchanged across organizational boundaries. Context models draw on a domain ontology to explain how source and receiver data models implement general principles of the subject domain. Using the declarative knowledge from the domain ontology and context models, the mediator writes a query plan meeting receiver semantic requirements from autonomous, heterogeneous sources. Examples drawn from fixed income securities investments illustrate problems and solutions enabled by context interchange mediation.
14.
Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 20 Aug 03
Last Revised: 20 Aug 03
Downloads: 234 (36,236)
Citations: 1
Abstract:
Data quality issues have taken on increasing importance in recent years. In our research, we have discovered that many "data quality" problems are actually "data misinterpretation" problems - that is, problems with data semantics. In this paper, we first illustrate some examples of these problems and then introduce a particular semantic problem that we call "corporate householding." We stress the importance of "context" to get the appropriate answer for each task. Then we propose an approach to handle these tasks using extensions to the COntext INterchange (COIN) technology for knowledge storage and knowledge processing.
Keywords: Data Quality, Data Semantics, Corporate Householding, COntext INterchange, Knowledge Management
15.
Hongwei (Harry) Zhu, Massachusetts Institute of Technology (MIT); Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management
Posted: 19 Dec 07
Last Revised: 23 Apr 08
Downloads: 228 (37,275)
Abstract:
As an open standard for electronic communication of business and financial data, XBRL has the potential of improving the efficiency of the business data supply chain. A number of jurisdictions have developed different XBRL taxonomies as their data standards. Semantic heterogeneity exists in these taxonomies, the corresponding instances, and the internal systems that store the original data. Consequently, there are still substantial difficulties in creating and using XBRL instances that involve multiple taxonomies. To fully realize the potential benefits of XBRL, we have to develop technologies to reconcile semantic heterogeneity and enable interoperability of various parts of the supply chain. In this paper, we analyze the XBRL standard and use examples of different taxonomies to illustrate the interoperability challenge. We also propose a technical solution that incorporates schema matching and context mediation techniques to improve the efficiency of the production and consumption of XBRL data.
Keywords: XBRL, semantic data integration, context mediation, ontology, schema matching
16.
Steven Y. Tu, Soochow University; Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Luis Chin-Jung Wu, Savi Technology
Posted: 13 Apr 04
Last Revised: 10 Apr 05
Downloads: 219 (38,871)
Citations: 1
Abstract:
UccNet is a globally centralized B2B electronic data platform for storing trading product item information, hosted by the non-profit international standardization institute EAN-UCC. It is an emerging B2B data communication standard for the retail industry with significant potential impact. Many US retailers are requesting compulsory subscription from their international suppliers by the year-end of either 2004 or 2005, and many major IT software providers and consulting firms specializing in supply chain management are preparing packaged services and solutions for this imminent demand. In light of the increasing importance of UccNet on both the technology and application sides, this paper advances the following argument: although UccNet establishes an architectural framework to resolve the many-to-many connectivity issue and the data synchronization issue through a centralized product database and a uniform numbering system (i.e., Global Trade Item Numbering), context discrepancy issues remain to be addressed. We show with a real case study that context discrepancy is inherent in the international trading applications where UccNet is intended to be used. Naturally, international trading partners tend to define and describe product item information differently. That difference, whether due to culture or geographical location, is not considered in the original design of UccNet. As an example, the attribute "width" contained in the database schema of UccNet would be filled in by a China-based supplier in meters and yet be interpreted as feet by the US retail buyer. We show how the Context Interchange Framework, operating under the rationale of local autonomy and addressing the context mediation issue, can be incorporated into the existing UccNet framework to constitute theoretically a more complete technical solution and practically a more useful B2B supply chain business solution.
Keywords: B2B, Retail Supply-Chain, UccNet, Data Connectivity and Synchronization, Context Interchange, Data Semantics
17.
Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Richard Y. Wang, Massachusetts Institute of Technology (MIT)
Posted: 25 Sep 01
Last Revised: 13 Mar 02
Downloads: 217 (39,234)
Citations: 2
Abstract:
Corporate household refers not only to the strict hierarchical structure within the corporation, but also to the variety of inter-organizational relationships. We propose a conceptual definition for corporate household, corporate household knowledge, and corporate household knowledge processor. After describing research challenges and conceptual definitions, we summarize potential solution approaches. Our research plan is twofold: (1) continue to document the various sources, views, and purposes for corporate household knowledge, including the rules used in each case; (2) extend the context interchange framework to represent the corporate household knowledge and enable the context mediation technology to correctly understand and reason about both the context of the sources and the context of the user's query.
Keywords: Corporate Household, Data Quality, Context Mediation
18.
Nazli Choucri, Massachusetts Institute of Technology (MIT); Stuart E. Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Allen Moulton, Massachusetts Institute of Technology (MIT); Michael Siegel, Massachusetts Institute of Technology (MIT) - Sloan School of Management; Hongwei (Harry) Zhu, Massachusetts Institute of Technology (MIT)
Posted: 05 Jan 05
Last Revised: 09 Feb 05
Downloads: 216 (39,433)
Abstract:
In its Preface, The 9/11 Commission Report states: "We learned that the institutions charged with protecting ... national security did not understand how grave this threat can be, and did not adjust their policies, plans, and practices to deter or defeat it" (2004: xvi). Given current realities and uncertainties, better preparedness can be achieved by identifying, controlling and managing the elusive linkages and situational factors that fuel hostilities. This paper focuses on new opportunities and capabilities provided by anticipatory technologies that help understand, measure and model the complex dynamics shaping and precipitating conflict in specific settings worldwide. We introduce a research initiative focusing on linking pre- and post-conflict by drawing upon the power of system dynamics, augmented by new technologies for integrated information analysis, in conjunction with the development of conceptual and computational ontologies capturing the diversity, intensity, and dynamics of the conflict domain.
national security, system dynamics, integrated information analysis, conceptual and computational ontologies
|
|
|
19.
|
|
|
Philip Tan Singapore-MIT Alliance Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Kian-Lee Tan National University of Singapore - School of Computing
|
| Posted: |
|
17 Aug 04
|
|
Last Revised:
|
|
16 Aug 05
|
|
205 (41,611)
|
2
|
|
| |
Abstract:
The COntext INterchange (COIN) strategy is an approach to solving the problem of interoperability of semantically heterogeneous data sources through context mediation. COIN has used its own notation and syntax for representing ontologies. More recently, the OWL Web Ontology Language has become established as the W3C-recommended ontology language. We propose the use of the COIN strategy to solve context disparity and ontology interoperability problems in the emerging Semantic Web - both at the ontology level and at the data level. In conjunction with this, we propose a version of the COIN ontology model that uses OWL and the emerging rules interchange language, RuleML.
COntext INterchange strategy, context mediation, context disparity and ontology interoperability problems
|
|
|
20.
|
|
|
Allen Moulton Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
14 Nov 01
|
|
Last Revised:
|
|
07 Jan 06
|
|
203 (42,010)
|
1
|
|
| |
Abstract:
We examine a knowledge representation architecture to support context interchange mediation. For autonomous receivers and sources sharing a common subject domain, the mediator's reasoning engine can devise query plans integrating multiple sources and resolving semantic heterogeneity. Receiver applications obtain the data they need in the form they need it without imposing changes on sources. The KR architecture includes: 1) data models for each source and receiver, 2) subject domain ontologies, containing abstract subject matter conceptualizations that would be known to experienced practitioners in the industry, and 3) context models for each source and receiver that explain how each source or receiver data model implements the abstract concepts from a subject domain ontology. Examples drawn from the fixed income securities industry illustrate problems and solutions enabled by the proposed architecture.
|
|
|
21.
|
|
|
Sajindra Jayasena Singapore-MIT Alliance Stéphane Bressan National University of Singapore (NUS) - School of Computing Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
17 Aug 04
|
|
Last Revised:
|
|
16 Aug 05
|
|
173 (49,326)
|
1
|
|
| |
Abstract:
There is no such thing as a World Wide Bank managing the central database of all possible financial activities. Such a concept makes neither technical nor business sense. Each player in the financial industry, each bank, stock exchange, government agency, or insurance company operates its own financial information system or systems. By its very nature, financial information, like the money that it represents, changes hands. Therefore the interoperation of financial information systems is the cornerstone of the financial services they support. E-services frameworks such as web services are an unprecedented opportunity for the flexible interoperation of financial systems. Naturally the critical economic role and the complexity of financial information led to the development of various standards. Yet standards alone are not the panacea: different groups of players use different standards or different interpretations of the same standard. We believe that the solution lies in the convergence of flexible E-services such as web-services and semantically rich meta-data as promised by the semantic Web; then a mediation architecture can be used for the documentation, identification, and resolution of semantic conflicts arising from the interoperation of heterogeneous financial services. In this paper we illustrate the nature of the problem in the Electronic Bill Presentment and Payment (EBPP) industry and the viability of the solution we propose. We describe and analyze the integration of services using four different formats: the IFX, OFX and SWIFT standards, and an example proprietary format. To accomplish this integration we use the COntext INterchange (COIN) framework. The COIN architecture leverages a model of sources and receivers' contexts in reference to a rich domain model or ontology for the description and resolution of semantic heterogeneity.
Electronic Bill Presentment, IFX, OFX, SWIFT, example proprietary format, COntext INterchange framework
|
|
|
22.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Richard Y. Wang Massachusetts Institute of Technology (MIT) Xiang Xian Massachusetts Institute of Technology (MIT) - Department of Electrical Engineering
|
| Posted: |
|
29 Oct 03
|
|
Last Revised:
|
|
29 Oct 03
|
|
167 (51,046)
|
6
|
|
| |
Abstract:
Advances in Corporate Householding are needed to address certain categories of data quality problems caused by data misinterpretation. In this paper, we first summarize some of these data quality problems and our more recent results from studying corporate householding applications and knowledge exploration. Then we outline a technical approach to a Corporate Householding Knowledge Processor (CHKP) to solve a particularly important type of corporate householding problem - entity aggregation. We illustrate the operation of the CHKP by using a motivational example in account consolidation. Our CHKP design and implementation uses and expands on the COntext INterchange (COIN) technology to manage and process corporate householding knowledge.
Corporate Household, Corporate Householding, Data Quality, Context Mediation, COntext INterchange, Enterprise Knowledge Management, Database Interoperability
|
|
|
23.
|
|
|
Aykut Firat Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Benjamin Grosof Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
06 Feb 03
|
|
Last Revised:
|
|
07 Apr 03
|
|
167 (51,046)
|
11
|
|
| |
Abstract:
While there are efforts to establish a single international accounting standard, there are strong current and future needs to handle heterogeneous accounting methods and systems. We advocate a context-based approach to dealing with multiple accounting standards and equational ontological conflicts. In this paper we first define what we mean by equational ontological conflicts and then describe a new approach, using Constraint Logic Programming and abductive reasoning, to reconcile such conflicts among disparate information systems. In particular, we focus on the use of Constraint Handling Rules as a simultaneous symbolic equation solver, which is a powerful way to combine, invert and simplify multiple conversion functions that translate between different contexts. Finally, we present a sample application, built on our prototype implementation, that demonstrates the viability of our approach.
Financial Information Integration, Equational Ontological Conflicts, Multiple Accounting Standards, Constraint Handling Rules
|
|
|
24.
|
|
|
Sajindra Jayasena Singapore-MIT Alliance Stéphane Bressan National University of Singapore (NUS) - School of Computing Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
20 Oct 05
|
|
Last Revised:
|
|
02 Feb 06
|
|
166 (51,337)
|
1
|
|
| |
Abstract:
There is no such monopoly as The World Wide Bank that manages the databases of all possible financial activities. Such a concept makes neither technical nor business sense. Each player in the financial industry, each bank, stock exchange, government agency, or insurance company, operates its own internal financial information systems. By its very nature, financial information, like the money that it represents, changes hands. Therefore the interoperation of financial information systems is the cornerstone of the financial services they support. Naturally the critical economic role and the complexity of financial information led to the development of standards for its management and interchange. Yet standards are not the panacea: different groups of players use different standards or versions of a standard's implementation. In this paper we illustrate the nature of the problem in the Electronic Bill Presentment and Payment industry. In particular, we describe and analyze the difficulty of the integration of services using four different formats: IFX, OFX and SWIFT standards, and an example proprietary format. We then propose an improved way to accomplish this integration using the COntext INterchange (COIN) framework.
COntext INterchange (COIN) framework, financial information databases
|
|
|
25.
|
|
|
Aykut Firat Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Frank Manola Independent Consultant
|
| Posted: |
|
25 May 05
|
|
Last Revised:
|
|
02 Sep 05
|
|
162 (52,564)
|
|
|
| |
Abstract:
This paper describes the coupling of contexts and ontologies for semantic integration in the ECOIN semantic interoperability framework. Ontological terms in ECOIN correspond to multiple related meanings in different contexts. Each ontology includes a context model that describes how a generic ontological term can be modified according to contextual choices to acquire specialized meanings. Although the basic ECOIN concepts have been presented in the past, this paper is the first to show how ECOIN addresses the case of "single-ontology with multiple contexts" with an example of semantic integration using our new prototype implementation.
single-ontology with multiple contexts
|
|
|
26.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Richard Y. Wang Massachusetts Institute of Technology (MIT) Wei X. Zhang University of Massachusetts at Boston - College of Management
|
| Posted: |
|
21 Mar 03
|
|
Last Revised:
|
|
28 Apr 03
|
|
159 (53,514)
|
2
|
|
| |
Abstract:
Previous research on corporate household and corporate householding has presented examples, literature review, and working definitions. In this paper, we first improve our understanding of the area by developing a typology of corporate householding tasks and knowledge requirements. We stress the importance of "context" in obtaining the appropriate answer for each task. Then we propose an approach to handle these tasks using Communities of Practice (COP) for knowledge acquisition and extensions to the COntext INterchange (COIN) technology for knowledge storage and knowledge processing.
Corporate Householding, COntext INterchange, Communities of Practice, Knowledge Management
|
|
|
27.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
20 Oct 05
|
|
Last Revised:
|
|
01 Feb 06
|
|
155 (54,796)
|
3
|
|
| |
Abstract:
Data quality issues have taken on increasing importance in recent years. In our research, we have discovered that many data quality problems are actually data misinterpretation problems - that is, problems caused by heterogeneous data semantics. In this paper, we first identify semantic heterogeneities that, when not resolved, often cause data quality problems. We discuss the especially challenging problem of aggregational ontological heterogeneity, which concerns how complex entities and their relationships are aggregated. Then we illustrate how COntext INterchange (COIN) technology can be used to capture data semantics and reconcile semantic heterogeneities, thereby improving data quality.
Data Quality, Data Semantics, Semantic Heterogeneity, Ontology, Context
|
|
|
28.
|
|
|
Weiguo Fan Virginia Polytechnic Institute & State University - Department of Accounting and Information Systems Hongjun Lu National University of Singapore (NUS) - School of Computing Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management David W. Cheung University of Hong Kong - Department of Computer Science and Information Systems
|
| Posted: |
|
06 Feb 03
|
|
Last Revised:
|
|
27 Feb 03
|
|
154 (55,125)
|
|
|
| |
Abstract:
The successful integration of data from autonomous and heterogeneous systems calls for the resolution of semantic conflicts that may be present. Such conflicts are often reflected by discrepancies in attribute values of the same data object. In this paper, we describe a recently developed prototype system, DIRECT (DIscovering and REconciling ConflicTs). The system mines data value conversion rules in the process of integrating business data from multiple sources. The system architecture and functional modules are described. The process of discovering conversion rules from sales data of a trading company is presented as an illustrative example.
Data Integration, Data Mining, Semantic Conflicts, Data Visualization, Statistical Analysis, Data Value Conversion
|
|
|
29.
|
|
|
Nazli Choucri Massachusetts Institute of Technology (MIT) Daniel Goldsmith MIT Center for Digital Business Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Dinsha Mistree Massachusetts Institute of Technology (MIT) - Sloan School of Management J. Bradley Morrison Independent Author Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
04 Sep 07
|
|
Last Revised:
|
|
04 Sep 07
|
|
152 (55,825)
|
|
|
| |
Abstract:
The world can be complex and dangerous - the loss of state stability of countries is of increasing concern. Although every case is unique, there are important common processes. We have developed a system dynamics model of state stability based on an extensive review of the literature and debriefings of subject matter experts. We represent the nature and dynamics of the 'loads' generated by insurgency activities, on the one hand, and the core features of state resilience and its 'capacity' to withstand these 'loads', on the other. The challenge is to determine when threats to stability override the resilience of the state and, more important, to anticipate conditions under which small additional changes in anti-regime activity can generate major disruptions. With these insights, we can identify appropriate and actionable mitigation factors to decrease the likelihood of radical shifts in behavior and enhance prospects for stability.
model, system dynamics, state stability, terrorists, insurgency, regime legitimacy
|
|
|
30.
|
|
|
Nazli Choucri Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management Richard Y. Wang Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
10 May 04
|
|
Last Revised:
|
|
26 May 06
|
|
146 (57,992)
|
|
|
| |
Abstract:
A recent National Research Council study found that: "Although there are many private and public databases that contain information potentially relevant to counter terrorism programs, they lack the necessary context definitions (i.e., metadata) and access tools to enable interoperation with other databases and the extraction of meaningful and timely information" [NRC02, p.304, emphasis added]. That sentence succinctly describes the objectives of this project. Improved access and use of information are essential to better identify and anticipate threats, protect against and respond to threats, and enhance national and homeland security (NHS), as well as other national priority areas, such as Economic Prosperity and a Vibrant Civil Society (ECS) and Advances in Science and Engineering (ASE). This project focuses on the creation and contributions of a Laboratory for Information Globalization and Harmonization Technologies (LIGHT) with two interrelated goals: (1) Theory and Technologies: To research, design, develop, test, and implement theory and technologies for improving the reliability, quality, and responsiveness of automated mechanisms for reasoning and resolving semantic differences that hinder the rapid and effective integration (int) of systems and data (dmc) across multiple autonomous sources, and the use of that information by public and private agencies involved in national and homeland security and the other national priority areas involving complex and interdependent social systems (soc). This work builds on our research on the COntext INterchange (COIN) project, which focused on the integration of diverse distributed heterogeneous information sources using ontologies, databases, context mediation algorithms, and wrapper technologies to overcome information representational conflicts. The COIN approach makes it substantially easier and more transparent for individual receivers (e.g., applications, users) to access and exploit distributed sources.
Receivers specify their desired context to reduce ambiguities in the interpretation of information coming from heterogeneous sources. This approach significantly reduces the overhead involved in the integration of multiple sources, improves data quality, increases the speed of integration, and simplifies maintenance in an environment of changing source and receiver context - which will lead to an effective and novel distributed information grid infrastructure. This research also builds on our Global System for Sustainable Development (GSSD), an Internet platform for information generation, provision, and integration of multiple domains, regions, languages, and epistemologies relevant to international relations and national security. (2) National Priority Studies: To experiment with and test the developed theory and technologies on practical problems of data integration in national priority areas. Particular focus will be on national and homeland security, including data sources about conflict and war, modes of instability and threat, international and regional demographic, economic, and military statistics, money flows, and contextualizing terrorism defense and response. Although LIGHT will leverage the results of our successful prior research projects, this will be the first research effort to simultaneously and effectively address ontological and temporal information conflicts as well as dramatically enhance information quality. Addressing problems of national priorities in such rapidly changing complex environments requires extraction of observations from disparate sources, using different interpretations, at different points in time, for different purposes, with different biases, and for a wide range of different uses and users. This research will focus on integrating information both over individual domains and across multiple domains.
Another innovation is the concept and implementation of Collaborative Domain Spaces (CDS), within which applications in a common domain can share, analyze, modify, and develop information. Applications also can span multiple domains via Linked CDSs. The PIs have considerable experience with these research areas and the organization and management of such large scale international and diverse research projects. The PIs come from three different Schools at MIT: Management, Engineering, and Humanities, Arts & Social Sciences. The faculty and graduate students come from about a dozen nationalities and diverse ethnic, racial, and religious backgrounds. The currently identified external collaborators come from over 20 different organizations and many different countries, industrial as well as developing. Specific efforts are proposed to engage even more women, underrepresented minorities, and persons with disabilities. The anticipated results apply to any complex domain that relies on heterogeneous distributed data to address and resolve compelling problems. This initiative is supported by international collaborators from (a) scientific and research institutions, (b) business and industry, and (c) national and international agencies. Research products include: a System for Harmonized Information Processing (SHIP), a software platform, and diverse applications in research and education which are anticipated to significantly impact the way complex organizations, and society in general, understand and manage critical challenges in NHS, ECS, and ASE.
Homeland Security, Information Globalization and Harmonization Technologies
|
|
|
31.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
06 Feb 03
|
|
Last Revised:
|
|
06 Jan 06
|
|
145 (58,358)
|
7
|
|
| |
Abstract:
Web aggregation has been available regionally for several years, but this service has not been offered globally. As an example, using multiple regional comparison aggregators, we analyze the global prices for a Sony camcorder, which differ by more than a factor of three. We further explain that the lack of global comparison aggregation services partially contributes to such huge price dispersion. We also discuss difficulties encountered in the manual integration of global web sources. Motivated by this example, we propose a context mediation architecture for global aggregation to address semantic disparities of global information sources. Global aggregation services can bring efficiency to the global market and can be useful for market research and other business uses.
Web Aggregation, Context, Semantic Integration
|
|
|
32.
|
|
|
Allen Moulton Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
06 Feb 03
|
|
Last Revised:
|
|
06 Feb 03
|
|
143 (59,080)
|
|
|
| |
Abstract:
We examine semantic interoperability problems in the fixed income securities industry and propose a knowledge representation architecture for context interchange mediation to support dynamic integration of autonomous database, web, and procedural sources of information. For sources and receivers sharing a common subject domain, the mediator's reasoning engine can devise query plans integrating multiple sources and resolving semantic heterogeneity. Receiver applications can obtain the data they need in the form they need it without imposing changes on sources. The architecture includes: 1) data models for each source and receiver, 2) subject ontologies, containing abstract subject matter conceptualizations that would be known to experienced practitioners in the industry, and 3) context models for each source and receiver that explain how each data model implements the abstract concepts from a subject ontology.
Semantic Interoperability Problems, Fixed Income Securities, Context Interchange Mediation
|
|
|
33.
|
|
|
Allen Moulton Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
06 Feb 03
|
|
Last Revised:
|
|
31 Mar 03
|
|
139 (60,599)
|
|
|
| |
Abstract:
We examine a knowledge representation architecture to support context interchange mediation. For autonomous receivers and sources sharing a common subject domain, the mediator's reasoning engine can devise query plans integrating multiple sources and resolving semantic heterogeneity. Receiver applications obtain the data they need in the form they need it without imposing changes on sources. The KR architecture includes: 1) data models for each source and receiver, 2) subject domain ontologies, containing abstract subject matter conceptualizations that would be known to experienced practitioners in the industry, and 3) context models for each source and receiver that explain how each source or receiver data model implements the abstract concepts from a subject domain ontology. Examples drawn from the fixed income securities industry illustrate problems and solutions enabled by the proposed architecture.
Knowledge Representation Architecture, Context Interchange Mediation
|
|
|
34.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
06 Jan 03
|
|
Last Revised:
|
|
07 Jan 06
|
|
135 (62,127)
|
1
|
|
| |
Abstract:
The development of web technology has led to the emergence of web aggregation, a service that collects existing web data and turns them into more useful information. We review the development of both comparison and relationship aggregation and discuss their impacts on various stakeholders. The aggregator's capability of transparently extracting web data has raised challenging issues in database and privacy protection. Consequently, new regulations have been introduced or are being proposed. We analyze the interactions between aggregation and related policies and provide our insights about the implications of new policies on the development of web aggregation.
International IP Law, Privacy Law, Web Aggregation
|
|
|
35.
|
|
|
Nazli Choucri Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
12 May 05
|
|
Last Revised:
|
|
27 Jan 06
|
|
126 (65,845)
|
|
|
| |
Abstract:
In its Preface, The 9/11 Commission Report states: "We learned that the institutions charged with protecting . . . national security did not understand how grave this threat can be, and did not adjust their policies, plans, and practices to deter or defeat it" (2004: xvi). Given current realities and uncertainties, better preparedness can be achieved by identifying, controlling and managing the elusive linkages and situational factors that impact state stability and fuel state decay and destruction - and hence create new threats to the nation's security. We propose to focus on the use of system dynamics modeling techniques to help understand, measure and model the complex dynamics shaping state stability, initially for two regions. We will specifically consider the impacts of unanticipated disruptions, such as a tsunami and its aftermath, on the dynamics of the two regions. For each region, we will deliver a detailed country model, including 3-5 futures predictions in the 6-12 month range along with an analysis of conditions and causal links between predicted futures plus corresponding mitigation options.
system dynamics, state stability
|
|
|
36.
|
|
|
Thomas Gannon MITRE Corporation Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Allen Moulton Massachusetts Institute of Technology (MIT) Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management Marwan Sabbouh MITRE Corporation Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
12 May 05
|
|
Last Revised:
|
|
02 Sep 05
|
|
126 (65,845)
|
|
|
| |
Abstract:
There is a pressing need for effectively integrating information from an ever increasing number of available sources both on the web and in other existing systems. A key difficulty in achieving this goal comes from the pervasive heterogeneities in all levels of information systems. Existing and emerging technologies, such as the Web, ODBC, XML, and Web Services, provide essential capabilities in resolving heterogeneities in the hardware and software platforms, but they do not address the semantic heterogeneity of the data itself. A robust solution to this problem needs to be adaptable, extensible, and scalable. In this paper, we identify the deficiencies of traditional approaches that address this problem using hand-coded programs or require complete data standardization. The COntext INterchange (COIN) approach overcomes these deficiencies by declaratively representing data semantics and using a mediator to create the necessary conversion programs using a small number of conversion rules. The capabilities of COIN are demonstrated using an intelligence information integration example consisting of 150 data sources, where COIN can automatically generate the more than 22,000 conversion programs needed to enable semantic integration using only six parameterizable conversion rules. This paper makes a unique contribution by providing a systematic evaluation of COIN and other commonly practiced approaches.
semantic integration, adaptability, extensibility, scalability, context
|
|
|
37.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Allen Moulton Massachusetts Institute of Technology (MIT) Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
12 May 05
|
|
Last Revised:
|
|
02 Sep 05
|
|
119 (69,003)
|
|
|
| |
Abstract:
In this report, we demonstrate the applicability and value of the context mediation approach in facilitating the effective and correct use of counter-terrorism intelligence information coming from diverse heterogeneous sources.
Context Mediation, Counter-Terrorism Intelligence
|
|
|
38.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
28 Aug 06
|
|
Last Revised:
|
|
29 Aug 06
|
|
116 (70,438)
|
1
|
|
| |
Abstract:
The availability of data on the web and the improvement of technologies have made it increasingly easy to reuse existing data to create new databases and provide value-added services. Meanwhile, initial database creators have been seeking legal protection for their data. After presenting a brief history of legislation related to legal protection for non-copyrightable database contents, we discuss challenging issues to be considered in formulating a database protection regulation. These issues can be addressed from the perspective of economics. Results from a preliminary economic analysis are presented. The findings indicate that depending on investment required to create the initial database and the level of differentiation between the initial database and the reuser database, the choice of a social welfare-enhancing regulation can allow for no reuse, free reuse, or fee-paying reuse.
database protection, data reuse, economic analysis
|
|
|
39.
|
|
|
Hongjun Lu National University of Singapore (NUS) - School of Computing Weiguo Fan Virginia Polytechnic Institute & State University - Department of Accounting and Information Systems Cheng Hian Goh National University of Singapore (NUS) - School of Computing Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management David W. Cheung University of Hong Kong - Department of Computer Science and Information Systems
|
| Posted: |
|
25 Oct 02
|
|
Last Revised:
|
|
09 Dec 02
|
|
113 (71,984)
|
1
|
|
| |
Abstract:
The integration of data from autonomous and heterogeneous sources calls for the prior identification and resolution of semantic conflicts that may be present. Unfortunately, this requires the system integrator to sift through the data from disparate systems in a painstaking manner. In this paper, we suggest that this process can be (at least) partially automated by presenting a methodology and techniques for the discovery of potential semantic conflicts as well as the underlying data transformation needed to resolve the conflicts. Our methodology begins by classifying data value conflicts into two categories: context independent and context dependent. While context independent conflicts are usually caused by unexpected errors, the context dependent conflicts are primarily a result of the heterogeneity of underlying data sources. To facilitate data integration, data value conversion rules are proposed to describe the quantitative relationships among data values involving context dependent conflicts. A general approach is proposed to discover data value conversion rules from the data. The approach consists of five major steps: relevant attribute analysis, candidate model selection, conversion function generation, conversion function selection and conversion rule formation. It is being implemented in a prototype system, DIRECT, for business data using statistics based techniques. Preliminary study indicated that the proposed approach is promising.
Value Conflicts, Data Integration, Autonomous Database Systems
|
|
|
40.
|
|
|
Aykut Firat Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Benjamin Grosof Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
29 Oct 04
|
|
Last Revised:
|
|
07 Jan 06
|
|
111 (73,020)
|
|
|
| |
Abstract:
Virtually all of the existing approaches to ontology integration assume that each of the individual ontologies (and the integrated ontology) corresponds to a single set of semantics at a given time. We first claim that this single integrated view assumption is unnecessarily restrictive, and defend the view that ontologies can simultaneously accommodate multiple integrated views provided the accompaniment of contexts - a set of axioms on the interpretation of data allowing local variations in representation and nuances in meaning, and a conversion function network between contexts to reconcile contextual differences. Then, we propose an ontology integration methodology based on the alignment of contexts and linking conversion function networks defined between contexts. The flexibility of our approach and methodology is illustrated with the alignment of air travel and car rental domains, an actual example from our prototype implementation.
ontology integration
|
|
|
41.
|
|
|
Wee Horng Ang Massachusetts Institute of Technology (MIT) Yang Lee Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Dinsha Mistree Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management Diane M. Strong Worcester Polytechnic Institute (WPI) Richard Y. Wang Massachusetts Institute of Technology (MIT) Chrisy Yao Suffolk University
|
| Posted: |
|
28 Aug 06
|
|
Last Revised:
|
|
13 Jul 08
|
|
110 (73,512)
|
|
|
| |
Abstract:
In this paper we redefine information security by extending its definition in three salient avenues: locale (beyond the boundary of an enterprise to include partner organizations), role (beyond the information custodians' view to include information consumers' and managers' views), and resource (beyond technical dimensions to include managerial dimensions). Based on our definition, we develop a model of information security, which we call the House of Security. This model has eight constructs, Vulnerability, Accessibility, Confidentiality, IT Resources for Security, Financial Resources for Security, Business Strategy for Security, Security Policy and Procedures, and Security Culture. We have developed a questionnaire to measure the assessment and importance of information security along these eight aspects. The questionnaire covers multiple locales and questionnaire respondents cover multiple roles. Data collection is currently in process. Results from our analysis of the collected data will be ready for presentation at the conference.
Information security, Security vulnerabilities, Information confidentiality, Security policy, Security procedures, Security culture
|
|
|
42.
|
|
|
Nazli Choucri Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Allen Moulton Massachusetts Institute of Technology (MIT) Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
13 Apr 04
|
|
Last Revised:
|
|
29 Dec 04
|
|
102 (77,843)
|
1
|
|
| |
Abstract:
The National Research Council has noted that although there are many private and public databases that contain information potentially relevant to counterterrorism programs, they lack the necessary context definitions (i.e., metadata) and access tools to enable interoperation with other databases and the extraction of meaningful and timely information. In this paper we present examples of these problems and a technology developed at MIT, called context mediation, which provides a novel approach for addressing these problems.
context mediation, heterogeneous contexts
|
|
|
43.
|
|
|
Aykut Firat Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Nor Adnan Yahaya Malaysia University of Science and Technology Choo Wai Kuan Malaysia University of Science and Technology Stéphane Bressan National University of Singapore (NUS) - School of Computing
|
| Posted: |
|
28 Jul 05
|
|
Last Revised:
|
|
02 Sep 05
|
|
100 (78,944)
|
2
|
|
| |
Abstract:
Cameleon# is a web data extraction and management tool that provides information aggregation with advanced capabilities that are useful for developing value-added applications and services for electronic business and electronic commerce. To illustrate its features, we use an airfare aggregation example that collects data from eight online sites, including Travelocity, Orbitz, and Expedia. This paper covers the integration of Cameleon# with commercial database management systems, such as MS SQL Server, and XML query languages, such as XQuery.
Cameleon#, web data extraction, web data management
|
|
|
44.
|
|
|
Allen Moulton Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
06 Jan 03
|
|
Last Revised:
|
|
06 Jan 03
|
|
97 (80,684)
|
5
|
|
| |
Abstract:
Using securities industry examples, the context interchange mediation knowledge architecture is applied to interoperability problems for enumerated data types, such as codes and other symbols used to represent conceptual distinctions. Ongoing efforts in the securities industry to develop new XML-based standards for information interchange are examined. Using components representing similar securities information, drawn from different but complementary securities standards and sources, example problems of information interoperability are examined. We show that transforming data representation into an autonomously specified context model and thence into a general domain ontology allows successful interoperability in several ways depending on how each context is explained to the mediator.
Semantic Interoperability, Securities, Semantic Differences, Data Types, XML, Enumerated Data Types
|
|
|
45.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
18 Jan 06
|
|
Last Revised:
|
|
04 May 06
|
|
96 (81,276)
|
1
|
|
| |
Abstract:
With the increasing use of the Internet, many of us feel strongly about the free and unfettered exchange and use of information. But the actual situation is not that simple. After the European Union adopted the Database Directive to provide legal protection for non-copyrightable database contents, the U.S. has introduced six legislative proposals, all of which failed to become law. One of the major difficulties of formulating a socially beneficial database law is in finding the right balance between protecting the incentives of creating publicly accessible databases (including semi-structured web sites) and preserving adequate access to factual data for value creating activities. We address the problem by developing an extended spatial competition model that explicitly considers the inefficiencies in policy administration. With the model, we can determine various conditions and the corresponding socially beneficial policy choices. The results show that, depending on the cost level of database creation, the degree of differentiation of the reuser database, and the efficiency of policy administration, the socially beneficial policy choice can be protecting a legal monopoly, encouraging competition via compulsory licensing, discouraging voluntary licensing, or even allowing free riding. The results provide useful insights into the formulation of a socially beneficial database protection policy.
database protection, data reuse, policy, intellectual property
|
|
|
46.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
13 Apr 04
|
|
Last Revised:
|
|
07 Jan 06
|
|
95 (81,925)
|
2
|
|
| |
Abstract:
The change in meaning of data over time poses significant challenges for the use of that data. These challenges exist in the use of an individual data source and are further compounded with the integration of multiple sources. In this paper, we identify three types of temporal semantic heterogeneities, which have not been addressed by existing research. We propose a solution that is based on extensions to the Context Interchange framework. This approach provides mechanisms for capturing semantics using ontology and temporal context. It also provides a mediation service that automatically resolves semantic conflicts. We show the feasibility of this approach by demonstrating a prototype that implements a subset of the proposed extensions.
Context Interchange framework, ontology and temporal context
|
|
|
47.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
19 Dec 07
|
|
Last Revised:
|
|
01 Jun 08
|
|
94 (82,529)
|
|
|
| |
Abstract:
Sell Globally and Shop Globally have been seen as a potential benefit of web-enabled electronic business. One important step toward realizing this benefit is to know how things are selling in various parts of the world. A global price comparison service would address this need. But there have not been many such services. In this paper, we use a case study of global price dispersion to illustrate the need and the value of a global price comparison service. Then we identify and discuss several technology challenges, including semantic heterogeneity, in providing a global price comparison service. We propose a mediation architecture to address the semantic heterogeneity problem, and demonstrate the feasibility of the proposed architecture by implementing a prototype that enables global price comparison using data from web sources in several countries.
Global Price Comparison, Shopbots, Context, Semantic Data
|
|
|
48.
|
|
|
Nazli Choucri Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management Richard Y. Wang Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
18 Nov 03
|
|
Last Revised:
|
|
18 Nov 03
|
|
89 (85,788)
|
|
|
| |
Abstract:
Three important trends - unrelenting globalization, growing worldwide electronic connectivity, and increasing knowledge intensity of economic activity - are creating new opportunities for global politics, with challenging demands for information access, interpretation, provision and overall use. This has serious implications for two diverse domains of scholarship: Information Technology (IT) and International Relations (IR) in political science. Unless IT advances remain "one step ahead" of such realities and complexities, strategies for better understanding and responding to emergent global challenges will be severely impeded. For example, the new Department of Homeland Security will rely on intelligence information from all over the world to develop strategic responses to a wide range of security threats. However, relevant information is stored throughout the world and by diverse agencies and in different media, formats, quality, and contexts. Intelligent integration of that information and improved modes of access and use are critical to developing policies designed to identify and anticipate sources of threat, to strengthen protection against threats on the United States, and to enhance the security of the nation.
|
|
|
49.
|
|
|
Subhash Bhalla The University of Aizu - Database Systems Laboratory Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
01 Apr 03
|
|
Last Revised:
|
|
07 Jan 06
|
|
84 (89,133)
|
|
|
| |
Abstract:
A possibility of a temporary disconnection of database service exists in many computing environments. It is a common need to permit a participating site to lag behind and re-initialize to full recovery. It is also necessary that active transactions view a globally consistent system state for ongoing operations. We present an algorithm for on-the-fly backup and site-initialization. The technique is non-blocking in the sense that failure and recovery procedures do not interfere with ordinary transactions. As a result the system can tolerate disconnection of services and reconnection of disconnected services, without incurring high overheads.
Concurrency Control, Distributed Algorithms, Distributed Databases, Non-blocking Protocols, Serializability
|
|
|
50.
|
|
|
Subhash Bhalla The University of Aizu - Database Systems Laboratory Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
13 Nov 02
|
|
Last Revised:
|
|
07 Jan 06
|
|
82 (90,563)
|
|
|
| |
Abstract:
To recover from media failures, a database is 'restored' from an earlier backup copy. A recovery log of transactions is used to roll forward from the backup version to the desired time (the current time). High availability requires that the backup copying be fast, and be in parallel with on-going update activity. It also necessitates frequently obtaining a consistent copy of an entire database. Such concurrent generation of a database copy interferes with system activity. It introduces blocking and delays for many update transactions. We propose an algorithm that reads current database entities without interference with update activity. The algorithm is simple to implement as compared with previous proposals. It assigns a color to each entity read by the global-read. Normal transactions commit by declaring a color for the committed updates. Subsequently, these markings are used for generation of a consistent copy of the entire database.
Concurrency Control Algorithms, Database Recovery, Global-read Transactions, Locking, Long-lived Transactions, Read-only Transactions, Synchronization, Transaction Processing
|
|
|
51.
|
|
|
John Lyneis Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management MIT Sloan Working Papers Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
27 Aug 08
|
|
Last Revised:
|
|
27 Aug 08
|
|
78 (93,426)
|
|
|
| |
Abstract:
Research has approached the topic of safety in organizations from a number of different perspectives. On the one hand, psychological research on safety climate gives evidence for a range of organizational factors that predict safety across organizations. On the other hand, organizational learning theorists view safety as a dynamic problem in which organizations must learn from mistakes. Here, we synthesize these two streams of research by incorporating key organizational factors from the safety climate literature into a dynamic simulation model that also includes the possibility for learning. Analysis of simulation results sheds insight into the nature of reliability and confirms the dangers of over-reliance on 'single loop learning' as a mechanism for controlling safety behaviors. Special emphasis is placed on strategies that managers might use to encourage learning and prevent erosion in safety behaviors over time.
safety, simulations, models
|
|
|
52.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Wee Horng Ang Massachusetts Institute of Technology (MIT) Yang Lee Massachusetts Institute of Technology (MIT) Dinsha Mistree Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management Diane M. Strong Worcester Polytechnic Institute (WPI) Richard Y. Wang Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
11 Sep 07
|
|
Last Revised:
|
|
13 Jul 08
|
|
78 (93,426)
|
|
|
| |
Abstract:
In this paper we introduce a methodology for analyzing differences regarding security perceptions within and between stakeholders, and the elements which affect these perceptions. We have designed the "House of Security", a security assessment model that provides the basic framework for considering eight different constructs of security: Vulnerability, Accessibility, Confidentiality, Technology Resources for Security, Financial Resources for Security, Business Strategy for Security, Security Policy and Procedures, and Security Culture. We designed and performed a survey of about 1500 professionals in various industries, levels, and functions resulting in a gap analysis to uncover differences (1) between the different constructs and aspects of security, (2) between different enterprise stakeholder roles, and (3) between different organizations. This paper briefly describes the development of the security constructs and some of the preliminary findings.
Security Assessment, Business Strategy for Security, Security Policy
|
|
|
53.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
29 Oct 04
|
|
Last Revised:
|
|
29 Oct 04
|
|
77 (94,237)
|
9
|
|
| |
Abstract:
Many online services access a large number of autonomous data sources and at the same time need to meet different user requirements. It is essential for these services to achieve semantic interoperability among these information exchange entities. In the presence of an increasing number of proprietary business processes, heterogeneous data standards, and diverse user requirements, it is critical that the services are implemented using adaptable, extensible, and scalable technology. The Context Interchange (COIN) approach, inspired by similar goals of the Semantic Web, provides a robust solution. In this paper, we describe how COIN can be used to implement dynamic online services where semantic differences are reconciled on the fly. We show that COIN is flexible and scalable by comparing it with several conventional approaches. With a given ontology, the number of conversions in COIN is quadratic in the semantic aspect that has the largest number of distinctions. These semantic aspects are modeled as modifiers in a conceptual ontology; in most cases the number of conversions is linear in the number of modifiers, which is significantly smaller than in the traditional hard-wiring middleware approach, where the number of conversion programs is quadratic in the number of sources and data receivers. In the example scenario in the paper, the COIN approach needs only 5 conversions to be defined while traditional approaches require 20,000 to 100 million. COIN achieves this scalability by automatically composing all the comprehensive conversions from a small number of declaratively defined sub-conversions.
ontology, semantics, scalability, data integration, heterogeneous sources
|
|
|
54.
|
|
|
Georgeta Vidican-Sgouridis Masdar Institute of Science and Technology (MIST) Wei Lee Woon Masdar Institute of Science and Technology (MIST) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
17 Apr 09
|
|
Last Revised:
|
|
17 Apr 09
|
|
73 (97,439)
|
|
|
| |
Abstract:
In this paper, we use feature extraction and data analysis techniques for the elucidation of patterns and trends in technological innovation. In studying innovation, we focus on the role of public research institutions (research universities and national laboratories) in the development of new industries. More specifically, we are interested in measuring innovation through research collaborations between these institutions and the private sector. The proposed methods are primarily drawn from the field of bibliometrics – i.e. the analysis of information and trends in the publication of text documents, rather than the contents of these documents. In particular, we seek to explore the relationship between joint publication patterns and trends, R&D funding, technology development choices, and the viability and effectiveness of industry-university collaborations. To focus the discussions and to provide concrete examples of their applicability, this study will have an initial emphasis on the solar photovoltaic (PV) sector in the U.S., though the techniques and general approach devised here will be applicable to a broad range of industries, situations, and locations. Our analysis suggests that interesting information and conclusions can be derived from this line of analysis. The results obtained using our data extraction techniques allow us to identify early technology focus in different areas within solar PV technologies, and to determine potential technology pathways, which is critical for innovation policy in the renewable energy domain.
bibliometrics, photoelectric, innovation
|
|
|
55.
|
|
|
Wei Lee Woon Masdar Institute of Science and Technology (MIST) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
27 Aug 08
|
|
Last Revised:
|
|
27 Aug 08
|
|
70 (100,002)
|
1
|
|
| |
Abstract:
This paper presents a novel approach to the visualization and subsequent elucidation of research domains in science and technology. The proposed methodology is based on the use of bibliometrics; i.e., analysis is conducted using information regarding trends and patterns of publication rather than the contents of these publications. In particular, we explore the use of term co-occurrence frequencies as an indicator of the semantic closeness between pairs of words or phrases. To demonstrate the utility of this approach, a case study on renewable energy technologies is conducted, where the above techniques are used to visualize the interrelationships within a collection of energy-related keywords. As these are regarded as manifestations of the underlying research topics, we contend that the proposed visualizations can be interpreted as representations of the underlying technology landscape. These techniques have many potential applications, but one interesting challenge in which we are particularly interested is the mapping and subsequent prediction of future developments in the technological fields being studied.
landscape visualization
|
|
|
56.
|
|
|
Aykut Firat Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Benjamin Grosof Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
18 Jan 06
|
|
Last Revised:
|
|
18 Jan 06
|
|
70 (100,002)
|
|
|
| |
Abstract:
The prospect of combining information from diverse sources for superior decision making is plagued by the challenge of semantic heterogeneity, as data sources often adopt different conventions and interpretations when there is no coordination. An emerging solution in information integration is to develop an ontology as a standard data model for a domain of interest, and then to define the correspondences between the data sources and this common model to eliminate their semantic heterogeneity and produce a single integrated view of the data sources. We first claim that this single integrated view approach is unnecessarily restrictive, and instead offer the view that ontologies can simultaneously accommodate multiple integrated views provided the accompaniment of contexts, a set of axioms on the interpretation of data allowing local variations in representation and nuances in meaning, and a conversion function network between contexts to reconcile contextual differences. Then, we illustrate how to achieve semantic interoperability between multiple ontology-based applications. During this process, application ontologies are aligned through the reconciliation of their context models, and a new application with a virtual merged ontology is created. We illustrate this alternative approach with the alignment of air travel and car rental domains, an actual example from our prototype implementation.
Intelligent Information Integration, Query Rewriting, Ontology Merging
|
|
|
57.
|
|
|
Nazli Choucri Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
26 May 06
|
|
Last Revised:
|
|
13 Jun 06
|
|
69 (100,840)
|
|
|
| |
Abstract:
In the aftermath of the 9/11 tragedy it has become clear that the lack of effective information exchange among government agencies hindered the capability of identifying potential threats and preventing terrorism actions. It has been noted by the National Research Council that although there are many private and public databases that contain information potentially relevant to counterterrorism programs, they lack the necessary context definitions (i.e., metadata) and access tools to enable interoperation with other databases and the extraction of meaningful and timely information. This report clearly recognized the important problem that the semantic data integration research community has been studying. In this chapter, we describe the Laboratory for Information Globalization and Harmonization Technologies (LIGHT) developed at MIT. LIGHT arises from previous research, most notably the COntext INterchange (COIN) context mediation technology and the Global System for Sustainable Development (GSSD). Context Mediation technology addresses the above problem and deals directly with the integration of heterogeneous contexts (i.e. data meaning) in a flexible, scalable and extensible environment. This approach makes it easier and more transparent for receivers (e.g., applications, sensors, users) to exploit distributed sources (e.g., databases, web, information repositories, sensors). In this paper we define context as the assumptions of the source and receiver that affect correct interpretation of the meaning of the information. Receivers are able to specify their desired context so that there will be no uncertainty in the interpretation of the information coming from heterogeneous sources. The COIN context knowledge representation approach and associated reasoning tools significantly reduce the overhead involved in the integration of multiple sources and simplify maintenance in an environment of changing source and receiver context.
This technology is essential in the counter-terrorism environment in a number of areas including: (1) allowing for receivers (i.e., applications, analysts) to have multiple views of the same data (e.g., different semantic assumptions - two analysts may have a different meaning for Soviet Union depending on the application), (2) allowing for the collection of information into a single data warehouse, and (3) use in a dynamic federated environment where applications may have changing contexts and sources are added and removed from the grid. This approach is essential to the agile integration of information to support counter terrorism.
Homeland Security, Context Knowledge Representation, Reasoning Technologies
|
|
|
58.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
28 Aug 06
|
|
Last Revised:
|
|
18 Oct 06
|
|
68 (101,719)
|
2
|
|
| |
Abstract:
There are many different kinds of ontologies used for different purposes in modern computing. Lightweight ontologies are easy to create, but difficult to deploy; formal ontologies are relatively easy to deploy, but difficult to create. This paper presents an approach that combines the strengths and avoids the weaknesses of lightweight and formal ontologies. In this approach, the ontology includes only high level concepts; subtle differences in the interpretation of the concepts are captured as context descriptions outside the ontology. The resulting ontology is simple, thus it is easy to create. The context descriptions facilitate data conversion composition, which leads to a scalable solution to semantic interoperability among disparate data sources and contexts.
lightweight ontology, context, mediation, scalability
|
|
|
59.
|
|
|
Nicolas Prat ESSEC Business School Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
19 Dec 07
|
|
Last Revised:
|
|
19 Dec 07
|
|
67 (102,585)
|
1
|
|
| |
Abstract:
Data quality is crucial for operational efficiency and sound decision making. This paper focuses on believability, a major aspect of quality, measured along three dimensions: trustworthiness, reasonableness, and temporality. We ground our approach on provenance, i.e. the origin and subsequent processing history of data. We present our provenance model and our approach for computing believability based on provenance metadata. The approach is structured into three increasingly complex building blocks: (1) definition of metrics for assessing the believability of data sources, (2) definition of metrics for assessing the believability of data resulting from one process run and (3) assessment of believability based on all the sources and processing history of data. We illustrate our approach with a scenario based on Internet data. To our knowledge, this is the first work to develop a precise approach to measuring data believability and making explicit use of provenance-based measurements.
data quality, provenance metadata
|
|
|
60.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Steven Y. Tu Soochow University
|
| Posted: |
|
02 Oct 01
|
|
Last Revised:
|
|
03 Oct 01
|
|
66 (103,490)
|
|
|
| |
Abstract:
Source selection allows the users to express what they want while the system automatically performs the identification and selection of relevant sources to answer the query request. To automate that process, the system must be able to represent the contents of data sources in a description language. Descriptions of source contents can be characterized by the two concepts of scope and size. This paper builds upon and extends the concept language, description logic (DL), to propose a novel representation system to achieve that goal. We point out that there are technical barriers within description logic limiting the types of data sources that can be represented. Specifically, we show that (1) DL is awkward in representing sufficient conditions, and (2) DL can describe properties of a concept itself only in the case of existential quantification. These barriers limit expressions of size information in source descriptions and thus cause us to extend DL with the notion of generalized quantifiers to make them inter-operable with traditional logic. The proposed formalism integrates the nice features of generalized quantifiers into description logic, and hence achieves more expressive power than previous representation systems based purely on description logic. It is also shown that the proposed language preserves those mathematical properties that traditional logic-based formalisms are known to hold.
|
|
|
61.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Mihai Lupu affiliation not provided to SSRN
|
| Posted: |
|
17 Dec 07
|
|
Last Revised:
|
|
17 Dec 07
|
|
65 (104,389)
|
1
|
|
| |
Abstract:
The Context Interchange Strategy (COIN) is an approach to solving the problem of interoperability of semantically heterogeneous data sources through context mediation. The existing implementation of COIN uses its own notation and syntax for representing ontologies. More recently, the OWL Web Ontology Language is becoming established as the W3C recommended ontology language. A bridge is needed between these two areas and an explanation on how each of the two approaches can learn from each other. We propose the use of the COIN strategy to solve context disparity and ontology interoperability problems in the emerging Semantic Web both at the ontology level and at the data level. In this work we showcase how the problems that arise from context-dependent representation of facts can be mitigated by Semantic Web techniques, as tools of the conceptual framework developed over 15 years of COIN research.
Web Ontology Language, Semantic Web
|
|
|
62.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
17 Aug 04
|
|
Last Revised:
|
|
16 Aug 05
|
|
65 (104,389)
|
2
|
|
| |
Abstract:
Changes of semantics in data sources further complicate the semantic heterogeneity problem. We identify four types of semantic heterogeneities related to changing semantics and present a solution based on an extension to the Context Interchange (COIN) framework. Changing semantics is represented as multi-valued contextual attributes in a shared ontology; however, only a single value is valid over a certain time interval. A mediator, implemented in abductive constraint logic programming, processes the semantics by solving temporal constraints for single-valued time intervals and automatically applying conversions to resolve semantic differences over these intervals. We also discuss the scalability of the approach and its applicability to the Semantic Web.
semantic heterogeneity problem, Context Interchange (COIN) framework
|
|
|
63.
|
|
|
Nicolas Prat ESSEC Business School Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
19 Dec 07
|
|
Last Revised:
|
|
06 Jan 09
|
|
63 (106,175)
|
|
|
| |
Abstract:
Data quality is crucial for operational efficiency and sound decision making. This paper focuses on believability, a major aspect of data quality. The issue of believability is particularly relevant in the context of Web 2.0, where mashups facilitate the combination of data from different sources. Our approach for assessing data believability is based on provenance and lineage, i.e. the origin and subsequent processing history of data. We present the main concepts of our model for representing and storing data provenance, and an ontology of the sub-dimensions of data believability. We then use aggregation operators to compute believability across the sub-dimensions of data believability and the provenance of data. We illustrate our approach with a scenario based on Internet data. Our contribution lies in three main design artifacts: (1) the provenance model, (2) the ontology of believability sub-dimensions, and (3) the method for computing and aggregating data believability. To our knowledge, this is the first work to operationalize provenance-based assessment of data believability.
Quality Sub-Dimensions, Data Lineage
|
|
|
64.
|
|
|
Lynn Wu Massachusetts Institute of Technology (MIT) Aykut Firat Massachusetts Institute of Technology (MIT) - Sloan School of Management Tarik Alatovic Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
28 Aug 06
|
|
Last Revised:
|
|
29 Aug 06
|
|
63 (106,175)
|
|
|
| |
Abstract:
The web is undoubtedly the largest and most diverse repository of data, but it was not designed to offer the capabilities of traditional data base management systems - which is unfortunate. In a true data federation, all types of data sources, such as relational databases and semi-structured websites, could be used together. IBM WebSphere uses the "request-reply-compensate" protocol to communicate with wrappers in a data federation. This protocol expects wrappers to reply to query requests by indicating the portion of the queries they can answer. While this provides a very generic approach to data federation, it also requires the wrapper developer to deal with some of the complexities of capability considerations through custom coding. Alternative approaches based on declarative capability restrictions have been proposed in the literature, but they have not found their way into commercial systems, perhaps due to their complexity. We offer a practical middle-ground solution to querying web-sources, using IBM's data federation system as an example. In lieu of a two-layered architecture consisting of wrapper and source layers, we propose to move the capability declaration from the wrapper layer to a single component between the wrapper and the native data source. The advantage of this three-layered architecture is that each new web-source only needs to register its capability with the capability-declaration component once, which saves the work of writing a new wrapper each time. Thus the inclusion of web-sources through this mechanism can be accelerated in a way that doesn't require a change in existing data federation technology.
federated data, web data sources, capabilities, query handling
|
|
|
65.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
28 Aug 06
|
|
Last Revised:
|
|
29 Aug 06
|
|
60 (108,959)
|
|
|
| |
Abstract:
One of the challenges of dealing with multiple contexts is the significant effort required to provide all necessary lifting rules so that statements in one context can be viewed and understood in other contexts. In this paper, we introduce the notion of structured contexts, where a lightweight ontology is used to provide a structure for representing contexts. With structured contexts, specialized inference algorithms can be used to significantly reduce the number of lifting rules required. We use a semantic data integration example to illustrate the concept of structured contexts and the benefits of this novel use of lightweight ontology.
structured contexts, lightweight ontology
|
|
|
66.
|
|
|
Wei Lee Woon Masdar Institute of Science and Technology (MIST) Andreas Henschel Masdar Institute of Science and Technology (MIST) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
25 Sep 09
|
|
Last Revised:
|
|
25 Sep 09
|
|
56 (112,756)
|
|
|
| |
Abstract:
This paper presents a novel framework for supporting the development of well-informed research policies and plans. The proposed methodology is based on the use of bibliometrics; i.e., analysis is conducted using information regarding trends and patterns of publication. Information thus obtained is analyzed to predict probable future developments in the technological fields being studied. While using bibliometric techniques to study science and technology is not a new idea, the proposed approach extends previous studies in a number of important ways. Firstly, instead of being purely exploratory, the focus of our research has been on developing techniques for detecting technologies that are in the early growth phase, characterized by a rapid increase in the number of relevant publications. Secondly, to increase the reliability of the forecasting effort, we propose the use of automatically generated keyword taxonomies, allowing the growth potentials of subordinate technologies to be aggregated into the overall potential of larger technology categories. As a demonstration, a proof-of-concept implementation of each component of the framework is presented, and is used to study the domain of renewable energy technologies. Results from this analysis are presented and discussed.
|
|
|
67.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
17 Aug 04
|
|
Last Revised:
|
|
16 Aug 05
|
|
53 (115,775)
|
4
|
|
| |
Abstract:
The underlying assumptions for interpreting the meaning of data often change over time, which further complicates the problem of semantic heterogeneities among autonomous data sources. As an extension to the Context Interchange (COIN) framework, this paper introduces the notion of temporal context as a formalization of the problem. We represent temporal context as a multi-valued method in F-Logic; however, only one value is valid at any point in time, the determination of which is constrained by temporal relations. This representation is then mapped to an abductive constraint logic programming framework with temporal relations being treated as constraints. A mediation engine that implements the framework automatically detects and reconciles semantic differences at different times. We articulate that this extended COIN framework is suitable for reasoning on the Semantic Web.
Context Interchange (COIN) framework, temporal context, Semantic Web
|
|
|
68.
|
|
|
Nathan A. Minami Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
29 Jun 07
|
|
Last Revised:
|
|
29 Jun 07
|
|
52 (116,738)
|
1
|
|
| |
Abstract:
Despite extraordinary efforts by leaders at all levels throughout the U.S. Army, dozens of soldiers are killed each year as a result of both combat and motor vehicle accidents. The objective of this study is to look beyond the events and symptoms of accidents which normally indicate human error, and instead study the upper-level organizational processes and problems that may constitute the actual root causes of accidents. Critical to this process is identifying critical variables, establishing causality between variables, and quantifying variables that lead to both resilience against accidents and propensities for accidents. After reviewing the available literature we report on our development of a System Dynamics model, which is an analytical model of the system that allows for extensive simulation. The results of these simulations suggest that high-level decisions that balance mission rate and operations tempo with troop availability, careful management of the work-rest cycle for deployed troops, and improvement of the processes for evaluating the lessons learned from accidents, will lead to a reduction in Army combat and motor vehicle accidents.
combat vehicle accidents, army, system dynamics
|
|
|
69.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
18 Jan 06
|
|
Last Revised:
|
|
23 Mar 06
|
|
49 (119,954)
|
|
|
| |
Abstract:
In this paper, we first identify semantic heterogeneities that, when not resolved, often cause serious data quality problems. We discuss the especially challenging problems of temporal and aggregational ontological heterogeneity, which concerns how complex entities and their relationships are aggregated and reinterpreted over time. Then we illustrate how the COntext INterchange (COIN) technology can be used to capture data semantics and reconcile semantic heterogeneities in a scalable manner, thereby improving data quality.
Data Semantics, Semantic Heterogeneity, Aggregation, Temporal, Ontology, Context
|
|
|
70.
|
|
|
Xitong Li Tsinghua University Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Quan Z. Sheng University of Adelaide Yushun Fan Tsinghua University
|
| Posted: |
|
18 Sep 08
|
|
Last Revised:
|
|
18 Sep 08
|
|
48 (121,038)
|
|
|
| |
Abstract:
With the increasing popularity of Service Oriented Architecture (SOA), service composition is gaining momentum as the potential silver bullet for application integration. However, services are not always perfectly compatible and therefore cannot be directly composed. Service mediation, roughly classified into signature mediation and protocol mediation, thus becomes one key working area in SOA. As a challenging problem, protocol mediation is still open and existing approaches only provide partial solutions. In this paper, a systematic approach based on mediator patterns is proposed to generate executable mediators and glue partially compatible services together. The mediation process and its main steps are introduced. By utilizing message mapping, a heuristic technique for identifying protocol mismatches and selecting appropriate mediator patterns is presented. The corresponding BPEL templates of these patterns are also developed. Moreover, a prototype system, namely Service Mediation Toolkit (SMT), is implemented to validate the feasibility and effectiveness of our approach.
Service oriented architecture, Web service, Service composition, Protocol mediation, Mediator
|
|
|
71.
|
|
|
Nathan A. Minami Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
18 Dec 07
|
|
Last Revised:
|
|
18 Dec 07
|
|
48 (121,038)
|
|
|
| |
Abstract:
Dozens of U.S. soldiers are killed each year as a result of both combat and motor vehicle accidents. The objective of this study is to look beyond the events and symptoms of accidents which normally indicate human error, and instead study the complex and poorly understood upper-level organizational processes and problems that may constitute the actual root causes of accidents. This is particularly challenging because the causes often involve nonlinear dynamic phenomena and have behaviors that are counter-intuitive to normal human thinking; these are often called wicked problems. After reviewing the available literature, a System Dynamics model was created to provide an analytical model of this multifaceted system that allows for extensive simulation. The results of these simulations suggest that high-level decisions that balance mission rate and operations tempo with troop availability, careful management of the work-rest cycle for deployed troops, and improvement of the processes for evaluating the lessons learned from accidents, will lead to a reduction in Army combat and motor vehicle accidents.
Safety, System Dynamics, Complexity
|
|
|
72.
|
|
|
Ayse Kaya Firat Massachusetts Institute of Technology (MIT) - Department of Electrical Engineering and Computer Science Wei Lee Woon Masdar Institute of Science and Technology (MIST) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
11 Mar 09
|
|
Last Revised:
|
|
11 Mar 09
|
|
44 (125,495)
|
|
|
| |
Abstract:
We are investigating trend extrapolation using historical data from academic publications to forecast future technology directions. Many sources of academic information on the Web (e.g., Google Scholar, Scirus) provide a wealth of relevant information, yet they are not structured for programmatic access. We can use Web wrappers, programs that can harvest text data from Web pages and present them in a structured format, to overcome this problem.
|
|
|
73.
|
|
|
Wei Lee Woon Masdar Institute of Science and Technology (MIST) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
27 Aug 08
|
|
Last Revised:
|
|
27 Aug 08
|
|
43 (126,675)
|
2
|
|
| |
Abstract:
A novel method for automatically constructing taxonomies for specific research domains is presented. The proposed methodology uses term co-occurrence frequencies as an indicator of the semantic closeness between terms. To support the automated creation of taxonomies or subject classifications we present a simple modification to the basic distance measure, and describe a set of procedures by which these measures may be converted into estimates of the desired taxonomy. To demonstrate the viability of this approach, a pilot study on renewable energy technologies is conducted, where the proposed method is used to construct a hierarchy of terms related to alternative energy. These techniques have many potential applications, but one activity in which we are particularly interested is the mapping and subsequent prediction of future developments in the technology and research.
Taxonomy Construction, Asymmetric Information
|
|
|
74.
|
|
|
Wei Lee Woon Masdar Institute of Science and Technology (MIST) Hatem Zeineldin Masdar Institute of Science and Technology (MIST) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
07 Apr 09
|
|
Last Revised:
|
|
07 Apr 09
|
|
33 (139,494)
|
|
|
| |
Abstract:
This paper describes the application of data mining techniques for elucidating patterns and trends in technological innovation. Specifically, we focus on the use of bibliometric methods, viz. techniques which focus on trends in the publication of text documents rather than the content of these documents. Of particular interest is the relationship between publication patterns, as characterized by term occurrence frequencies, and the underlying technological trends and developments which drive these trends. To focus the discussions and to provide a concrete example of their applicability, a detailed case study focussing on research in the area of Distributed Generation (DG) is also presented; however, the techniques and general approach devised here will be applicable to a broad range of industries, situations, and locations. Our results are promising and indicate that interesting information and conclusions can be derived from this line of analysis. The results obtained using data extraction techniques highlight and present the evolution of DG-related technology focus areas, and their relative importance within this field.
bibliometric analysis
|
|
|
75.
|
|
|
Thomas Gannon MITRE Corporation Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Allen Moulton Massachusetts Institute of Technology (MIT) Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management Marwan Sabbouh MITRE Corporation Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
11 Mar 09
|
|
Last Revised:
|
|
11 Mar 09
|
|
31 (142,387)
|
|
|
| |
Abstract:
Technological advances such as Service Oriented Architecture (SOA) have increased the feasibility and importance of effectively integrating information from an ever widening number of systems within and across enterprises. A key difficulty of achieving this goal comes from the pervasive heterogeneity in all levels of information systems. A robust solution to this problem needs to be adaptable, extensible, and scalable. In this paper, we identify the deficiencies of traditional semantic integration approaches. The COntext INterchange (COIN) approach overcomes these deficiencies by declaratively representing data semantics and using a mediator to create the necessary conversion programs from a small number of conversion rules. The capabilities of COIN are demonstrated using an example with 150 data sources, where COIN can automatically generate the more than 22,000 conversion programs needed to enable semantic interoperability using only six parameterizable conversion rules. This paper presents a framework for evaluating adaptability, extensibility, and scalability of semantic integration approaches. The application of the framework is demonstrated with a systematic evaluation of COIN and other commonly practiced approaches.
service oriented architecture, Context Interchange
|
|
|
76.
|
|
|
Wei Lee Woon Masdar Institute of Science and Technology (MIST) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Blaine Ziegler Massachusetts Institute of Technology (MIT) - Department of Electrical Engineering and Computer Science
|
| Posted: |
|
11 Mar 09
|
|
Last Revised:
|
|
03 May 09
|
|
31 (142,387)
|
1
|
|
| |
Abstract:
This paper presents an approach to bibliometric analysis in the context of technology mining. Bibliometric analysis refers to the use of publication database statistics, e.g., hit counts relevant to a topic of interest. Technology mining facilitates the identification of a technology's research landscape. Our contribution to bibliometrics in this context is the use of a technique known as Latent Semantic Analysis (LSA) to reveal the concepts that underlie the terms relevant to a field. Using this technique, we can analyze coherent concepts, rather than individual terms. This can lead to more useful results from our bibliometric analysis. We present results that demonstrate the ability of Latent Semantic Analysis to uncover the concepts underlying sets of key terms, used in a case study on the technologies of renewable energy.
tech mining, latent semantic analysis
|
|
|
77.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Mihai Lupu affiliation not provided to SSRN
|
| Posted: |
|
18 Sep 08
|
|
Last Revised:
|
|
24 Sep 08
|
|
27 (149,394)
|
|
|
| |
Abstract:
The COntext INterchange (COIN) strategy is an approach to solving the problem of interoperability of semantically heterogeneous data sources through context mediation. The existing implementation of COIN uses its own notation and syntax for representing ontologies. More recently, the OWL Web Ontology Language is becoming established as the W3C recommended ontology language. A bridge is needed between these two areas and an explanation on how each of the two approaches can learn from each other. We propose the use of the COIN strategy to solve context disparity and ontology interoperability problems in the emerging Semantic Web both at the ontology level and at the data level. In this work we showcase how the problems that arise from context-dependent representation of facts can be mitigated by Semantic Web techniques, as tools of the conceptual framework developed over 15 years of COIN research.
OWL Web Ontology Language, Context interchange, semantic web
|
|
|
78.
|
|
|
Xitong Li Tsinghua University Yushun Fan Tsinghua University Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Quan Z. Sheng University of Adelaide
|
| Posted: |
|
11 Mar 09
|
|
Last Revised:
|
|
11 Mar 09
|
|
20 (167,186)
|
|
|
| |
Abstract:
In the era of Global Services, Service Oriented Architecture (SOA) has been gaining momentum for building Web-based information systems. Service composition is one of the key objectives for adopting SOA. Unfortunately, Web services are not always exactly compatible and it is a non-trivial task to address the mismatches between them. To this end, an approach based on mediator patterns is proposed to develop mediators for reconciling protocol mismatches of partially compatible services and mediating them together. A heuristic technique is developed for identifying protocol mismatches and selecting appropriate patterns. The main steps of the reconciliation approach are presented.
Service Oriented Architecture
|
|
|
79.
|
|
|
Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
18 Sep 08
|
|
Last Revised:
|
|
18 Sep 08
|
|
20 (167,186)
|
|
|
| |
Abstract:
The change in meaning of data over time poses significant challenges for the use of that data. These challenges exist in the use of an individual data source and are further compounded with the integration of multiple sources. In this paper, we identify three types of temporal semantic heterogeneity. We propose a solution based on extensions to the Context Interchange framework, which has mechanisms for capturing semantics using ontology and temporal context. It also provides a mediation service that automatically reconciles semantic conflicts. We show the feasibility of this approach with a prototype that implements a subset of the proposed extensions.
temporal context, semantic heterogeneity, ontology, logic programming
|
|
|
80.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Xitong Li Tsinghua University Nazli Choucri Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
25 Sep 09
|
|
Last Revised:
|
|
25 Sep 09
|
|
13 (187,291)
|
|
|
| |
Abstract:
With the increasing interconnection of computer networks and sophistication of cyber attacks, it is important to understand the dynamics of such situations, especially in regards to cyber international relations. The Explorations in Cyber International Relations (ECIR) Data Dashboard Project is an initiative to gather worldwide cybersecurity data publicly provided by nation-level Computer Emergency Response Teams (CERTs) and to provide a set of tools to analyze the cybersecurity data. The unique contributions of this paper are: (1) an evaluation of the current state of the diverse nation-level CERT cybersecurity data sources, (2) a description of the Data Dashboard tool developed and some interesting analyses from using our tool, and (3) a summary of some challenges with the CERT data availability and usability uncovered in our research.
Cybersecurity, Computer Emergency Response Teams, Data Dashboard, Country Comparisons
|
|
|
81.
|
|
|
Blaine Ziegler Massachusetts Institute of Technology (MIT) - Department of Electrical Engineering and Computer Science Ayse Kaya Firat Massachusetts Institute of Technology (MIT) - Department of Electrical Engineering and Computer Science Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Wei Lee Woon Masdar Institute of Science and Technology (MIST) Steven Camina Massachusetts Institute of Technology (MIT) - Electrical Engineering and Computer Science Clare Li affiliation not provided to SSRN Erik Fogg Massachusetts Institute of Technology (MIT)
|
| Posted: |
|
24 Sep 09
|
|
Last Revised:
|
|
24 Sep 09
|
|
13 (187,291)
|
|
|
| |
Abstract:
Even experts cannot be fully aware of all the promising developments in broad and complex fields of technology, such as renewable energy. Fortunately, there exist many diverse sources of information that report new technological developments, such as journal publications, news stories, and blogs. However, the volume of data contained in these sources is enormous; it would be difficult for a human to read and digest all of this information - especially in a timely manner. This paper describes a novel application of technology mining techniques to these diverse information sources to study, visualize, and identify the evolution of promising new technologies - a challenge we call 'early growth technology analysis.' For the work reported herein, we use as inputs information about millions of published documents contained in sources such as Scirus, Inspec, and Compendex. We accomplish this analysis through the use of bibliometric analysis, consisting of three key steps: 1. Extract related keywords (from keywords in articles); 2. Determine the annual occurrence frequencies of these keywords; 3. Identify those exhibiting rapid growth, particularly if starting from a low base. To provide a focus for the experiments and subsequent discussions, a pilot study was conducted in the area of 'renewable energy,' though the techniques and methods developed are neutral to the domain of study. Preliminary results and conclusions from the case study are presented and are discussed in the context of the effectiveness of the proposed methodology.
|
|
|
82.
|
|
|
Aykut Firat Massachusetts Institute of Technology (MIT) - Sloan School of Management Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management Benjamin Grosof Massachusetts Institute of Technology (MIT) - Sloan School of Management Frank Manola Independent Consultant
|
| Posted: |
|
23 Sep 09
|
|
Last Revised:
|
|
23 Sep 09
|
|
10 (196,016)
|
|
|
| |
Abstract:
Mappings in most federated databases are conceptualized and implemented as black-box transformations between source schemas and a federated schema. This approach does not allow specific mappings to be declared once and reused in other situations. We present an alternative approach, in which data-level mappings are represented independent of source and federated schemas as a network between “contexts”. This compendious representation expedites the data federation process via mapping reuse and automated mapping composition from simpler mappings. We illustrate the benefits of mapping reuse and composition by using an example that incorporates equational mappings and the application of symbolic equation solving techniques.
Federated DBs, Logic programming, Heterogeneous information, Mediators and Wrappers
|
|
|
83.
|
|
|
Xitong Li Tsinghua University Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Hongwei (Harry) Zhu Massachusetts Institute of Technology (MIT) Yushun Fan Tsinghua University
|
| Posted: |
|
24 Sep 09
|
|
Last Revised:
|
|
24 Sep 09
|
|
7 (203,520)
|
|
|
| |
Abstract:
Service Oriented Computing (SOC) is a popular computing paradigm for the development of distributed Web applications. Service composition, a key element of SOC, is severely hampered by various types of semantic heterogeneity among the services. In this paper, we address the various semantic differences from the context perspective and use a lightweight ontology to describe the concepts and their specializations. Atomic conversions between the contexts are implemented using XPath functions and external services. The correspondences between the syntactic service descriptions and the semantic concepts are established using a flexible, standard-compliant mechanism. Given the naive BPEL composition ignoring semantic differences, our reconciliation approach can automatically determine and reconcile the semantic differences. The mediated BPEL composition incorporates necessary conversions to convert the data exchanged between different services. Our solution has the desirable properties (e.g., adaptability, extensibility and scalability) and can significantly alleviate the reconciliation efforts for Web services composition.
Web service, service composition, semantic heterogeneity, ontology, context
|
|
|
84.
|
|
|
Andreas Henschel Masdar Institute of Science and Technology (MIST) Wei Lee Woon Masdar Institute of Science and Technology (MIST) Thomas Wachter affiliation not provided to SSRN Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
25 Sep 09
|
|
Last Revised:
|
|
25 Sep 09
|
|
5 (207,894)
|
|
|
| |
Abstract:
We compare a family of algorithms for the automatic generation of taxonomies by adapting the Heymann algorithm in various ways. The core algorithm determines the generality of terms and iteratively inserts them in a growing taxonomy. Variants of the algorithm are created by altering how, and how frequently, the generality of terms is calculated. We analyse the performance and the complexity of the variants combined with a systematic threshold evaluation on a set of seven manually created benchmark sets. As a result, betweenness centrality calculated on unweighted similarity graphs often performs best but requires threshold fine-tuning and is computationally more expensive than closeness centrality. Finally, we show how an entropy-based filter can lead to more precise taxonomies.
|
|
|
85.
|
|
|
Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Nazli Choucri Massachusetts Institute of Technology (MIT) Steven Camina Massachusetts Institute of Technology (MIT) - Electrical Engineering and Computer Science Erik Fogg Massachusetts Institute of Technology (MIT) Xitong Li Tsinghua University Wei Fan University of Electronic Science and Technology of China
|
| Posted: |
|
24 Sep 09
|
|
Last Revised:
|
|
24 Sep 09
|
|
5 (207,894)
|
|
|
| |
Abstract:
Growing global interconnection and interdependency of computer networks, in combination with increased sophistication of cyber attacks over time, demonstrate the need for better understanding of the collective and cooperative security measures needed to prevent and respond to cybersecurity emergencies. The Exploring Cyber International Relations (ECIR) Data Dashboard project is an initial effort to gather and analyze such data within and between countries. This report describes the prototype ECIR Data Dashboard and the initial data sources used.
In 1988, the United States Department of Defense and Carnegie Mellon University formed the Computer Emergency Response Team (CERT) to lead and coordinate national and international efforts to combat cybersecurity threats. Since then, the number of CERTs worldwide has grown dramatically, leading to the potential for a sophisticated and coordinated global cybersecurity response network. This report focuses primarily on the current state of the worldwide CERTs, including the data publicly available, the extent of coordination, and the maturity of data management and responses. The report summarizes, analyses, and critiques the worldwide CERT network.
Additionally, the report describes the ECIR team's Data Dashboard project, designed to provide scholars, policymakers, IT professionals, and other stakeholders with a comprehensive set of national-level cybersecurity, information technology, and demographic data. The Dashboard allows these stakeholders to observe chronological trends and multivariate correlations that can lead to insight into the current state, potential future trends, and approximate causes of global cybersecurity issues. This report summarizes the purpose, state, progress, and challenges of developing the Data Dashboard project.
Disclaimer: This report relies on publicly available information, especially from the CERTs’ public web sites. The CERTs have not yet been contacted to confirm our understanding of their data. That will be done in subsequent phases of this effort.
|
|
|
86.
|
|
|
Adil Daruwala Novell - General Cheng Goh Deceased Scott Hofmeister affiliation not provided to SSRN Karim Hussein Massachusetts Institute of Technology (MIT) - General Stuart E. Madnick Massachusetts Institute of Technology (MIT) - Sloan School of Management Michael Siegel Massachusetts Institute of Technology (MIT) - Sloan School of Management
|
| Posted: |
|
07 Jan 01
|
|
Last Revised:
|
|
14 May 08
|
|
0 (0)
|
|
|
| |
Abstract:
In this paper we describe a prototype implementation of the Context Interchange Network (CIN). The prototype is described in terms of a financial application. The CIN is designed to provide for the intelligent integration of contextually (i.e., semantically) heterogeneous data. The system uses explicit context knowledge representation and a context mediator to automatically detect conflicts and resolve them through context conversion. The network also allows for context explication, making it possible for a receiver of data to understand the meaning of the information represented by the source data.
|
|