Methodology for Geospatial Data Source Discovery in Ontology-Driven Geo-Information Integration Architectures
32 Pages. Posted: 6 Jul 2018. First Look: Accepted
Due to advances in information, communication and sensing technologies, the amount of generated geospatial information is constantly growing. Because geospatial information is generated and made available over the Internet by different stakeholders, heterogeneity of geospatial data and geospatial data sources is inevitable. This heterogeneity of globally available geospatial data sources poses great challenges for individuals who are trying to discover and assemble geospatial data from distributed sources. For geoportal users who do not belong to the Geo-Information System (GIS) realm, the process of discovering and accessing heterogeneous geospatial data has proven difficult to manage. Since geoportals are foreseen as single points of discovery and access to geo-information, geoportal users expect a geoportal to provide them with a mechanism to easily discover what they are searching for, using their own language. Such a mechanism would significantly improve the usability of a geoportal and user satisfaction. The implementation of such a mechanism depends heavily on the infrastructure a geoportal relies on: an interoperable geo-information dissemination environment.
To enhance the discoverability of geospatial data and lay a knowledge foundation for improving geoportal usability, scientists and engineers have developed interoperable geo-information dissemination environments based on the following approaches: syntactic standardization and semantic annotation of Web-accessible geo-information sources, and ontology-driven geo-information integration. Although these approaches enhance the discovery of heterogeneous and distributed geospatial data sources, some issues must still be addressed to make geospatial data sources fully discoverable. For example, in most cases geoportal users are not provided with an explicit description of the meaning of geospatial resources, or may not know which keywords to use to discover appropriate geo-information.
Given these challenges, we have defined a novel methodology for improving geoportal usability. The methodology is intended for use within geoportals that rely on ontology-driven geo-information integration architectures. The approach implemented within this methodology uses terms extracted from a natural language description of geo-information, defined by the end user. The user-defined description is disambiguated by a combination of unsupervised word sense disambiguation methods. Once disambiguated, the description is matched against the senses of the (domain/local) ontology concept names. The matching is performed by measuring the semantic similarity between the disambiguated user-defined description and the senses of the (domain/local) ontology concepts.
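The disambiguate-then-match pipeline described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a simplified Lesk gloss-overlap heuristic stands in for the combination of unsupervised word sense disambiguation methods, Jaccard overlap stands in for the semantic similarity measure, and the sense inventory (SENSES) and concept glosses (CONCEPTS) are hypothetical toy data rather than a real lexical resource or ontology.

```python
import re

# Small stopword list, sufficient for this toy example only.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "in", "on",
             "for", "to", "is", "that", "with", "by"}

def tokens(text):
    """Lowercase a text and return its set of non-stopword word tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

# Hypothetical sense inventory: ambiguous term -> {sense id: gloss}.
SENSES = {
    "bank": {
        "bank_river": "sloping land beside a river or stream of water",
        "bank_finance": "financial institution that accepts deposits and lends money",
    },
}

# Hypothetical ontology concept names with a gloss describing their sense.
CONCEPTS = {
    "Riverbank": "land along the edge of a river or stream",
    "FinancialInstitution": "organization that accepts deposits and lends money",
    "Road": "paved way for vehicles between places",
}

def disambiguate(term, context):
    """Simplified Lesk: choose the sense whose gloss shares the most words
    with the surrounding context (a set of tokens)."""
    candidates = SENSES[term]
    return max(candidates, key=lambda s: len(tokens(candidates[s]) & context))

def jaccard(a, b):
    """Jaccard overlap of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def match_concepts(description):
    """Rank ontology concepts by similarity to a user-defined description."""
    context = tokens(description)
    expanded = set(context)
    # Enrich the description with the gloss of each disambiguated term.
    for term in context & set(SENSES):
        sense = disambiguate(term, context - {term})
        expanded |= tokens(SENSES[term][sense])
    scores = {name: jaccard(expanded, tokens(gloss))
              for name, gloss in CONCEPTS.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    for name, score in match_concepts("water flowing along the bank of the stream"):
        print(f"{name}: {score:.2f}")
```

In this toy setting, the words surrounding "bank" decide which gloss is folded into the description before it is compared with the concept glosses, so the same term can lead to different concepts depending on the user's wording.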
Our methodology simplifies geospatial data discovery and can be easily implemented because it uses the ontological components of the underlying architectures in their original form. In this paper, we discuss the methodology in the context of similar prominent solutions and present a prototype that demonstrates its applicability. The implemented prototype is a stand-alone desktop application that takes two resources as input: a user-defined geo-information description and (domain/local) ontologies developed using the Web Ontology Language (OWL). We also provide an overview of its benefits over approaches previously used for geospatial data discovery and offer guidelines for future improvement and development.
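As a rough illustration of the ontology input, the sketch below collects concept names from an OWL document serialized as RDF/XML, using only the Python standard library. It does not reflect how the prototype itself reads ontologies; the sample ontology, the example.org IRIs and the concept_names helper are hypothetical, and the sketch assumes classes are declared as owl:Class elements with an rdf:about attribute.

```python
import xml.etree.ElementTree as ET

# Standard OWL and RDF namespace IRIs.
OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical two-class ontology in RDF/XML, for illustration only.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/geo#River"/>
  <owl:Class rdf:about="http://example.org/geo#RoadNetwork"/>
</rdf:RDF>"""

def concept_names(owl_xml):
    """Return the local names of all named owl:Class elements, in document order."""
    root = ET.fromstring(owl_xml)
    names = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        iri = cls.get(f"{{{RDF}}}about")
        if iri:  # skip anonymous classes that carry no rdf:about
            # Assumes hash-style IRIs; the local name follows the last '#'.
            names.append(iri.rsplit("#", 1)[-1])
    return names

if __name__ == "__main__":
    print(concept_names(SAMPLE))
```

Names obtained this way could serve as the concept names that a user-defined description is matched against, leaving the ontology itself unmodified.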
Keywords: Geo-information, Discovery, Ontology, Geoportal, Usability