On Building and Publishing Linked Open Schema from Social Web Sites
30 Pages
Posted: 12 Sep 2018
Publication Status: Accepted
Abstract
Schema-level knowledge is important for semantic applications such as reasoning, data integration and question answering. Compared with the billions of triples describing millions of instances, current Linking Open Data has only a limited number of triples representing schema-level knowledge. To facilitate multilingual schema-level knowledge mining, we propose a general approach to learning Linked Open Schema (LOS) in different languages from social Web sites, which contain rich sources (namely taxonomies composed of categories and folksonomies consisting of tags) for mining large-scale schema-level knowledge. The core of the proposed approach is a semi-supervised learning method that integrates rules to capture equal, subClassOf and relate relations among the collected categories and tags. We apply the approach separately to selected English and Chinese social Web sites, resulting in an English LOS and a Chinese LOS. We publish both as open data on the Web with three access levels: data dump, lookup service and SPARQL endpoint. Experimental results show the high accuracy of the relations in both the English LOS and the Chinese LOS. Compared with DBpedia, Yago, BabelNet, and Freebase, both the English LOS and the Chinese LOS not only have large-scale concepts, but also contain the largest number of subClassOf relations.
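To make the three relation types concrete, the following minimal sketch stores schema-level knowledge as (subject, relation, object) triples and derives inferred superclasses by following subClassOf links. It is an illustration only and not the method proposed in the paper: the concept names, the `TRIPLES` list and the `superclasses` helper are all hypothetical, and treating subClassOf as transitive is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical example triples; the concept names are illustrative only
# and are not taken from the published English or Chinese LOS.
TRIPLES = [
    ("Laptop", "subClassOf", "Computer"),
    ("Computer", "subClassOf", "Electronics"),
    ("Notebook computer", "equal", "Laptop"),
    ("Laptop", "relate", "Battery"),
]


def superclasses(concept, triples):
    """Return all direct and inferred superclasses of `concept`.

    Assumes subClassOf is transitive; equal and relate triples are ignored.
    """
    # Index the direct subClassOf links by subject.
    parents = defaultdict(set)
    for subj, rel, obj in triples:
        if rel == "subClassOf":
            parents[subj].add(obj)

    # Depth-first walk up the hierarchy, guarding against cycles.
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(superclasses("Laptop", TRIPLES)))
```

In this sketch only subClassOf triples contribute to the hierarchy walk; an application could additionally merge concepts linked by equal before the walk, which is left out here to keep the example short.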
Keywords: Linked Data, Linked Open Schema, Schema-Level Knowledge, Social Web Sites