
Linked Hypernyms: Enriching DBpedia with Targeted Hypernym Discovery

14 Pages. Posted: 3 Jul 2018. First Look: Accepted

Tomas Kliegr

University of Economics, Prague - Department of Information and Knowledge Engineering; Queen Mary University of London - Multimedia and Vision Research Group (MMV Group)

Abstract

The Linked Hypernyms Dataset (LHD) provides entities described by Dutch, English and German Wikipedia articles with types in the DBpedia namespace. The types are extracted from the first sentences of Wikipedia articles using Hearst pattern matching over part-of-speech annotated text and are disambiguated to DBpedia concepts. The dataset covers 1.3 million RDF type triples from English Wikipedia, out of which 1 million RDF type triples were found not to overlap with DBpedia, and 0.4 million with YAGO2s. About 770 thousand German and 650 thousand Dutch Wikipedia entities are assigned a novel type, which exceeds the number of entities in the localized DBpedia for the respective language. RDF type triples from the German dataset have been incorporated into the German DBpedia. Quality assessment was based on a total of 16,500 human ratings and annotations. The average accuracy is 0.86 for the English dataset, 0.77 for German and 0.88 for Dutch. The accuracy of raw plain text hypernyms exceeds 0.90 for all languages. The LHD release described and evaluated in this article targets DBpedia 3.8; an LHD version for DBpedia 3.9, containing approximately 4.5 million RDF type triples, is also available.
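
The extraction step summarized in the abstract can be illustrated with a minimal sketch. It is not the LHD implementation: the abstract describes matching over part-of-speech annotated text, whereas this example substitutes a plain regular expression and an assumed stop-word list so that it runs without a tagger. The function names, the pattern, and the naive mapping of the hypernym to a DBpedia resource URI (standing in for the disambiguation step) are all hypothetical.

```python
import re

# Hypothetical, simplified Hearst-style pattern: "<subject> is/was a/an/the <phrase>".
# Assumption: a plain regex replaces the POS-tagged matching described in the abstract.
COPULA = re.compile(r"\b(?:is|was|are|were)\s+(?:an?|the)\s+([^.,;()]+)", re.IGNORECASE)

# Assumed list of words that usually end the noun phrase holding the hypernym.
STOP = {"of", "in", "from", "who", "that", "which", "for", "and", "by", "at", "on", "with"}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBPEDIA = "http://dbpedia.org/resource/"


def first_sentence_hypernym(sentence):
    """Return the last word of the noun phrase after the copula, or None."""
    match = COPULA.search(sentence)
    if match is None:
        return None
    phrase = []
    for word in match.group(1).split():
        if word.lower() in STOP:
            break  # the noun phrase ends at the first stop word
        phrase.append(word)
    return phrase[-1] if phrase else None


def to_ntriple(entity, hypernym):
    """Format an RDF type triple; the URI mapping is a naive stand-in for disambiguation."""
    subject = DBPEDIA + entity.replace(" ", "_")
    obj = DBPEDIA + hypernym.capitalize()
    return f"<{subject}> <{RDF_TYPE}> <{obj}> ."


sentence = "Diego Maradona is an Argentine football manager and former player."
hypernym = first_sentence_hypernym(sentence)
if hypernym is not None:
    print(to_ntriple("Diego Maradona", hypernym))
```

Taking the last word of the phrase as the hypernym is a heuristic that suits English noun phrases, where the head noun usually comes last; German and Dutch would likely need separate patterns.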

Keywords: DBpedia, Hearst patterns, Hypernym, Linked data, YAGO, Wikipedia, Type inference

Suggested Citation

Kliegr, Tomas, Linked Hypernyms: Enriching DBpedia with Targeted Hypernym Discovery (2015). Journal of Web Semantics First Look. Available at SSRN: https://ssrn.com/abstract=3199181 or http://dx.doi.org/10.2139/ssrn.3199181

Tomas Kliegr (Contact Author)

University of Economics, Prague - Department of Information and Knowledge Engineering

Nam. W. Churchilla 4
Praha 3
Czech Republic

Queen Mary University of London - Multimedia and Vision Research Group (MMV Group)

Mile End Road, Mile End
London, England E1 4NS
United Kingdom
