Sar-Graphs: A Language Resource Connecting Linguistic Knowledge with Semantic Relations from Knowledge Graphs

28 Pages. Posted: 10 Jul 2018. Publication Status: Accepted

Sebastian Krause

DFKI GmbH Berlin - Language Technology Lab (LT)

Leonhard Hennig

DFKI GmbH Berlin - Language Technology Lab (LT)

Andrea Moro

Sapienza University of Rome - Department of Computer Science

Dirk Weissenborn

DFKI GmbH Berlin - Language Technology Lab (LT)

Feiyu Xu

DFKI GmbH Berlin - Language Technology Lab (LT)

Hans Uszkoreit

DFKI GmbH Berlin - Language Technology Lab (LT)

Roberto Navigli

Sapienza University of Rome - Department of Computer Science

Abstract

Recent years have seen significant growth in, and increased usage of, large-scale knowledge resources in both academic research and industry. We can distinguish two main types of knowledge resources: those that store factual information about entities in the form of semantic relations (e.g., Freebase), so-called knowledge graphs, and those that represent general linguistic knowledge (e.g., WordNet or UWN). In this article, we present a third type of knowledge resource, which completes the picture by connecting the first two types. Instances of this resource are graphs of semantically-associated relations (sar-graphs), whose purpose is to link semantic relations from factual knowledge graphs with their linguistic representations in human language.

We present a general method for constructing sar-graphs using a language- and relation-independent, distantly supervised approach which, apart from generic language processing tools, relies solely on the availability of a lexical semantic resource, providing sense information for words, as well as a knowledge base containing seed relation instances. Using these seeds, our method extracts, validates and merges relation-specific linguistic patterns from text to create sar-graphs. To cope with the noisily labeled data arising in a distantly supervised setting, we propose several automatic pattern confidence estimation strategies, and also show how manual supervision can be used to improve the quality of sar-graph instances. We demonstrate the applicability of our method by constructing sar-graphs for 25 semantic relations, of which we make a subset publicly available at http://sargraph.dfki.de.
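The extraction and confidence-estimation steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the relation names, seed pairs, and sentences are hypothetical, the patterns are plain surface strings between the two arguments, and the confidence score is a simple frequency ratio chosen for illustration (the abstract does not specify the pattern format or the estimation strategies). The merging of patterns into a graph is not covered.

```python
from collections import Counter, defaultdict

# Hypothetical seed relation instances: relation -> set of (arg1, arg2) pairs.
SEEDS = {
    "marriage": {("Ann Lee", "Bob Ray")},
    "sibling": {("Cy Fox", "Dee Orr")},
}

# Hypothetical text corpus, one sentence per entry.
SENTENCES = [
    "Ann Lee married Bob Ray in 2001 .",
    "Ann Lee met Bob Ray in Rome .",
    "Cy Fox met Dee Orr in Oslo .",
    "Cy Fox is the brother of Dee Orr .",
]


def extract_pattern(sentence, arg1, arg2):
    """Return the text between the two arguments with placeholders, or None."""
    i = sentence.find(arg1)
    j = sentence.find(arg2)
    # Both arguments must occur, with arg1 ending before arg2 starts.
    if i < 0 or j < 0 or i + len(arg1) > j:
        return None
    middle = sentence[i + len(arg1):j].strip()
    return f"ARG1 {middle} ARG2"


def collect_patterns(seeds, sentences):
    """Distant supervision: label every sentence that mentions a seed pair."""
    counts = defaultdict(Counter)
    for relation, pairs in seeds.items():
        for arg1, arg2 in pairs:
            for sentence in sentences:
                pattern = extract_pattern(sentence, arg1, arg2)
                if pattern is not None:
                    counts[relation][pattern] += 1
    return counts


def confidence(counts, relation, pattern):
    """Share of a pattern's occurrences that fall under the given relation."""
    total = sum(c[pattern] for c in counts.values())
    return counts[relation][pattern] / total if total else 0.0


counts = collect_patterns(SEEDS, SENTENCES)
print(confidence(counts, "marriage", "ARG1 married ARG2"))  # 1.0
print(confidence(counts, "marriage", "ARG1 met ARG2"))      # 0.5
```

In this toy corpus, "ARG1 met ARG2" is matched by seed pairs of both relations, so it receives a lower score than "ARG1 married ARG2". This is the kind of noisy labeling that a confidence estimate is meant to expose.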

We believe sar-graphs will prove to be useful linguistic resources for a wide variety of natural language processing tasks, in particular for information extraction and knowledge base population. We illustrate their usefulness with experiments in relation extraction and in computer-assisted language learning.

Keywords: Knowledge graphs, Language resources, Linguistic patterns, Relation extraction

Suggested Citation

Krause, Sebastian and Hennig, Leonhard and Moro, Andrea and Weissenborn, Dirk and Xu, Feiyu and Uszkoreit, Hans and Navigli, Roberto, Sar-Graphs: A Language Resource Connecting Linguistic Knowledge with Semantic Relations from Knowledge Graphs (2016). Available at SSRN: https://ssrn.com/abstract=3199232 or http://dx.doi.org/10.2139/ssrn.3199232

Sebastian Krause (Contact Author)

DFKI GmbH Berlin - Language Technology Lab (LT)

Alt-Moabit 91c
Berlin, D-10559
Germany

Leonhard Hennig

DFKI GmbH Berlin - Language Technology Lab (LT)

Alt-Moabit 91c
Berlin, D-10559
Germany

Andrea Moro

Sapienza University of Rome - Department of Computer Science

Via Salaria 113
Rome, 00198
Italy

Dirk Weissenborn

DFKI GmbH Berlin - Language Technology Lab (LT)

Alt-Moabit 91c
Berlin, D-10559
Germany

Feiyu Xu

DFKI GmbH Berlin - Language Technology Lab (LT)

Alt-Moabit 91c
Berlin, D-10559
Germany

Hans Uszkoreit

DFKI GmbH Berlin - Language Technology Lab (LT)

Alt-Moabit 91c
Berlin, D-10559
Germany

Roberto Navigli

Sapienza University of Rome - Department of Computer Science

Via Salaria 113
Rome, 00198
Italy
