CohEEL: Coherent and Efficient Named Entity Linking Through Random Walks

25 Pages. Posted: 24 Jun 2018. First Look: Accepted.

Toni Gruetze

University of Potsdam - Hasso Plattner Institute (HPI)

Gjergji Kasneci

University of Potsdam - Hasso Plattner Institute (HPI); Max Planck Society for the Advancement of the Sciences - Institute for Computer Science

Zhe Zuo

University of Potsdam - Hasso Plattner Institute (HPI)

Felix Naumann

University of Potsdam - Hasso Plattner Institute (HPI)

Abstract

In recent years, the ever-growing number of documents on the Web and in digital libraries has led to a considerable increase in valuable textual information about entities. Harvesting entity knowledge from these large text collections is a major challenge: it requires linking textual mentions within the documents to their real-world entities, a process called entity linking. Solutions to the entity linking problem have typically aimed at balancing the rate of linking correctness (precision) against the linking coverage rate (recall). Although entity links in texts can improve various Information Retrieval tasks, such as text summarization, document classification, or topic-based clustering, linking precision is the decisive factor for these tasks. For example, for topic-based clustering, a method that produces mostly correct links is more desirable than a high-coverage method that yields more, but less reliable, clusters. We propose an efficient linking method that uses a random walk strategy to combine a precision-oriented and a recall-oriented classifier in such a way that high precision is maintained while recall is raised as far as possible without affecting precision. An evaluation on three datasets with distinct characteristics demonstrates that our approach outperforms seminal work in the area and achieves higher precision and better time performance than the most closely related state-of-the-art methods.
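To make the precision/recall trade-off in the abstract concrete, the following minimal sketch computes both measures for two hypothetical linkers. It is not the CohEEL method: the function name `precision_recall`, the mention strings, and the entity identifiers are illustrative assumptions, and the toy data is invented purely for the example.

```python
def precision_recall(predicted, gold):
    """Compute linking precision and recall.

    Both arguments map a textual mention to an entity identifier.
    Precision: share of predicted links that are correct.
    Recall: share of gold links that were predicted correctly.
    """
    correct = sum(1 for mention, entity in predicted.items()
                  if gold.get(mention) == entity)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold standard for four mentions in one document.
gold = {
    "Paris": "Paris_(France)",
    "Jordan": "Michael_Jordan",
    "Apple": "Apple_Inc.",
    "Python": "Python_(language)",
}

# A cautious (precision-oriented) linker links only what it is sure about.
cautious = {"Paris": "Paris_(France)", "Apple": "Apple_Inc."}

# An eager (recall-oriented) linker links every mention, with one mistake.
eager = {
    "Paris": "Paris_(France)",
    "Jordan": "Jordan_(country)",
    "Apple": "Apple_Inc.",
    "Python": "Python_(language)",
}

print(precision_recall(cautious, gold))  # (1.0, 0.5)
print(precision_recall(eager, gold))     # (0.75, 0.75)
```

In this toy setting the cautious linker makes no wrong links but covers only half of the mentions, while the eager linker covers more at the cost of an incorrect link. The abstract's goal corresponds to raising the recall of the first kind of linker without introducing errors of the second kind.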

Keywords: Entity Linking, Named Entity Disambiguation, Random Walk, Machine Learning

Suggested Citation

Gruetze, Toni and Kasneci, Gjergji and Zuo, Zhe and Naumann, Felix, CohEEL: Coherent and Efficient Named Entity Linking Through Random Walks (March 2016). Journal of Web Semantics First Look. Available at SSRN: https://ssrn.com/abstract=3199229 or http://dx.doi.org/10.2139/ssrn.3199229

Toni Gruetze (Contact Author)

University of Potsdam - Hasso Plattner Institute (HPI)

Prof.-Dr.-Helmert-Str. 2-3,
Potsdam
Germany

Gjergji Kasneci

University of Potsdam - Hasso Plattner Institute (HPI)

Prof.-Dr.-Helmert-Str. 2-3,
Potsdam
Germany

Max Planck Society for the Advancement of the Sciences - Institute for Computer Science

Saarbruecken
Germany

Zhe Zuo

University of Potsdam - Hasso Plattner Institute (HPI)

Prof.-Dr.-Helmert-Str. 2-3,
Potsdam
Germany

Felix Naumann

University of Potsdam - Hasso Plattner Institute (HPI)

Prof.-Dr.-Helmert-Str. 2-3,
Potsdam
Germany
