CohEEL: Coherent and Efficient Named Entity Linking Through Random Walks

25 pages. Posted: 24 Jun 2018. Publication status: Accepted.

Toni Gruetze

University of Potsdam - Hasso Plattner Institute (HPI)

Gjergji Kasneci

University of Potsdam - Hasso Plattner Institute (HPI); Max Planck Society for the Advancement of the Sciences - Institute for Computer Science

Zhe Zuo

University of Potsdam - Hasso Plattner Institute (HPI)

Felix Naumann

University of Potsdam - Hasso Plattner Institute (HPI)

Abstract

In recent years, the ever-growing number of documents on the Web and in digital libraries has led to a considerable increase in valuable textual information about entities. Harvesting entity knowledge from these large text collections is a major challenge: it requires linking textual mentions within the documents to the real-world entities they denote, a process called entity linking. Solutions to the entity linking problem have typically aimed at balancing the rate of linking correctness (precision) against the linking coverage rate (recall). Although entity links in texts can improve various information retrieval tasks, such as text summarization, document classification, or topic-based clustering, linking precision is the decisive factor for these tasks. For example, for topic-based clustering, a method that produces mostly correct links is more desirable than a high-coverage method that yields more, but less certain, clusters. We propose an efficient linking method that uses a random walk strategy to combine a precision-oriented and a recall-oriented classifier in such a way that high precision is maintained while recall is raised as far as possible without affecting precision. An evaluation on three datasets with distinct characteristics demonstrates that our approach outperforms seminal work in the area and achieves higher precision and better runtime performance than the most closely related state-of-the-art methods.
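To make the two measures in the abstract concrete, the following minimal sketch computes precision and recall for a set of entity links. It is an illustration only and not part of CohEEL: the function name `precision_recall`, the mention-to-entity dictionaries, and the toy data are assumptions introduced here.

```python
def precision_recall(predicted, gold):
    """Compare predicted links with gold links.

    Both arguments are dicts mapping a mention to an entity id
    (a hypothetical representation chosen for this sketch).
    """
    # A predicted link is correct if the gold standard assigns
    # the same entity to the same mention.
    correct = sum(1 for mention, entity in predicted.items()
                  if gold.get(mention) == entity)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy gold standard with four mentions (invented for illustration).
gold = {
    "Paris": "Paris_(France)",
    "Jordan": "Michael_Jordan",
    "Apple": "Apple_Inc.",
    "Java": "Java_(language)",
}

# A cautious linker: few links, all of them correct.
cautious = {"Paris": "Paris_(France)", "Apple": "Apple_Inc."}

# An eager linker: links every mention, one of them wrongly.
eager = dict(gold, Jordan="Jordan_(country)")

print(precision_recall(cautious, gold))  # (1.0, 0.5)
print(precision_recall(eager, gold))     # (0.75, 0.75)
```

In this toy setting the cautious linker trades coverage for correctness, which is the kind of output the abstract argues is preferable for tasks such as topic-based clustering.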

Keywords: Entity Linking, Named Entity Disambiguation, Random Walk, Machine Learning

Suggested Citation

Gruetze, Toni and Kasneci, Gjergji and Zuo, Zhe and Naumann, Felix, CohEEL: Coherent and Efficient Named Entity Linking Through Random Walks (March 2016). Available at SSRN: https://ssrn.com/abstract=3199229 or http://dx.doi.org/10.2139/ssrn.3199229

Toni Gruetze (Contact Author)

University of Potsdam - Hasso Plattner Institute (HPI)

Prof.-Dr.-Helmert-Str. 2-3,
Potsdam
Germany

Gjergji Kasneci

University of Potsdam - Hasso Plattner Institute (HPI)

Prof.-Dr.-Helmert-Str. 2-3,
Potsdam
Germany

Max Planck Society for the Advancement of the Sciences - Institute for Computer Science

Saarbruecken
Germany

Zhe Zuo

University of Potsdam - Hasso Plattner Institute (HPI)

Prof.-Dr.-Helmert-Str. 2-3,
Potsdam
Germany

Felix Naumann

University of Potsdam - Hasso Plattner Institute (HPI)

Prof.-Dr.-Helmert-Str. 2-3,
Potsdam
Germany
