Mímir: An Open-Source Semantic Search Framework for Interactive Information Seeking and Discovery

24 pages. Posted: 9 Jul 2018. First Look: Accepted.

Valentin T. Tablan

University of Sheffield - Department of Computer Science

Kalina Bontcheva

University of Sheffield - Department of Computer Science

Hamish Cunningham

University of Sheffield - Department of Computer Science

Abstract

Semantic search is gradually establishing itself as the next-generation search paradigm, meeting a wider range of information needs than traditional full-text search. At the same time, however, expanding search to cover document structure and external, formal knowledge sources (e.g. Linked Open Data, or LOD, resources) remains challenging, especially with respect to efficiency, usability, and scalability.

This paper introduces Mímir, an open-source framework for integrated semantic search over text, document structure, linguistic annotations, and formal semantic knowledge. Mímir supports complex structural queries as well as basic keyword search.

Exploratory search and sense-making are supported through information visualisation interfaces, such as co-occurrence matrices and term clouds. There is also an interactive retrieval interface, where users can save, refine, and analyse the results of a semantic search over time. The better-studied, precision-oriented information-seeking searches are also well supported.

The generic and extensible nature of the Mímir platform is demonstrated through three different real-world applications, one of which required indexing and search over tens of millions of documents and fifty to a hundred times as many semantic annotations. Scaling up to over 150 million documents was also accomplished, via index federation and cloud-based deployment.

Keywords: Natural language processing, Semantic search, Scalable semantic search framework, Expressive semantic queries, Integrated semantic search

Suggested Citation

Tablan, Valentin T. and Bontcheva, Kalina and Cunningham, Hamish, Mímir: An Open-Source Semantic Search Framework for Interactive Information Seeking and Discovery (2015). Journal of Web Semantics First Look. Available at SSRN: https://ssrn.com/abstract=3199175 or http://dx.doi.org/10.2139/ssrn.3199175

Valentin T. Tablan (Contact Author)

University of Sheffield - Department of Computer Science

Regent Court, 211 Portobello
Sheffield
United Kingdom

Kalina Bontcheva

University of Sheffield - Department of Computer Science

Regent Court, 211 Portobello
Sheffield
United Kingdom

Hamish Cunningham

University of Sheffield - Department of Computer Science

Regent Court, 211 Portobello
Sheffield
United Kingdom