Information Retrieval with Root- and Rule-Based Terms
9 Pages Posted: 6 Apr 2020 Publication Status: Review Complete
Abstract
Root- and rule-based terms are structured representations of natural language phrases that can be generated automatically using a combination of statistical and symbolic methods. These terms represent and normalize syntactic information about natural language phrases, making them richer than basic n-grams while greatly reducing the vocabulary size. In this paper, we discuss the use of root- and rule-based terms for information retrieval. We represent documents and queries as collections of root- and rule-based terms and show that this improves conventional information retrieval methods such as Latent Semantic Indexing and Latent Dirichlet Allocation. Root- and rule-based terms improve on state-of-the-art evaluation scores for the TREC 2016 Clinical Decision Support track.
Keywords: artificial intelligence, data science, taxonomy, unsupervised learning, information retrieval, automatic terminology generation
