Towards Life Sciences Search with Blazing Speed

Posted: 6 Dec 2019. Last revised: 16 Dec 2019

Date Written: December 3, 2019

Abstract

EMBASE (Excerpta Medica dataBASE) is a biomedical and pharmacological bibliographic system consisting of more than 37 million records from over 8,500 journals. It enables comprehensive tracking and retrieval of drug information. We show and benchmark several approaches to improve search efficiency and reliability by using advanced search techniques in the context of a transition from a NoSQL and XML database to a full-fledged search engine. This includes an overview of the scalable infrastructure topology, data modeling and optimization of the indexing and search schema, writing the optimal queries for search, and techniques to support efficient faceting and exports of large amounts of records.

Keywords: Biomedical search engine, search efficiency, life sciences, bibliographic search, search engine migration, evaluation

Suggested Citation

Zhang, Junte and Aretakis, Vassilis and Denisov, Igor and Dmitriev, Alexander and Golubev, Yuriy and Grygorenko, Iaroslav and Ilmov, Vladimir and Panchenko, Roman and Petrova, Iveta and Prakash, Ashwani and Rybnytskyi, Maksym and Upadhyay, Sakshi and van Weert, Boudewijn, Towards Life Sciences Search with Blazing Speed (December 3, 2019). Proceedings of the 3rd Annual RELX Search Summit, Available at SSRN: https://ssrn.com/abstract=3497914

Vassilis Aretakis
Elsevier

Igor Denisov
Elsevier

Alexander Dmitriev
Independent

Yuriy Golubev
Independent

Iaroslav Grygorenko
Elsevier

Vladimir Ilmov
Elsevier

Roman Panchenko
Elsevier

Iveta Petrova
Elsevier

Ashwani Prakash
Elsevier

Maksym Rybnytskyi
Independent

Sakshi Upadhyay
Elsevier

Boudewijn Van Weert
Elsevier

