Finding Doppelgängers in Scopus: How to Build Scientists Control Groups Using Sosia
23 Pages Posted: 8 Dec 2020
Date Written: December 3, 2020
The construction of control groups of scientists is often a daunting effort. This paper presents sosia, an open-source Python-based software designed to efficiently query the Scopus database via its RESTful API. sosia searches for researchers whose publication profiles, up to a given year, are similar to that of a given researcher, based on all the main standard bibliometric indicators. Users can flexibly choose a set of parameters that restrict the search to narrower or wider boundaries upfront, and can obtain additional similarity indicators to select a subset of authors after the search. Advanced settings also make it possible to narrow the search to a list of affiliations and to minimize possible errors arising from ambiguous author profiles. A basic search can be set up in a few commands, and the average computation time ranges between 60 and 300 minutes. We discuss the functioning, characteristics, limitations and possible extensions of the software.
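The matching idea described in the abstract can be illustrated with a minimal, self-contained sketch. It does not use sosia's actual API or the Scopus database: the `Profile` class, the `find_matches` function, the margin parameters, and all data below are hypothetical and serve only to show how a search might be restricted to "more or less narrow boundaries" around a target researcher.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical publication profile of one author up to a given year."""
    author_id: str
    first_year: int    # year of first publication
    n_pubs: int        # number of publications
    n_cites: int       # number of citations received
    n_coauthors: int   # number of distinct coauthors


# Count indicators compared with a relative margin (assumed for illustration).
COUNT_FIELDS = ("n_pubs", "n_cites", "n_coauthors")


def within_relative(value, target, margin):
    """True if value lies within +/- margin (as a share) of target."""
    return abs(value - target) <= margin * target


def find_matches(target, candidates, year_margin=1, rel_margin=0.2):
    """Return IDs of candidates whose profile is close to the target's.

    year_margin is absolute (in years); rel_margin is a share of the
    target's value and applies to every field in COUNT_FIELDS.
    """
    matches = []
    for cand in candidates:
        if cand.author_id == target.author_id:
            continue  # never match the target with itself
        if abs(cand.first_year - target.first_year) > year_margin:
            continue
        if all(
            within_relative(getattr(cand, f), getattr(target, f), rel_margin)
            for f in COUNT_FIELDS
        ):
            matches.append(cand.author_id)
    return matches


# Made-up data, for illustration only.
target = Profile("T", 2005, 20, 300, 15)
candidates = [
    Profile("A", 2006, 22, 280, 14),  # close on every indicator
    Profile("B", 2001, 21, 310, 15),  # similar counts, started much earlier
    Profile("C", 2005, 40, 900, 30),  # same start year, far more output
]

narrow = find_matches(target, candidates)                # default margins
wide = find_matches(target, candidates, year_margin=5)   # looser start year
```

Widening a margin enlarges the candidate pool at the cost of less similar matches, which is the trade-off the search parameters are meant to control.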
Keywords: Statistical Software, Control Group, Diff-in-Diff, Scopus
JEL Classification: C00, A14