Finding Doppelgängers in Scopus: How to Build Scientists Control Groups Using Sosia

21 Pages. Posted: 8 Dec 2020. Last revised: 27 Nov 2024

Michael E. Rose

Max Planck Institute for Innovation and Competition

Stefano Baruffaldi

Polytechnic University of Milan - Department of Management, Economics and Industrial Engineering; Max Planck Institute for Innovation and Competition

Date Written: December 3, 2020

Abstract

Constructing control groups of scientists is often a daunting effort. This paper presents sosia, an open-source Python-based software designed to query the Scopus database efficiently via its RESTful API. sosia searches for researchers whose publication profiles, up to a given year, are similar to that of a given researcher, based on all main standard bibliometric indicators. The user can flexibly choose a set of parameters to restrict the search to narrower or wider boundaries upfront, and can obtain additional similarity indicators to select a subset of authors after the search. Advanced settings also allow the user to restrict the search to a list of affiliations and to minimize possible errors arising from ambiguous author profiles. A basic search can be set up in a few commands, and the average computation time ranges between 60 and 300 minutes. We discuss the functioning, characteristics, limitations and possible extensions of the software.
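
The matching idea described in the abstract can be illustrated with a minimal, self-contained sketch. This is not sosia's API or implementation, and it does not query Scopus: the `Profile` fields, the `find_matches` function, and the margin parameters are all hypothetical names chosen for illustration, and the profiles are assumed to be already computed up to the comparison year.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Profile:
    """Hypothetical publication profile of one author up to a given year."""
    author_id: str
    first_year: int    # year of first publication
    n_pubs: int        # number of publications
    n_cites: int       # number of citations received
    n_coauthors: int   # number of distinct coauthors


def within(value: int, target: int, margin: float) -> bool:
    """Return True if value lies within a relative margin around target."""
    return abs(value - target) <= margin * target


def find_matches(original: Profile, candidates: Iterable[Profile],
                 first_year_margin: int = 1,
                 rel_margin: float = 0.2) -> List[Profile]:
    """Keep candidates whose indicators are close to the original's.

    Assumption for this sketch: the first-publication year may differ by
    at most `first_year_margin` years, and every count may differ by at
    most `rel_margin` (as a share of the original's value).
    """
    matches = []
    for cand in candidates:
        if cand.author_id == original.author_id:
            continue  # an author is never their own control
        if abs(cand.first_year - original.first_year) > first_year_margin:
            continue
        if (within(cand.n_pubs, original.n_pubs, rel_margin)
                and within(cand.n_cites, original.n_cites, rel_margin)
                and within(cand.n_coauthors, original.n_coauthors, rel_margin)):
            matches.append(cand)
    return matches
```

Narrowing `first_year_margin` or `rel_margin` corresponds to restricting the search to tighter boundaries upfront; widening them yields a larger pool from which a subset can be selected afterwards.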

Keywords: Statistical Software, Control Group, Diff-in-Diff, Scopus

JEL Classification: C00, A14

Suggested Citation

Rose, Michael and Baruffaldi, Stefano, Finding Doppelgängers in Scopus: How to Build Scientists Control Groups Using Sosia (December 3, 2020). Max Planck Institute for Innovation & Competition Research Paper No. 20-20, Available at SSRN: https://ssrn.com/abstract=3742602 or http://dx.doi.org/10.2139/ssrn.3742602

Michael Rose (Contact Author)

Max Planck Institute for Innovation and Competition

Marstallplatz 1
Munich, Bayern 80539
Germany

HOME PAGE: https://www.ip.mpg.de/en/persons/rose-michael.html

Stefano Baruffaldi

Polytechnic University of Milan - Department of Management, Economics and Industrial Engineering

Via Lambruschini 4C - building 26/A
Milano, 20156
Italy

Max Planck Institute for Innovation and Competition

Marstallplatz 1
Munich, Bayern 80539
Germany
