36 Pages. Posted: 9 Dec 2010
Date Written: November 1, 2010
The German Socio-Economic Panel Study (SOEP) offers the rare opportunity to examine patterns of given names in a representative sample of more than 50,000 people born since 1900. This paper develops an exemplary picture of typical frequency distributions of given names and how they have changed over time. First, we discuss the advantages and limitations of the various databases that have been widely used to study the distribution of given names. Second, we address the problem that name distributions are typically characterized by a "Large Number of Rare Events" (LNRE) zone, that is, a long tail of names that each occur only a few times; here we focus on the difficulties this creates when comparing name distributions. Third, we apply measures of the concentration of distributions borrowed from other lines of research (economics and computational linguistics). Finally, we stress the problem of assessing the statistical significance of differences between name distributions that are estimated from samples.
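To make the notions of concentration and rare events concrete, the following is a minimal sketch and not the authors' method. The abstract does not say which measures the paper applies; the Herfindahl index (a standard concentration measure in economics) and the share of names occurring exactly once (a quantity commonly inspected in LNRE settings) are chosen here as assumptions. The function name `concentration_summary` and the sample names are hypothetical and are not SOEP data.

```python
from collections import Counter


def concentration_summary(names):
    """Summarize a list of given names (one entry per person).

    Assumed, illustrative measures only:
    - herfindahl: sum of squared relative frequencies (1/types .. 1)
    - hapax_share: fraction of distinct names that occur exactly once
    """
    counts = Counter(names)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("names must not be empty")

    shares = [c / total for c in counts.values()]
    herfindahl = sum(s * s for s in shares)
    hapaxes = sum(1 for c in counts.values() if c == 1)

    return {
        "tokens": total,            # number of persons
        "types": len(counts),       # number of distinct names
        "herfindahl": herfindahl,
        "hapax_share": hapaxes / len(counts),
    }


# Hypothetical toy sample, not taken from the SOEP.
sample = ["Anna", "Anna", "Maria", "Karl", "Karl", "Karl", "Ursula", "Heinz"]
summary = concentration_summary(sample)
print(summary)
```

Both quantities depend on sample size: a larger sample tends to reveal more rare names, so values computed for cohorts of different sizes are not directly comparable without further adjustment. This is the kind of comparison difficulty the abstract refers to.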
Keywords: Given names, large number of rare events (LNRE), concentration of distributions, SOEP
JEL Classification: C49, C83, Y8
Suggested Citation:
Huschka, Denis and Wagner, Gert G., Statistical Problems and Solutions in Onomastic Research - Exemplified by a Comparison of Given Name Distributions in Germany Throughout the 20th Century (November 1, 2010). SOEPpaper No. 332. Available at SSRN: https://ssrn.com/abstract=1722527 or http://dx.doi.org/10.2139/ssrn.1722527