Statistical Problems and Solutions in Onomastic Research - Exemplified by a Comparison of Given Name Distributions in Germany Throughout the 20th Century
German Data Forum
Gert G. Wagner
German Institute for Economic Research (DIW Berlin); Berlin University of Technology; German Socio-Economic Panel Study (SOEP)
November 1, 2010
SOEPpaper No. 332
The German Socio-Economic Panel Study (SOEP) offers the rare opportunity to examine patterns of given names in a representative sample of more than 50,000 people born since 1900. This article develops an exemplary picture of typical frequency distributions of given names and their development over time. First, we discuss the advantages and limitations of various databases that have been widely used to study the distribution of given names. Second, we address the problem that name distributions are typically characterized by a "Large Number of Rare Events" (LNRE) zone, focusing on the difficulties this creates for comparing name distributions. Third, we apply measures of the concentration of distributions drawn from other lines of research (economics and computational linguistics). Finally, we stress the problem of assessing the statistical significance of differences between name distributions that are based on samples.
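The abstract does not say which concentration measures the paper uses. As a rough illustration of the kind of measure it alludes to, the sketch below computes three common candidates for a list of given names: the Herfindahl-Hirschman index (from economics), the type-token ratio, and the share of names occurring exactly once (both from computational linguistics). The function name `name_concentration` and the choice of these three measures are assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def name_concentration(names):
    """Return simple concentration measures for a list of given names.

    Hypothetical helper for illustration only; the measures chosen here
    are assumptions, not necessarily those applied in the paper.
    """
    counts = Counter(names)          # frequency of each distinct name (type)
    n_tokens = sum(counts.values())  # number of persons in the sample
    if n_tokens == 0:
        raise ValueError("names must not be empty")
    n_types = len(counts)            # number of distinct names

    # Herfindahl-Hirschman index: sum of squared relative frequencies.
    # It equals 1.0 if everyone shares one name and approaches 0 when
    # names are spread evenly over many types.
    hhi = sum((c / n_tokens) ** 2 for c in counts.values())

    # Type-token ratio: distinct names per person.
    type_token_ratio = n_types / n_tokens

    # Share of names that occur exactly once (hapax legomena); a large
    # value indicates a pronounced LNRE zone.
    hapax_share = sum(1 for c in counts.values() if c == 1) / n_types

    return {
        "hhi": hhi,
        "type_token_ratio": type_token_ratio,
        "hapax_share": hapax_share,
    }
```

Calling `name_concentration(["Anna", "Anna", "Peter"])` returns a dictionary with the three measures. In LNRE distributions the number of distinct names keeps growing with sample size, so values such as the type-token ratio are best compared only between samples of similar size.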
Number of Pages in PDF File: 36
Keywords: Given names, large number of rare events (LNRE), concentration of distributions, SOEP
JEL Classification: C49, C83, Y8
Date posted: December 9, 2010