Power in Text: Grammar and Language in Comparative Delegation
34 Pages. Posted: 8 Aug 2013. Last revised: 30 Aug 2013.
Date Written: August 5, 2013
Throughout the political science and legal literatures, scholars often use statutes and national constitutions as key sources of data. However, most of these analyses rely on labor-intensive coding schemes, offering precise results but requiring long hours from trained researchers. Existing quantitative measures (e.g., word counts of particular documents) have produced insightful results, but provide imprecise measures for variables of interest. Computational linguistics techniques, including natural language processing (NLP), provide an alternative approach; because legal documents are written so systematically, these texts lend themselves well to automated analysis, allowing computers to extract information in a repeatable fashion. By combining NLP tools with existing coding schemes and close readings of individual documents, scholars can identify and measure key traits of particular texts, creating powerful and innovative measurement schemes.
As a sample application of these techniques, I use NLP programming packages to develop a new measure for the level of executive discretion offered by a particular legal document. I conceptualize “discretion” as the average number of other players involved at each decision point in a statute or national constitution. I then use subject-object grammatical relationships to develop an alternative metric, which I implement in a Python script. Finally, I conduct validity tests on this measure, as well as on a competing approach from the literature. For my validity tests, I use data obtained from Elkins, Ginsburg, and Melton’s Comparative Constitutions Project (CCP), which hand-codes national constitutions based on a wide array of attributes. Using CCP data, I generate summary “discretion” statistics for a sample of post-1945 constitutions, which I treat as the “true values” for each document. I then compare the results for each measure to the CCP data. Generally speaking, I find that the NLP-based measure is more strongly correlated with these “true values” than the competing approach, highlighting the potential power of the tool.
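To make the idea concrete, the following is a minimal illustrative sketch, not the paper's actual script. It assumes that a grammatical parser has already reduced each clause to a subject, a verb, and the other actors named in it; the `Clause` type, the function names, and the toy clauses are all hypothetical. It averages the number of other players across the clauses in which the executive is the subject, and includes a plain Pearson correlation of the kind that could be used to compare automated scores with hand-coded values.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Clause:
    """Hypothetical pre-parsed clause: subject, verb, and other actors named."""
    subject: str
    verb: str
    other_actors: tuple = ()


def mean_other_players(clauses, executive="president"):
    """Average number of non-executive actors per clause whose subject is the executive.

    Each such clause is treated as one "decision point" (an assumption of this sketch).
    """
    points = [c for c in clauses if c.subject.lower() == executive]
    if not points:
        return 0.0
    counts = [len({a.lower() for a in c.other_actors} - {executive}) for c in points]
    return sum(counts) / len(counts)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Toy clauses, invented purely for illustration.
toy = [
    Clause("President", "appoints", ("Senate",)),
    Clause("President", "vetoes"),
    Clause("President", "declares", ("Congress", "Council")),
    Clause("Congress", "enacts", ("President",)),  # not an executive decision point
]
score = mean_other_players(toy)  # (1 + 0 + 2) / 3 = 1.0
```

Under this reading, a higher average means more actors share in each executive decision. In practice the clause records would come from a dependency parser rather than being written by hand, and the correlation would be computed over one score per constitution.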
Keywords: Power, delegation, natural language processing, grammatical parsers