A Hierarchical Dictionary Method for Analyzing Legal and Political Texts Via Nested Ngrams

12 pages. Posted: 17 Aug 2023

Kevin L. Cope

University of Virginia School of Law

Li Zhang

University of Virginia School of Law

Date Written: August 15, 2023

Abstract

We develop a lexicon-based method for identifying and assigning values to complex phrases and expressions in large legal or political corpora. The method allows a researcher to identify ngrams nested within other ngrams and to assign a value only to the most informative one, that is, the one at the top of the hierarchy. By reading expressions of indefinite length instead of single words, the method can distinguish differences in meaning that result from small textual variations, including word order (unlike word-frequency analysis), and can assess the value of terms and phrases on multiple dimensions with reasonably high precision. It is especially well suited to measurement tasks in which eliminating unit-level measurement errors is critical. Unlike purely manual coding, it provides high degrees of consistency and replicability. The method therefore offers several advantages over machine learning, topic models, unigram-based dictionaries, manual coding methods, and other approaches. We conclude by showing how this analysis might be applied to study large sets of documents in law and social science, including IGO/NGO country reports, corporate disclosures, treaties, court opinions, contracts, and expert evaluations.
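
As a rough illustration of the nesting idea described in the abstract, the following Python sketch finds every lexicon ngram in a text and keeps only those not contained inside a longer matched ngram. It is not the authors' implementation: the lexicon entries, the scores, the function names, and the reading of "top of the hierarchy" as "the longest enclosing match" are all assumptions made for this example.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (not from the paper): token tuples mapped to
# illustrative scores on a single dimension.
LEXICON: Dict[Tuple[str, ...], float] = {
    ("torture",): -1.0,
    ("prohibits", "torture"): 1.0,
    ("no", "longer", "prohibits", "torture"): -2.0,
}

# A match is (start index, end index exclusive, ngram, score).
Match = Tuple[int, int, Tuple[str, ...], float]


def find_matches(tokens: List[str], lexicon: Dict[Tuple[str, ...], float]) -> List[Match]:
    """Return every lexicon ngram that occurs in the token sequence."""
    max_len = max(len(key) for key in lexicon)
    matches: List[Match] = []
    for start in range(len(tokens)):
        for n in range(1, max_len + 1):
            end = start + n
            if end > len(tokens):
                break
            ngram = tuple(tokens[start:end])
            if ngram in lexicon:
                matches.append((start, end, ngram, lexicon[ngram]))
    return matches


def top_level_matches(matches: List[Match]) -> List[Match]:
    """Drop any match whose span lies inside a strictly longer matched span.

    Assumption: the longest enclosing ngram is treated as the top of the hierarchy.
    """
    kept: List[Match] = []
    for m in matches:
        nested = any(
            o[0] <= m[0] and m[1] <= o[1] and (o[1] - o[0]) > (m[1] - m[0])
            for o in matches
        )
        if not nested:
            kept.append(m)
    return kept


def score_text(text: str, lexicon: Dict[Tuple[str, ...], float] = LEXICON) -> List[Match]:
    """Tokenize on lowercase letter runs and return only top-level matches."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return top_level_matches(find_matches(tokens, lexicon))
```

Under these assumptions, `score_text("The law no longer prohibits torture.")` keeps only the four-word entry, and `score_text("The law prohibits torture.")` keeps only the two-word entry; in both cases the nested unigram "torture" is found but receives no value. Because matching is on ordered token sequences, reversing the words ("torture prohibits") matches only the unigram, which is the sensitivity to word order that a word-frequency count would miss.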

Suggested Citation

Cope, Kevin L. and Zhang, Li, A Hierarchical Dictionary Method for Analyzing Legal and Political Texts Via Nested Ngrams (August 15, 2023). Virginia Law and Economics Research Paper No. 2023-16, Available at SSRN: https://ssrn.com/abstract=4541799 or http://dx.doi.org/10.2139/ssrn.4541799

Kevin L. Cope (Contact Author)

University of Virginia School of Law

580 Massie Road
WB345
Charlottesville, VA 22903
United States

Li Zhang

University of Virginia School of Law

580 Massie Road
Charlottesville, VA 22903
United States
