Recounting the Courts? Applying Automated Content Analysis to Enhance Empirical Legal Research

25 Pages. Posted: 6 Jul 2006

Michael C. Evans

University of Maryland - Department of Government & Politics

Wayne V. McIntosh

University of Maryland - Department of Government & Politics

Jimmy Lin

UMD iSchool

Cynthia L. Cates

Towson University

Date Written: August 28, 2006

Abstract

Political scientists in general and public law specialists in particular have only recently begun to exploit text classification using machine learning techniques to enable the reliable and detailed content analysis of political/legal documents on a large scale. This paper provides an overview and assessment of this methodology. We describe the basics of text classification, suggest applications of this technique to enhance empirical legal research (and political science more broadly), and report results of experiments designed to test the strengths and weaknesses of alternative text classification models for classifying the positions and interpreting the content of briefs submitted to the U.S. Supreme Court. We find that the Wordscores method (introduced by Laver, Benoit, et al. 2003) and various models using a Naïve Bayes classifier perform well at accurately classifying the ideological direction of amicus curiae briefs submitted in the Bakke (1978) and Bollinger (2003) affirmative action cases. We also find that automated feature selection techniques can enable the detection of disparate issue conceptualizations by opposing sides in a single case, and facilitate analysis of relative linguistic "reliance" and "dominance" over time. We conclude by discussing the implications of our results and pointing to areas where technical and infrastructural improvements are most needed.
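As a rough illustration of the kind of classifier the abstract refers to, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing. It is not the authors' implementation. The function names (`tokenize`, `train`, `classify`), the labels `"pro"` and `"anti"`, and the training snippets are all assumptions invented for this example; the snippets are not quotations from any brief.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplest possible tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(docs):
    """Build counts from (text, label) pairs; returns a model tuple."""
    label_counts = Counter()   # number of documents per label
    word_counts = {}           # label -> Counter of token frequencies
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior score."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        # Log prior from document frequencies.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothed log likelihood of the token.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, invented for illustration only.
TRAINING = [
    ("diversity is a compelling educational interest", "pro"),
    ("race conscious admissions promote diversity", "pro"),
    ("racial quotas violate equal protection", "anti"),
    ("preferences are discrimination and violate equal protection", "anti"),
]

model = train(TRAINING)
print(classify(model, "diversity is compelling"))
print(classify(model, "quotas violate equal protection"))
```

A realistic application would train on full brief texts with a proper tokenizer and held-out evaluation; the sketch only shows how word frequencies per class drive the predicted direction.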

Keywords: computational linguistics, machine learning, content analysis, amicus curiae, legal rhetoric

Suggested Citation

Evans, Michael C. and McIntosh, Wayne V. and Lin, Jimmy and Cates, Cynthia L., Recounting the Courts? Applying Automated Content Analysis to Enhance Empirical Legal Research (August 28, 2006). Available at SSRN: https://ssrn.com/abstract=914126 or http://dx.doi.org/10.2139/ssrn.914126

Michael C. Evans (Contact Author)

University of Maryland - Department of Government & Politics

3140 Tydings Hall
College Park, MD 20742
United States

Wayne V. McIntosh

University of Maryland - Department of Government & Politics

3140 Tydings Hall
College Park, MD 20742
United States

Jimmy Lin

UMD iSchool

4th Floor, 4130 Campus Dr
College Park, MD 20742
United States

Cynthia L. Cates

Towson University

8000 York Road, ST 100A
Towson, MD 21204
