Split Decisions: Practical Machine Learning for Empirical Legal Scholarship

60 Pages. Posted: 11 Dec 2020. Last revised: 9 Sep 2021

James Ming Chen

Michigan State University - College of Law

Date Written: November 16, 2020

Abstract

Multivariable regression may be the most prevalent and useful task in social science. Empirical legal studies rely heavily on the ordinary least squares method. Conventional regression methods have attained credibility in court, but by no means do they dictate legal outcomes. Using the iconic Boston housing study as a source of price data, this Article introduces machine-learning regression methods. Although decision trees and forest ensembles lack the overt interpretability of linear regression, these methods reduce the opacity of black-box techniques by scoring the relative importance of dataset features. This Article will also address the theoretical tradeoff between bias and variance, as well as the importance of training, cross-validation, and reserving a holdout dataset for testing.
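The workflow the abstract names (training, cross-validation, and a reserved holdout set) can be illustrated with a minimal sketch. It is not the Article's code and does not use the Boston housing data: the data are synthetic, a depth-one regression tree (a "stump") stands in for the decision trees the Article discusses, and every function name here (make_data, fit_stump, fit_ols, cross_val_mse) is hypothetical. The sketch assumes a single predictor and a step-shaped relationship, chosen so that a tree split has something to find that a straight line misses.

```python
import random

def make_data(n=200, seed=0):
    """Synthetic (x, y) pairs: a step-shaped signal plus Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        y = (5.0 if x < 4.0 else 12.0) + rng.gauss(0.0, 1.0)
        data.append((x, y))
    return data

def mean(values):
    return sum(values) / len(values)

def fit_ols(train):
    """Ordinary least squares with one predictor: y = intercept + slope * x."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_stump(train):
    """Depth-one regression tree: choose the split on x that minimizes squared error."""
    pts = sorted(train)
    best = None
    for i in range(1, len(pts)):
        left = [y for _, y in pts[:i]]
        right = [y for _, y in pts[i:]]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (pts[i - 1][0] + pts[i][0]) / 2.0
            best = (sse, threshold, ml, mr)
    _, threshold, ml, mr = best
    return lambda x: ml if x < threshold else mr

def mse(model, rows):
    """Mean squared error of a fitted model on a list of (x, y) rows."""
    return mean([(model(x) - y) ** 2 for x, y in rows])

def cross_val_mse(fit, rows, k=5):
    """Average validation error over k folds; each fold is held out once."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(mse(fit(train), folds[i]))
    return mean(scores)

# Reserve a holdout set first; it plays no part in choosing a model.
data = make_data()
random.Random(1).shuffle(data)
holdout, development = data[:40], data[40:]

# Compare candidate methods by cross-validation on the development data only.
fitters = {"ols": fit_ols, "stump": fit_stump}
cv_scores = {name: cross_val_mse(fit, development) for name, fit in fitters.items()}
best_name = min(cv_scores, key=cv_scores.get)

# Refit the chosen method on all development data, then score it once on the holdout.
holdout_mse = mse(fitters[best_name](development), holdout)
print(cv_scores, best_name, holdout_mse)
```

The ordering matters more than the particular models: the holdout rows are set aside before any comparison, cross-validation selects between the linear fit and the stump, and the holdout error is computed only once, for the selected method. On real data the comparison could go either way; the step-shaped signal here was assumed precisely so that the two methods would differ.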

Suggested Citation

Chen, James Ming, Split Decisions: Practical Machine Learning for Empirical Legal Scholarship (November 16, 2020). 2020 Michigan State Law Review 1301, Available at SSRN: https://ssrn.com/abstract=3731307 or http://dx.doi.org/10.2139/ssrn.3731307

James Ming Chen (Contact Author)

Michigan State University - College of Law

318 Law College Building
East Lansing, MI 48824-1300
United States
