Sense and Similarity: Automating Legal Text Comparison
Ryan Whalen (ed.), Computational Legal Studies: The Promise and Challenge of Data-driven Research, Edward Elgar (Forthcoming, 2020).
16 pages. Posted: 12 Mar 2019. Last revised: 17 Nov 2019.
Date Written: February 20, 2019
Abstract
Lawyers routinely compare legal texts. Advances in technology now help automate such comparisons by treating legal texts as data and quantifying their similarity. This makes legal text comparison scalable and enables lawyers to investigate similarity patterns in large legal text corpora. In this paper, I outline the technological underpinnings of automated legal text comparison and discuss the practical limitations, implications and applications arising from its increasing use. Among other things, automated text comparison improves legal search and recommender systems, makes it easier to track legal processes, and can reveal patterns in legal corpora that are impossible to detect through traditional means. Automated text comparison, however, also has limitations. It measures similarity of text rather than similarity of meaning and cannot (yet) differentiate between legally significant and legally insignificant textual differences. Automated text comparison is therefore best seen as a complement to, rather than a substitute for, manual or concept-based comparison of legal texts.
Keywords: automated text comparisons, law, computational legal studies, text-as-data
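To make the idea of "quantifying similarity" concrete, the following is a minimal sketch of one common approach, Jaccard similarity over word bigrams. It is an illustration only and not necessarily the method discussed in the paper; the function names and the three sample clauses are hypothetical and were invented for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def shingles(tokens, n=2):
    """Return the set of overlapping n-word sequences (shingles)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_similarity(text_a, text_b, n=2):
    """Share of n-word shingles the two texts have in common (0.0 to 1.0)."""
    a = shingles(tokenize(text_a), n)
    b = shingles(tokenize(text_b), n)
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical clauses, invented for illustration only.
clause_a = "The investor shall be entitled to compensation for expropriation."
clause_b = "The investor shall not be entitled to compensation for expropriation."
clause_c = "Each party must protect the environment within its territory."

score_negated = jaccard_similarity(clause_a, clause_b)
score_unrelated = jaccard_similarity(clause_a, clause_c)

print(f"a vs. b (negated clause):   {score_negated:.2f}")
print(f"a vs. c (unrelated clause): {score_unrelated:.2f}")
```

In this sketch the negated clause still scores far higher than the unrelated one, even though inserting "not" reverses the legal effect. This is the limitation the abstract describes: the measure captures similarity of text, not similarity of meaning.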