LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

17 Pages. Posted: 5 Oct 2021. Last revised: 22 Mar 2023

Ilias Chalkidis

University of Copenhagen

Abhik Jana

University of Hamburg; Language Technology Group, Department of Informatics, Universität Hamburg

Dirk Hartung

Bucerius Law School - Center for Legal Technology and Data Science; Stanford University - Stanford Codex Center

Michael James Bommarito

273 Ventures; Licensio, LLC; Stanford Center for Legal Informatics; Michigan State College of Law; Bommarito Consulting, LLC

Ion Androutsopoulos

Athens University of Economics and Business

Daniel Martin Katz

Illinois Tech - Chicago Kent College of Law; Bucerius Center for Legal Technology & Data Science; Stanford CodeX - The Center for Legal Informatics; 273 Ventures

Nikolaos Aletras

University of Sheffield

Date Written: October 13, 2021

Abstract

Law, interpretations of law, legal arguments, agreements, etc. are typically expressed in writing, leading to the production of vast corpora of legal text. Their analysis, which is at the center of legal practice, becomes increasingly elaborate as these collections grow in size. Natural language understanding (NLU) technologies can be a valuable tool to support legal practitioners in these endeavors. Their usefulness, however, largely depends on whether current state-of-the-art models can generalize across various tasks in the legal domain. To answer this currently open question, we introduce the Legal General Language Understanding Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks in a standardized way. We also provide an evaluation and analysis of several generic and legal-oriented models demonstrating that the latter consistently offer performance improvements across multiple tasks.

Keywords: natural language processing, legal data, legal tech, natural language understanding, evaluation, machine learning, artificial intelligence, artificial intelligence and law

JEL Classification: C88, C80, K00, K40, M49

Suggested Citation

Chalkidis, Ilias and Jana, Abhik and Hartung, Dirk and Bommarito, Michael James and Androutsopoulos, Ion and Katz, Daniel Martin and Aletras, Nikolaos, LexGLUE: A Benchmark Dataset for Legal Language Understanding in English (October 13, 2021). Available at SSRN: https://ssrn.com/abstract=3936759 or http://dx.doi.org/10.2139/ssrn.3936759

Ilias Chalkidis

University of Copenhagen

Universitetsparken 1
Copenhagen, København DK-2100
Denmark

Abhik Jana

University of Hamburg

Hamburg

Language Technology Group, Department of Informatics, Universität Hamburg

Allende-Platz 1
Hamburg, 20146
Germany

Dirk Hartung

Bucerius Law School - Center for Legal Technology and Data Science

Jungiusstr. 6
Hamburg, 20355
Germany

Stanford University - Stanford Codex Center

559 Nathan Abbott Way
Stanford, CA 94305-8610
United States

HOME PAGE: https://law.stanford.edu/directory/dirk-hartung/

Michael James Bommarito

Licensio, LLC

Okemos, MI 48864
United States

Stanford Center for Legal Informatics

559 Nathan Abbott Way
Stanford, CA 94305-8610
United States

Michigan State College of Law

318 Law College Building
East Lansing, MI 48824-1300
United States

Bommarito Consulting, LLC

MI 48098
United States

Ion Androutsopoulos

Athens University of Economics and Business

76 Patission Street
Athens, 104 34
Greece

Daniel Martin Katz (Contact Author)

Illinois Tech - Chicago Kent College of Law

565 W. Adams St.
Chicago, IL 60661-3691
United States

HOME PAGE: http://www.danielmartinkatz.com/

Bucerius Center for Legal Technology & Data Science

Jungiusstr. 6
Hamburg, 20355
Germany

HOME PAGE: http://legaltechcenter.de/

Stanford CodeX - The Center for Legal Informatics

559 Nathan Abbott Way
Stanford, CA 94305-8610
United States

HOME PAGE: http://law.stanford.edu/directory/daniel-katz/

273 Ventures

HOME PAGE: http://273ventures.com

Nikolaos Aletras

University of Sheffield

17 Mappin Street
Sheffield, Sheffield S1 4DT
United Kingdom

Paper statistics

Downloads: 506
Abstract Views: 2,120
Rank: 101,990