LexNLP: Natural Language Processing and Information Extraction For Legal and Regulatory Texts

7 Pages. Posted: 21 Jun 2018

Michael James Bommarito

Bommarito Consulting, LLC; Licensio, LLC; Stanford Center for Legal Informatics; Michigan State College of Law

Daniel Martin Katz

Illinois Tech - Chicago Kent College of Law; Stanford CodeX - The Center for Legal Informatics; LexPredict

Eric Detterman

LexPredict, LLC

Date Written: June 6, 2018

Abstract

LexNLP is an open source Python package focused on natural language processing and machine learning for legal and regulatory text. The package includes functionality to (i) segment documents, (ii) identify key text such as titles and section headings, (iii) extract over eighteen types of structured information like distances and dates, (iv) extract named entities such as companies and geopolitical entities, (v) transform text into features for model training, and (vi) build unsupervised and supervised models such as word embedding or tagging models. LexNLP includes pre-trained models based on thousands of unit tests drawn from real documents available from the SEC EDGAR database as well as various judicial and regulatory proceedings. LexNLP is designed for use in both academic research and industrial applications.
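As a rough illustration of items (i) and (iii) above, the following minimal sketch segments text into sentences and extracts distances using only the Python standard library. It does not use LexNLP itself: the helper names `split_sentences` and `extract_distances`, the regular expressions, and the sample text are assumptions made for this example, and the package's own extractors may work quite differently.

```python
import re
from typing import List, Tuple

# Hypothetical helpers for illustration only; these are NOT LexNLP functions.
# A number (integer or decimal) followed by a small, assumed set of distance units.
DISTANCE_RE = re.compile(
    r"(\d+(?:\.\d+)?)\s*(miles?|kilometers?|feet|foot|meters?)\b",
    re.IGNORECASE,
)


def split_sentences(text: str) -> List[str]:
    """Naive segmentation: split after '.', '!' or '?' when followed by
    whitespace and a capital letter. Decimals such as '2.5' are not split."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    return [p for p in parts if p]


def extract_distances(text: str) -> List[Tuple[float, str]]:
    """Return (amount, unit) pairs for each distance mention found in text."""
    return [
        (float(m.group(1)), m.group(2).lower())
        for m in DISTANCE_RE.finditer(text)
    ]


# Sample text invented for this example.
sample = (
    "The Premises are located 2.5 miles from the Site. "
    "Tenant shall maintain a fence of 10 feet."
)

sentences = split_sentences(sample)
distances = extract_distances(sample)

for sentence in sentences:
    print(sentence, "->", extract_distances(sentence))
```

A regex-only approach like this is brittle on real contracts (abbreviations, spelled-out numbers, unusual units), which is part of the motivation for a dedicated, tested package.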

Keywords: natural language processing, legal, regulatory, machine learning, segmentation, extraction, open source, Python

JEL Classification: C19, C53, C55, C38, C45, C63, C88

Suggested Citation

Bommarito, Michael James and Katz, Daniel Martin and Detterman, Eric, LexNLP: Natural Language Processing and Information Extraction For Legal and Regulatory Texts (June 6, 2018). Available at SSRN: https://ssrn.com/abstract=3192101 or http://dx.doi.org/10.2139/ssrn.3192101

Michael James Bommarito

Bommarito Consulting, LLC

MI 48098
United States

HOME PAGE: http://bommaritollc.com

Licensio, LLC

Okemos, MI 48864
United States

Stanford Center for Legal Informatics

559 Nathan Abbott Way
Stanford, CA 94305-8610
United States

Michigan State College of Law

318 Law College Building
East Lansing, MI 48824-1300
United States

Daniel Martin Katz (Contact Author)

Illinois Tech - Chicago Kent College of Law

565 W. Adams St.
Chicago, IL 60661-3691
United States

HOME PAGE: http://www.danielmartinkatz.com/

Stanford CodeX - The Center for Legal Informatics

559 Nathan Abbott Way
Stanford, CA 94305-8610
United States

HOME PAGE: http://law.stanford.edu/directory/daniel-katz/

LexPredict

Chicago, IL
United States

HOME PAGE: http://www.lexpredict.com/

Eric Detterman

LexPredict, LLC

MI
United States
