Genesys: A Deep Learning Approach for High-recall Multi-label Legal Topical Classification

Punuru, Janardhana; Ramesh, Nikhil; Shewhart, Mark; Sharma, Sanjay

Proceedings of the 4th Annual RELX Search Summit

Posted: 29 Jan 2021

Sanjay Sharma

LexisNexis

Date Written: November 6, 2020

Abstract

Caselaw documents on LexisNexis contain headnotes that identify the key points of law discussed in each case. Currently, legal editors manually assign topics from the US legal taxonomy to each headnote. The assigned topics improve search results, filtering, document recommendations, and many other applications, but manual assignment is expensive and time-consuming. To address this issue, we present a novel method for automating the assignment of legal topics to caselaw headnotes. The system uses a deep learning-based classification approach to predict multiple legal topics for each headnote. The current distribution of legal topics across headnotes is highly imbalanced: a few topics account for most of the labels, while a large number of topics are rarely applied. To improve coverage of rare topics, we have built a separate model for each topic. These models are built with GloVe embeddings and convolutional neural networks. Given the vast number of topics in the legal taxonomy, we have developed methods for sharing embeddings across the models and for compressing the embeddings so that all models can be applied efficiently at inference time. We believe that our approach offers two benefits: (a) enabling classification of any content, including documents, RFCs, and passages in other content types, and (b) equipping LexisNexis to fully leverage legal topics to significantly improve search results on Lexis Advance.
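
To make the one-model-per-topic design concrete, the following is a minimal sketch in plain Python, not the implementation described in the paper. It assumes a single word-vector table (randomly initialized here as a stand-in for pretrained GloVe vectors) that every per-topic binary classifier references rather than copies, with each classifier applying a small one-dimensional convolution, max-over-time pooling, and a sigmoid output. All names and sizes (`TopicModel`, `predict_topics`, `EMBED_DIM`, the 0.3 threshold) are hypothetical, the weights are untrained, and embedding compression is not shown.

```python
import math
import random

EMBED_DIM = 8      # hypothetical; pretrained GloVe vectors are typically larger
KERNEL_WIDTH = 2   # hypothetical convolution window, in tokens
NUM_FILTERS = 4    # hypothetical number of filters per topic model


def make_shared_embeddings(vocab, seed=0):
    """Random stand-in for pretrained word vectors: one table for all topic models."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in vocab}


class TopicModel:
    """Binary classifier for one topic: 1-D convolution, max pooling, sigmoid."""

    def __init__(self, topic, embeddings, seed):
        rng = random.Random(seed)
        self.topic = topic
        self.embeddings = embeddings  # shared reference, not a per-model copy
        size = KERNEL_WIDTH * EMBED_DIM
        self.filters = [[rng.uniform(-0.5, 0.5) for _ in range(size)]
                        for _ in range(NUM_FILTERS)]
        self.out_weights = [rng.uniform(-0.5, 0.5) for _ in range(NUM_FILTERS)]
        self.out_bias = 0.0

    def score(self, tokens):
        zero = [0.0] * EMBED_DIM
        # Unknown words map to the zero vector.
        vectors = [self.embeddings.get(t, zero) for t in tokens]
        # Pad short inputs so that at least one full window exists.
        while len(vectors) < KERNEL_WIDTH:
            vectors.append(zero)
        pooled = []
        for f in self.filters:
            best = 0.0  # ReLU floor: negative activations are clipped to zero
            for i in range(len(vectors) - KERNEL_WIDTH + 1):
                window = [x for v in vectors[i:i + KERNEL_WIDTH] for x in v]
                best = max(best, sum(w * x for w, x in zip(f, window)))
            pooled.append(best)
        logit = sum(w * p for w, p in zip(self.out_weights, pooled)) + self.out_bias
        return 1.0 / (1.0 + math.exp(-logit))


def predict_topics(models, tokens, threshold=0.5):
    """Multi-label prediction: every topic whose model scores at or above threshold."""
    return [m.topic for m in models if m.score(tokens) >= threshold]


# Usage with untrained weights; the topic names and headnote tokens are made up.
vocab = ["court", "contract", "breach", "damages", "negligence"]
shared = make_shared_embeddings(vocab)
models = [TopicModel(name, shared, seed=i)
          for i, name in enumerate(["Contracts", "Torts"])]
print(predict_topics(models, "court contract breach damages".split(), threshold=0.3))
```

Because each topic has its own binary decision, a headnote can receive any number of topics, and lowering `threshold` returns more topics per headnote, which generally favors recall at the cost of precision. Holding one embedding table in memory for all models is one plausible reading of "sharing embeddings"; the paper's actual mechanism may differ.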

Suggested Citation

Punuru, Janardhana and Ramesh, Nikhil and Shewhart, Mark and Sharma, Sanjay, Genesys: A Deep Learning Approach for High-recall Multi-label Legal Topical Classification (November 6, 2020). Proceedings of the 4th Annual RELX Search Summit, Available at SSRN: https://ssrn.com/abstract=3775667