Corpus Based Classification of Text in Australian Contracts

Proceedings of the Australasian Language Technology Association Workshop 2010

9 Pages Posted: 14 Jul 2011


Michael Curtotti

Australian National University (ANU)

Eric McCreath

Australian National University (ANU)

Date Written: December 10, 2010

Abstract

Written contracts are a fundamental framework for commercial and cooperative transactions and relationships. Limited research has been published on the application of machine learning and natural language processing (NLP) to contracts. In this paper we report on the classification of components of contract texts using machine learning and hand-coded methods. Authors studying a range of domains have found that combining machine learning with rule-based approaches improves on the accuracy of machine learning alone. We find similar results, which suggest that hand-coded classification rules are worth leveraging in machine learning. Combining machine learning and rule-based approaches, we attained an average accuracy of 83.48% on a multi-class labeling task over 20 contracts, an improvement over machine learning alone.
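As a rough illustration of the idea in the abstract, the sketch below feeds the output of hand-coded rules into a learned classifier as extra features. It is not the authors' implementation: the rules, the labels (heading, clause, definition), the training lines, and the choice of a small multinomial Naive Bayes model are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical hand-coded rules: each regex that fires adds one feature token.
RULES = [
    (re.compile(r"^\d+(\.\d+)*\s"), "RULE_NUMBERED"),    # e.g. "1.1 The ..."
    (re.compile(r"^[A-Z][A-Z\s]+$"), "RULE_ALL_CAPS"),   # e.g. "TERMINATION"
    (re.compile(r"\bmeans\b", re.I), "RULE_DEFINITION"), # e.g. "X means ..."
]


def rule_features(line):
    """Return the tags of every hand-coded rule that matches the line."""
    return [tag for pattern, tag in RULES if pattern.search(line)]


def features(line, use_rules=True):
    """Bag of lowercase words, optionally extended with rule tags."""
    tokens = re.findall(r"[a-z]+", line.lower())
    return tokens + (rule_features(line) if use_rules else [])


class NaiveBayes:
    """Minimal multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, feature_lists, labels):
        for feats, label in zip(feature_lists, labels):
            self.class_counts[label] += 1
            self.token_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total = sum(self.class_counts.values())

        def score(label):
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.class_counts[label] / total)
            for feat in feats:
                log_prob += math.log((counts[feat] + 1) / denom)
            return log_prob

        return max(self.class_counts, key=score)


# Toy training lines, invented for illustration only.
TRAIN = [
    ("DEFINITIONS", "heading"),
    ("TERMINATION", "heading"),
    ("1.1 The Supplier must deliver the Goods on time.", "clause"),
    ("2.3 The Customer must pay each invoice within 30 days.", "clause"),
    ("Goods means the items listed in Schedule 1.", "definition"),
    ("Business Day means a day that is not a weekend.", "definition"),
]

model = NaiveBayes().fit(
    [features(text) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)

print(model.predict(features("4.1 The Supplier must notify the Customer.")))
print(model.predict(features("Term means the period stated in Item 2.")))
```

Passing `use_rules=False` to `features` at both training and prediction time gives a machine-learning-only baseline, so the two configurations can be compared on held-out lines in the same spirit as the comparison the abstract reports.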

Suggested Citation

Curtotti, Michael and McCreath, Eric, Corpus Based Classification of Text in Australian Contracts (December 10, 2010). Proceedings of the Australasian Language Technology Association Workshop 2010. Available at SSRN: https://ssrn.com/abstract=1885490

Michael Curtotti (Contact Author)

Australian National University (ANU)

Canberra, Australian Capital Territory 2601
Australia

Eric McCreath

Australian National University (ANU)

Australia
