A Free Access, Automated Law Citator with International Scope: The LawCite Project
31 Pages. Posted: 22 Apr 2016. Last revised: 1 Jun 2016
Date Written: April 18, 2016
Twelve free access and non-profit providers of legal information from across the common law world have collaborated to produce an automated international citator for case law and legal scholarship, accessible to end users free of any user charges. This context imposes constraints on project development: only modest grant funding for development; open source plus purpose-developed software; no significant editorial input to develop application data, only automation; very limited maintenance funds; and no user charges or advertising revenue.
The LawCite citator is the principal output to date of the LawCite Project, developed by the Australasian Legal Information Institute (AustLII) for the consortium of participating legal information institutes (LIIs). LawCite currently contains index records of the citation histories of almost five million cases, law journal articles, law reform documents and treaties. The citator is international, containing citation records in significant numbers from court decisions in 75 countries (primarily but not exclusively from common law countries).
The purpose of this paper is to explain how the components of the LawCite project work, in some technical detail, and to outline the applications that have already resulted (both the citator and others), and future applications that are planned and possible. Our goal is for the LawCite project to play a key role in future global development of free access to legal information, provided collaboratively by free access providers across the globe.
The LawCite project has developed three main databases, which it uses iteratively to produce its applications (the citator and other applications). They are: (i) The Citations database (basic information related to a citation); (ii) The Series database (information about each series of law reports, law journals, treaties, or law reform reports recognised by LawCite); and (iii) The Document database (an XML record for each recorded case or journal article).
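As a rough illustration of how three such stores might relate to one another, the following Python sketch models a Series record, a Citation record and an XML Document record. Every field name, element name and the helper `document_xml` are assumptions made for this example; the abstract does not describe LawCite's actual schemas, and plain dictionaries stand in for the real key-value storage.

```python
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class SeriesRecord:
    # Hypothetical fields: one entry per recognised report or journal series.
    abbreviation: str
    full_name: str
    jurisdiction: str


@dataclass
class CitationRecord:
    # Hypothetical fields: basic information tied to one citation string.
    citation: str
    series_abbreviation: str
    cited_by: List[str] = field(default_factory=list)


def document_xml(title: str, citations: List[str]) -> str:
    """Build an illustrative XML record for one case or article."""
    root = ET.Element("document")  # element names are assumptions
    ET.SubElement(root, "title").text = title
    for c in citations:
        ET.SubElement(root, "citation").text = c
    return ET.tostring(root, encoding="unicode")


# Plain dicts keyed by abbreviation / citation string stand in for the databases.
series_db = {"CLR": SeriesRecord("CLR", "Commonwealth Law Reports", "Australia")}
citations_db = {
    "(1992) 175 CLR 1": CitationRecord("(1992) 175 CLR 1", "CLR"),
}
doc = document_xml(
    "Mabo v Queensland (No 2)", ["(1992) 175 CLR 1", "[1992] HCA 23"]
)
```

In this sketch a Citation record points to its series by abbreviation, so the Series store can be maintained separately from the automatically regenerated stores.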
Oracle Berkeley DB (database software) is used for data mining and markup, and for construction of the Citations and Series databases. The Series and Document databases are used to generate the Citations database. The Series database is manually updated periodically. The Document and Citations databases are recreated automatically during each iteration of the data mining and ‘unmining’ processes. Raw citation lists are gathered from participating LII and non-LII data sources by the Citation Miner. Once collected, these lists are analysed, combined and normalised by the “Unminer”, and the LawCite databases are generated from the combined list. These databases are used both by the LawCite Citator and for the markup of text. The article explains these processes.
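The combine-and-normalise step can be pictured with a minimal Python sketch. It assumes that each source supplies (citing document, cited citation) pairs and that normalisation means dropping full stops and collapsing whitespace; both are assumptions for illustration, as are the function names `normalise` and `unmine` and the placeholder document identifiers. It is not the Unminer's actual logic.

```python
import re
from collections import defaultdict


def normalise(raw: str) -> str:
    """Drop full stops and collapse whitespace (an assumed, simplified rule)."""
    text = raw.replace(".", "")
    return re.sub(r"\s+", " ", text).strip()


def unmine(raw_lists):
    """Combine per-source lists of (citing, cited) pairs into one index
    mapping each normalised citation to the documents that cite it."""
    index = defaultdict(set)
    for source in raw_lists:
        for citing, cited in source:
            index[normalise(cited)].add(citing)
    return {cited: sorted(citing) for cited, citing in index.items()}


# Hypothetical raw lists from two sources; "doc-1" and "doc-2" are placeholders.
source_a = [("doc-1", "(1992) 175  C.L.R. 1")]
source_b = [("doc-2", "(1992) 175 CLR 1"), ("doc-1", "(1992) 175 CLR 1")]
combined = unmine([source_a, source_b])
```

Because variant spellings of the same citation collapse to one key, both sources contribute to a single citation history, which is the property a combined list needs before databases are generated from it.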
The article concludes with a brief account of the citator from the user perspective, other outputs from the project (the publicly accessible LawCite Automated Markup Tool, and the markup tools used by LIIs), and potential future applications (informing policy debates through citation information; data visualisation of citation flows; and contextual ranking).
Keywords: legal information system, citator, free access to law, data mining, case law, scholarship