Analysis of Case Law: Measuring Similarity as an Aid to Coding Factors in Cases
Posted: 8 Feb 1999
The subject matter of this paper combines the domains of case law analysis and conceptual information retrieval. At the Centre for Computers and Law (CCL), sets of cases on several subjects have been analyzed. The aim of these studies was to reveal the relationship between the factors present in legal cases and the decision reached by the judge in each case. This research has yielded promising results so far. Knowledge was obtained concerning the weight of these factors (i.e. their relative importance), and based on these weights, cases were ranked according to their strength. However, coding the factors present in cases is a very time-consuming procedure. The texts of all the cases have to be read thoroughly in order to ascertain whether a certain factor appears in a case, and recognizing the presence of factors requires expert knowledge of the specific legal domain.
A major improvement, which would make the technique much easier to apply, would be a computer algorithm that could assist in tracing the relevant factors in the text of a case. Several possibilities for the construction of such an algorithm have been under consideration by the CCL for the past few years. Some promising results have been achieved using a method based on Bayesian statistics, applied to the word usage in the texts of the cases. This method involves the identification of example cases and counterexamples for a given factor. Cases can then be ranked by the probability that they resemble the given examples, and this ranking can be used to predict whether the factor is present in other cases.
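The abstract does not specify the exact Bayesian model, so the following is only a minimal sketch of the general idea, assuming a naive Bayes word model with add-one smoothing and equal prior probabilities. The function names (`tokenize`, `log_odds`, `rank_cases`) and the toy case texts are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_words(texts):
    """Count word occurrences over a list of case texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def log_odds(text, pos_counts, neg_counts):
    """Log-odds that a text resembles the examples rather than the counterexamples.

    Assumption: naive Bayes with add-one smoothing and equal priors.
    Words that occur in neither training set are ignored.
    """
    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values())
    neg_total = sum(neg_counts.values())
    score = 0.0
    for word in tokenize(text):
        if word not in vocab:
            continue
        p_pos = (pos_counts[word] + 1) / (pos_total + len(vocab))
        p_neg = (neg_counts[word] + 1) / (neg_total + len(vocab))
        score += math.log(p_pos / p_neg)
    return score


def rank_cases(cases, examples, counterexamples):
    """Return (score, case name) pairs, most example-like case first."""
    pos_counts = count_words(examples)
    neg_counts = count_words(counterexamples)
    scored = [(log_odds(text, pos_counts, neg_counts), name)
              for name, text in cases.items()]
    return sorted(scored, reverse=True)


# Toy data, invented purely for illustration.
examples = ["the employer dismissed the employee without notice"]
counterexamples = ["the tenant paid rent to the landlord"]
cases = {
    "A": "employee dismissed without notice",
    "B": "landlord claimed unpaid rent",
}
ranking = rank_cases(cases, examples, counterexamples)
```

A positive score indicates that the word usage of a case is closer to the examples than to the counterexamples; a threshold on this score could then serve as a first indication that the factor is present.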
This paper also presents an alternative technique that could be used either separately or in combination with the method mentioned above. A similarity measure, again based on word use, is defined. This measure is applied to each case to calculate its similarity to cases that are known to contain a certain factor and to cases that are known not to contain it. The case is then classified according to the results of this comparison. The experiment suggests that this technique can be of assistance in coding the factors in cases.
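The comparison described above can be sketched as follows. The abstract does not define the similarity measure itself, so this sketch assumes cosine similarity over raw word counts and classifies a case by comparing its mean similarity to the two reference sets. All names (`word_vector`, `cosine_similarity`, `has_factor`) and the toy texts are hypothetical.

```python
import math
import re
from collections import Counter


def word_vector(text):
    """Represent a case text as a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_similarity(case_vec, reference_texts):
    """Average similarity of one case to a set of reference cases."""
    if not reference_texts:
        return 0.0
    total = sum(cosine_similarity(case_vec, word_vector(t))
                for t in reference_texts)
    return total / len(reference_texts)


def has_factor(case_text, with_factor, without_factor):
    """Classify a case as containing the factor if it is, on average,
    more similar to cases with the factor than to cases without it."""
    vec = word_vector(case_text)
    return mean_similarity(vec, with_factor) > mean_similarity(vec, without_factor)


# Toy data, invented purely for illustration.
with_factor = ["the employer dismissed the employee without notice"]
without_factor = ["the tenant paid rent to the landlord"]
predicted = has_factor("employee dismissed without notice",
                       with_factor, without_factor)
```

Because both this sketch and a Bayesian ranking operate on the same word counts, their outputs could in principle be combined, for example by flagging only those cases on which the two indications agree.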