Structure and Content in the United States Code
17 Pages. Posted: 11 Sep 2020
Date Written: September 10, 2020
This paper explores the relationship between statutory structure, as realized through hierarchical organization and a cross-reference citation network, and semantic content. We report several novel descriptive statistics concerning the United States Code (USC), including the results of the first application of topic modeling, a machine learning technique, to this corpus. We find that the topic model performs quite well in discovering relevant legal categories, as expressed in the subject-matter organization of the USC. We estimate relationships between formal structure and topic content and find that the assortativity of “titles” (the highest level of the hierarchy within the USC) and the assortativity of related topics are highly similar. We also examine the degree of mutual information between statutory structure and content and find that alternative machine learning techniques are able to recover information about structure from content, which indicates that the two share some information. Our analysis can be used to develop superior measures of legal complexity, with the potential to improve studies that seek to understand the importance of legal complexity for social outcomes (such as compliance costs or economic growth). Our work also has potential applications for the study of law search and for developing tools to facilitate public access to the law.
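To make the notion of assortativity concrete, the following is a minimal sketch of the standard categorical assortativity coefficient (Newman's formulation) computed over a directed citation network. It is an illustration only and is not taken from the paper: the function name, the section identifiers, the "Title A" and "Title B" labels, and the toy edge lists are all hypothetical, and the paper's own estimation procedure may differ.

```python
from collections import Counter


def attribute_assortativity(edges, label):
    """Categorical assortativity coefficient for a directed network.

    edges: iterable of (source, target) pairs, e.g. section-to-section citations.
    label: dict mapping each node to a category (here, a hypothetical title).
    Returns a value in [-1, 1]; positive values mean citations tend to stay
    within the same category.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("need at least one edge")

    # Fraction of edges whose endpoints share a category (trace of the mixing matrix).
    pair_counts = Counter((label[s], label[t]) for s, t in edges)
    same = sum(c for (a, b), c in pair_counts.items() if a == b) / m

    # Fraction expected if sources and targets were paired at random.
    out_counts = Counter(label[s] for s, _ in edges)
    in_counts = Counter(label[t] for _, t in edges)
    expected = sum(out_counts[k] * in_counts[k] for k in out_counts) / (m * m)

    if expected == 1.0:
        raise ValueError("coefficient undefined when all edges share one category")
    return (same - expected) / (1.0 - expected)


# Hypothetical toy data: four sections in two titles.
title_of = {"A-1": "Title A", "A-2": "Title A", "B-1": "Title B", "B-2": "Title B"}

within_edges = [("A-1", "A-2"), ("A-2", "A-1"), ("B-1", "B-2"), ("B-2", "B-1")]
across_edges = [("A-1", "B-1"), ("B-1", "A-1")]
mixed_edges = [("A-1", "A-2"), ("B-1", "B-2"), ("A-1", "B-1"), ("B-2", "A-2")]

for name, edges in [("within", within_edges), ("across", across_edges), ("mixed", mixed_edges)]:
    print(name, attribute_assortativity(edges, title_of))
```

In this toy setting, citations that never leave a title give a coefficient of 1, citations that always cross titles give -1, and an even mix gives 0. The same function could be applied with topic assignments in place of titles, which is one way to compare the two kinds of assortativity the abstract describes.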