The Necessary and Proper Stewardship of Judicial Data

58 Pages Posted: 5 Oct 2023 Last revised: 17 Oct 2023

Date Written: September 20, 2023


Governments and commercial firms create profits and social gain by exploiting large pools of data. One source of valuable data, however, lies in public hands yet remains largely untapped. While the deep reservoirs of data produced by Congress and federal agencies have long been available for public use, the data produced by the federal judiciary is only loosely regulated, imperfectly used (except by a small number of well-resourced private data cartels), and largely ignored by scholars. Yet the ordinary process of litigation in federal courts generates an enormous volume of data. Especially after recent developments in large language models, this data holds immense potential for private gain and public good. It can be used to predict case outcomes or clarify the law in ways that advance legality and judicial access. It can reveal shortfalls in judicial practice and enable the provision of cheaper, better access to justice. It can make legible many otherwise invisible social facts that, if brought to light, can help improve public policy. Or else it can serve as a private profit center, its benefits accruing to a small coterie of data brokering firms capable of monopolizing its commercial use.

This Article is the first to address the complex empirical, legal, and normative questions raised by the untapped public asset of judicial data. It first develops a positive, descriptive account of how federal courts produce, dissipate, preserve, or disclose information. This account maps the known sources of Article III data (e.g., opinions, orders, briefs), but also extends to a massive volume of ‘dark data’ that the courts produce but then either lose or bury. This positive analysis further uncovers a complex administrative framework through which numerous walls and hurdles, some categorical and some individuated, are thrown up to slow or stop public access.

With this positive understanding in hand, we offer a careful analysis of the constitutional questions implicated in decisions to disclose, or to render opaque, judicial data. Drawing attention to the key question of who controls judicial data flows, we demonstrate the existence of sweeping congressional power to regulate judicial data outside of a small zone of inherent judicial authority and a handful of instances in which privacy or safety is in play. Congressional authority, therefore, is the rule and not the exception. Building on these empirical and legal foundations, the Article offers a normative evaluation of how Congress should regulate the production and dissemination of judicial data, in light of the capabilities and incentives of relevant actors. The information produced by the federal courts should not be exclusively a source of private profit for the data-centered firms presently monopolizing access. It is a public asset that should be elicited and disseminated in ways that advance the federal courts’ mission of equal justice under law.

Keywords: Federal judiciary; Article III; PACER; judicial records

Suggested Citation

Huq, Aziz Z. and Clopton, Zachary D., The Necessary and Proper Stewardship of Judicial Data (September 20, 2023). Stanford Law Review, Vol. 76, Forthcoming, Northwestern Public Law Research Paper No. 23-55, Available at SSRN:

Aziz Z. Huq (Contact Author)

University of Chicago - Law School

1111 E. 60th St.
Chicago, IL 60637
United States

Zachary D. Clopton

Northwestern University - Northwestern Pritzker School of Law

750 N. Lake Shore Drive
Chicago, IL 60611
United States
