Mass Digitization of Chinese Court Decisions: How to Use Text as Data in the Field of Chinese Law

38 Pages. Posted: 15 Jun 2017. Last revised: 14 Jul 2017

Benjamin L. Liebman

Columbia University - Law School

Margaret Roberts

University of California, San Diego (UCSD) - 21st Century China Center

Rachel E. Stern

University of California, Berkeley - Department of Jurisprudence & Social Policy

Alice Z. Wang

Columbia Law School

Date Written: June 13, 2017

Abstract

Over the past five years, Chinese courts have placed tens of millions of court judgments online. We analyze the promise and pitfalls of using this remarkable new data source through the construction and examination of a dataset of 1,058,990 documents from Henan province. Courts posted judgments in roughly half of all cases in 2014 and, although the percentage of cases posted online has likely risen since then, the single greatest challenge facing researchers remains documenting gaps in the data. We find that the extent of missing data varies widely by court, and that intermediate courts disclose significantly more documents than basic level courts. But court level, GDP per capita, population, and mediation rates are insufficient to fully explain variation in disclosure rates. Further work is needed to better understand how resources and incentives might be skewing the data. Despite incomplete information, however, a topic model of 20,321 administrative court judgments demonstrates how mass digitization of court decisions opens a new window into the practice of everyday law in China. Unsupervised machine learning combined with close reading of selected cases reveals surprising trends in administrative disputes as well as important research questions. Taken together, our findings suggest a need for humility and methodological pluralism among scholars seeking to use large-scale data from Chinese courts. The vast amount of incomplete data now available may frustrate attempts to find quick answers to existing questions, but the data excel at opening new pathways for research and at adding nuance to existing assumptions about the role of courts in Chinese society.

Keywords: Data, Law, Chinese Courts, Court Cases, Text as Data, Court Judgements

Suggested Citation

Liebman, Benjamin L. and Roberts, Margaret and Stern, Rachel E. and Wang, Alice Z., Mass Digitization of Chinese Court Decisions: How to Use Text as Data in the Field of Chinese Law (June 13, 2017). 21st Century China Center Research Paper No. 2017-01; Columbia Public Law Research Paper No. 14-551. Available at SSRN: https://ssrn.com/abstract=2985861

Benjamin Liebman

Columbia University - Law School

435 West 116th Street
New York, NY 10025
United States

Margaret Roberts (Contact Author)

University of California, San Diego (UCSD) - 21st Century China Center

9500 Gilman Drive #0519
La Jolla, CA 92093-0519
United States

Rachel Stern

University of California, Berkeley - Department of Jurisprudence & Social Policy

School of Law
University of California
Berkeley, CA 94720-2150
United States

Alice Wang

Columbia Law School

435 West 116th Street
New York, NY 10025
United States

Paper statistics

Downloads: 325
Rank: 77,958
Abstract Views: 1,927