Orphan Works as Grist for the Data Mill

39 Pages. Posted: 12 Apr 2012. Last revised: 20 Sep 2014.


Matthew Sag

Emory University School of Law

Date Written: August 30, 2012


The phenomenon of library digitization in general, and the digitization of so-called ‘orphan works’ in particular, raises many important copyright law questions. However, as this article explains, correctly understood, there is no orphan works problem for certain kinds of library digitization.

The distinction between expressive and nonexpressive works is already well recognized in copyright law as the gatekeeper to copyright protection: novels are protected by copyright, whereas telephone books and other uncreative compilations of data are not. The same distinction should generally be made in relation to potential acts of infringement. Preserving the functional force of the idea-expression distinction in the digital context requires that copying for purely nonexpressive purposes (also referred to as non-consumptive use), such as the automated extraction of data, should not be regarded as infringing.

The nonexpressive use of copyrighted works has tremendous potential social value: it makes search engines possible, and it provides an important data source for research in computational linguistics, automated translation, and natural language processing. Increasingly, the macro-analysis of text is also being used in fields such as the study of literature itself. So long as digitization is confined to data processing applications that do not result in infringing expressive or consumptive uses of individual works, there is no orphan works problem, because the exclusive rights of the copyright owner are limited to the expressive elements of their works and the expressive uses of their works.

Keywords: Nonexpressive, expressive, expression, library, digitization, fair use, original, copying, software, copyright

JEL Classification: K00

Suggested Citation

Sag, Matthew, Orphan Works as Grist for the Data Mill (August 30, 2012). Berkeley Technology Law Journal, Forthcoming, Available at SSRN: https://ssrn.com/abstract=2038889 or http://dx.doi.org/10.2139/ssrn.2038889

Matthew Sag (Contact Author)

Emory University School of Law

1301 Clifton Road
Atlanta, GA 30322
United States
