The Paper of Record Meets an Ephemeral Web: An Examination of Linkrot and Content Drift within The New York Times

13 Pages. Posted: 27 Apr 2021. Last revised: 4 May 2021.

Jonathan L. Zittrain

Harvard University - Harvard Law School; Harvard School of Engineering and Applied Sciences; Harvard University - Harvard Kennedy School (HKS); Harvard University - Berkman Klein Center for Internet & Society

John Bowers

Berkman Klein Center for Internet & Society

Clare Stanton

Harvard University - Harvard Law School

Date Written: April 26, 2021

Abstract

Hyperlinks are a powerful tool for journalists and their readers. Diving deep into the context of an article is just a click away. But hyperlinks are a double-edged sword; for all of the internet’s boundlessness, what’s found on the web can also be modified, moved, or entirely disappeared. This often-irreversible decay of web content is commonly known as linkrot. It comes with a similar problem of content drift, or the often-unannounced changes (retractions, additions, replacements) to the content at a particular URL.

Our team of researchers at Harvard Law School has undertaken a project to gain insight into the extent and characteristics of journalistic linkrot and content drift. We examined hyperlinks in New York Times articles published from the launch of the Times website in 1996 through mid-2019, drawing on a dataset provided to us by the Times. We focus on the Times not because it is an influential publication whose archives are often used to help form a historical record. Rather, the substantial linkrot and content drift we find here across the New York Times corpus accurately reflect the inherent difficulties of long-term linking to pieces of a volatile web.

Results show a near-linear increase of linkrot over time, with interesting patterns emerging within certain sections of the paper or across top-level domains. Over half of articles containing at least one URL also contained a dead link. Additionally, of the ostensibly “healthy” links existing in articles, a hand review revealed additional erosion to citations via content drift.

Suggested Citation

Zittrain, Jonathan and Bowers, John and Stanton, Clare, The Paper of Record Meets an Ephemeral Web: An Examination of Linkrot and Content Drift within The New York Times (April 26, 2021). Available at SSRN: https://ssrn.com/abstract=3833133 or http://dx.doi.org/10.2139/ssrn.3833133

Jonathan Zittrain (Contact Author)

Harvard University - Harvard Law School

1563 Massachusetts Avenue
Cambridge, MA 02138
United States

Harvard School of Engineering and Applied Sciences

1875 Cambridge Street
Cambridge, MA 02138
United States

Harvard University - Harvard Kennedy School (HKS)

79 John F. Kennedy Street
Cambridge, MA 02138
United States

Harvard University - Berkman Klein Center for Internet & Society

Cambridge, MA 02138
United States

HOME PAGE: http://cyber.harvard.edu

John Bowers

Berkman Klein Center for Internet & Society

Harvard Law School
23 Everett, 2nd Floor
Cambridge, MA 02138
United States

Clare Stanton

Harvard University - Harvard Law School

1563 Massachusetts Avenue
Cambridge, MA 02138
United States