Reliance on Science by Inventors: Hybrid Extraction of In-Text Patent-to-Article Citations

35 Pages. Posted: 27 Oct 2020. Last revised: 21 Dec 2024.

Matt Marx

Cornell SC Johnson College of Business; NBER

Aaron Fuegi

Boston University

Date Written: October 2020

Abstract

We curate and characterize a complete set of citations from patents to scientific articles, including nearly 16 million from the full text of USPTO and EPO patents. Combining heuristics and machine learning, we achieve 25% higher performance than machine learning alone. At 99.4% accuracy, we achieve coverage of 87.6%; coverage exceeds 90% with accuracy above 93%. Performance is evaluated against a set of 5,939 randomly sampled, cross-verified “known good” citations that the authors have never seen. We compare these “in-text” citations with the “official” citations on the front page of patents. In-text citations are more diverse temporally, geographically, and topically. They are less self-referential and less likely to be recycled from one patent to the next. That said, in-text citations have been overshadowed by front-page citations in the past few decades, dropping from 80% of all patent-to-paper citations to less than 40%. In replicating two published articles that use only citations on the front page of patents, we show that failing to capture citations in the body text leads to understating the relationship between academic science and commercial invention. All patent-to-article citations, as well as the known-good test set, are available at http://relianceonscience.org.
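
As a rough illustration of how coverage and accuracy could be scored against a known-good test set, the minimal sketch below treats coverage as the share of known-good citations that were recovered and accuracy as the share of extracted citations that appear in the known-good set. These definitions, the function name, and the toy patent and article identifiers are all assumptions made for illustration; they are not taken from the paper and may differ from the authors' evaluation procedure.

```python
from typing import Set, Tuple

# A citation is a (patent_id, article_id) pair; both identifiers are hypothetical.
Citation = Tuple[str, str]


def coverage_and_accuracy(
    extracted: Set[Citation], known_good: Set[Citation]
) -> Tuple[float, float]:
    """Score extracted citations against a known-good set.

    Assumption: `extracted` is restricted to the patents sampled for the
    test set, so any extracted pair missing from `known_good` counts as wrong.
    """
    correct = extracted & known_good
    # Coverage: fraction of known-good citations that were recovered.
    coverage = len(correct) / len(known_good) if known_good else 0.0
    # Accuracy: fraction of extracted citations that are known to be good.
    accuracy = len(correct) / len(extracted) if extracted else 0.0
    return coverage, accuracy


# Toy data, invented purely for illustration.
known_good = {("P1", "A1"), ("P1", "A2"), ("P2", "A3"), ("P3", "A4")}
extracted = {("P1", "A1"), ("P2", "A3"), ("P3", "A9")}

coverage, accuracy = coverage_and_accuracy(extracted, known_good)
print(f"coverage={coverage:.3f}, accuracy={accuracy:.3f}")
```

Raising the confidence threshold of an extractor would typically shrink `extracted`, which tends to raise accuracy and lower coverage. That is the kind of trade-off the two reported operating points describe.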

Suggested Citation

Marx, Matt and Fuegi, Aaron, Reliance on Science by Inventors: Hybrid Extraction of In-Text Patent-to-Article Citations (October 2020). NBER Working Paper No. w27987, Available at SSRN: https://ssrn.com/abstract=3718899

Matt Marx (Contact Author)

Cornell SC Johnson College of Business

Ithaca, NY 14850
United States

NBER

1050 Massachusetts Avenue
Cambridge, MA 02138
United States

Aaron Fuegi

Boston University

595 Commonwealth Avenue
Boston, MA 02215
United States
