CiteReader Technology: A Method of Capturing Citation Data from Paper Collections
9 Pages. Posted: 1 Jun 2010. Last revised: 25 Jun 2014.
Date Written: May 29, 2010
The Social Science Research Network (SSRN) eLibrary consists of two parts: an Abstract Database containing abstracts of over 300,000 scholarly working papers and forthcoming papers, and an Electronic Paper Collection currently containing over 240,000 downloadable full-text documents in Adobe Acrobat PDF format. All figures are as of the time of writing of this paper, August 2010.
Around 2005, we took on the task of extracting references from full-text papers and establishing citation links between papers in the SSRN database. The project has evolved into what we now call CiteReader Technology. To date, CiteReader Technology has captured over 5 million references and 5.6 million footnotes, and has established more than 1.2 million citation links.
This paper describes the approaches and methods we used to accomplish this task.