CiteReader Technology: A Method of Capturing Citation Data from Paper Collections
Anatoly Y. Chukhin
Ralph J. Dandrea
ITX Corp.; Social Science Research Network (SSRN)
May 29, 2010
The Social Science Research Network (SSRN) eLibrary consists of two parts: an Abstract Database containing abstracts of over 300,000 scholarly working papers and forthcoming papers, and an Electronic Paper Collection currently containing over 240,000 downloadable full-text documents in Adobe Acrobat PDF format. All figures are as of the time of writing, August 2010.
Around 2005, we decided to take on the task of extracting references from full-text papers and establishing citation links between papers in the SSRN database. The project has evolved into what we now call CiteReader Technology. To date, CiteReader Technology has captured over 5 million references and 5.6 million footnotes, and has established more than 1.2 million citation links.
This paper describes the approaches and methods we used to accomplish this task.
Number of Pages in PDF File: 9
Working Papers Series
Date posted: June 1, 2010; Last revised: February 23, 2012