Automated Production of High-Volume, Real-Time Political Event Data


Philip A. Schrodt


Pennsylvania State University

2010

APSA 2010 Annual Meeting Paper

Abstract:
This paper summarizes the current state of the art for generating high-volume, near-real-time event data using automated coding methods, based on recent efforts for the DARPA Integrated Crisis Early Warning System (ICEWS) and NSF-funded research. The ICEWS work expanded previous automated coding efforts by more than two orders of magnitude, coding about 26 million sentences generated from 8 million stories condensed from around 30 gigabytes of text. The actual coding took six minutes. The paper is largely a general "how-to" guide to the pragmatic challenges of generating event data using automated techniques and to solutions for the various elements of that process. It also discusses a number of ways that this process could be augmented with existing open-source natural language processing software to produce a third-generation event data coding system.

Number of Pages in PDF File: 26

Keywords: event data, ICEWS, prediction, natural language processing, DARPA, open source

working papers series



Date posted: July 19, 2010 ; Last revised: August 31, 2010

Suggested Citation

Schrodt, Philip A., Automated Production of High-Volume, Real-Time Political Event Data (2010). APSA 2010 Annual Meeting Paper. Available at SSRN: http://ssrn.com/abstract=1643761

Contact Information

Philip A. Schrodt (Contact Author)
Pennsylvania State University
University Park
State College, PA 16802
United States


