Downloading Wisdom from Online Crowds
54 Pages. Posted: 11 Nov 2008
There are 2 versions of this paper
Abstract
The internet and other large textual databases contain billions of documents: is there useful information in the number of documents written about different topics? We propose, based on the premise that the occurrence of a phenomenon increases the likelihood that people write about it, that the relative frequency of documents discussing a phenomenon can be used to proxy for the corresponding occurrence-frequency. After establishing the conditions under which such proxying is likely to be successful, we construct proxies for a number of demographic variables in the US and for corruption across countries and US states and cities, obtaining average correlations with occurrence-frequencies of 0.47 and 0.61 respectively. We also replicate results from two separate published papers establishing the correlates of corruption at both the state and country level. Finally, we construct the first index of corruption in US cities and study its correlates.
Keywords: proxy variables, document-frequency, textual databases, internet
JEL Classification: J11, C81, B40
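The proxying idea in the abstract can be illustrated with a minimal sketch. It assumes that the relative document frequency for a unit (a state, city, or country) is the number of documents mentioning both the unit and the topic, divided by the number of documents mentioning the unit at all; the abstract does not spell out the exact construction, so this normalization is an assumption. The function names and all counts below are hypothetical, invented purely for illustration, and are not data from the paper.

```python
from math import sqrt


def relative_frequency(topic_hits, total_hits):
    """Share of documents about each unit that also mention the topic.

    Assumed normalization (not taken from the paper): hits for
    "unit AND topic" divided by hits for "unit" alone.
    """
    return {unit: topic_hits[unit] / total_hits[unit] for unit in topic_hits}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical document counts for four made-up units.
topic_hits = {"A": 120, "B": 300, "C": 45, "D": 900}
total_hits = {"A": 10_000, "B": 15_000, "C": 9_000, "D": 30_000}

# Hypothetical "true" occurrence-frequencies to validate the proxy against.
observed = {"A": 1.0, "B": 2.5, "C": 0.4, "D": 3.1}

proxy = relative_frequency(topic_hits, total_hits)
units = sorted(proxy)
r = pearson([proxy[u] for u in units], [observed[u] for u in units])
print(f"correlation between proxy and observed frequency: {r:.2f}")
```

Dividing by the total number of documents per unit is meant to keep heavily covered units from dominating simply because more is written about them overall; whether that matches the paper's own construction would need to be checked against the full text.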