Discovering Knowledge from Relational Data Extracted from Business News

15 Pages. Posted: 13 Oct 2008

Abraham Bernstein

University of Zurich - Dynamic and Distributed Information Systems Group

Scott Clearwater

affiliation not provided to SSRN

Shawndra Hill

Microsoft Research

Claudia Perlich

IBM Corporation - Thomas J. Watson Research Center

Foster Provost

New York University

Date Written: 2002

Abstract

Thousands of business news stories (including press releases, earnings reports, general business news, etc.) are released each day. Recently, information technology advances have partially automated the processing of documents, reducing the amount of text that must be read. Current techniques (e.g., text classification and information extraction) for full-text analysis for the most part are limited to discovering information that can be found in single documents. Often, however, important information does not reside in a single document, but in the relationships between information distributed over multiple documents. This paper reports on an investigation into whether knowledge can be discovered automatically from relational data extracted from large corpora of business news stories. We use a combination of information extraction, network analysis, and statistical techniques. We show that relationally interlinked patterns distributed over multiple documents can indeed be extracted, and (specifically) that knowledge about companies’ interrelationships can be discovered. We evaluate the extracted relationships in several ways: we give a broad visualization of related companies, showing intuitive industry clusters; we use network analysis to ask who are the central players, and finally, we show that the extracted interrelationships can be used for important tasks, such as for classifying companies by industry membership.
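
The abstract does not say how links between companies are derived or which centrality measure is used, so the following is only a minimal sketch of the general idea under stated assumptions: two companies are linked when an extraction step finds both in the same story, and "central players" are ranked by normalised degree. The company names, the `stories` data, and the helper functions `build_cooccurrence` and `degree_centrality` are all hypothetical and invented for illustration; they are not taken from the paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: each story reduced to the set of company names
# that an extraction step found in it (invented for illustration only).
stories = [
    {"AlphaCorp", "BetaSoft"},
    {"AlphaCorp", "GammaBank"},
    {"AlphaCorp", "BetaSoft", "DeltaAir"},
    {"GammaBank", "EpsilonFund"},
]


def build_cooccurrence(stories):
    """Count, for each unordered company pair, the stories mentioning both."""
    edges = Counter()
    for companies in stories:
        # Sorting gives every pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(companies), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Distinct neighbours per company, normalised by (n - 1)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {}
    return {c: len(nb) / (n - 1) for c, nb in neighbours.items()}


edges = build_cooccurrence(stories)
centrality = degree_centrality(edges)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
```

In this toy corpus, information such as which company is most connected is not present in any single story; it only emerges once the pairwise links from all stories are combined into one network.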

Suggested Citation

Bernstein, Abraham and Clearwater, Scott and Hill, Shawndra and Perlich, Claudia and Provost, Foster, Discovering Knowledge from Relational Data Extracted from Business News (2002). NYU Working Paper No. 2451/14157, Available at SSRN: https://ssrn.com/abstract=1282999

Abraham Bernstein (Contact Author)

University of Zurich - Dynamic and Distributed Information Systems Group

Plattenstrasse 14
Zurich
Switzerland

Scott Clearwater

affiliation not provided to SSRN

No Address Available

Shawndra Hill

Microsoft Research

New York, NY 10011
United States

Claudia Perlich

IBM Corporation - Thomas J. Watson Research Center

Route 134
Kitchawan Road
Yorktown Heights, NY 10598
United States

Foster Provost

New York University

44 West Fourth Street
New York, NY 10012
United States
