The Relational Vector-Space Model

11 Pages Posted: 13 Oct 2008

Abraham Bernstein

University of Zurich - Dynamic and Distributed Information Systems Group

Scott Clearwater

affiliation not provided to SSRN

Foster Provost

New York University

Date Written: 2003


This paper addresses the classification of linked entities. We introduce a relational vector-space (VS) model (in analogy to the VS model used in information retrieval) that abstracts the linked structure, representing entities by vectors of weights. Given labeled data as background knowledge/training data, classification procedures can be defined for this model, including a straightforward, "direct" model using weighted adjacency vectors. Using a large set of tasks from the domain of company affiliation identification, we demonstrate that such classification procedures can be effective. We then examine the method in more detail, showing that, as expected, classification performance correlates with the relational autocorrelation of the data set. We then turn the tables and use the relational VS scores as a way to analyze and visualize the relational autocorrelation present in a complex linked structure. The main contribution of the paper is to introduce the relational VS model as a potentially useful addition to the toolkit for relational data mining. It could provide useful constructed features for domains with low to moderate relational autocorrelation; it may be effective by itself for domains with high levels of relational autocorrelation; and it provides a useful abstraction for analyzing the properties of linked data.
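To make the idea of a "direct" model over weighted adjacency vectors more concrete, the following is a minimal Python sketch. It is not taken from the paper: the cosine-similarity scoring rule, the function names, and the toy company and industry data are all assumptions chosen for illustration. It represents each entity by a sparse vector of link weights to its neighbors and scores it against the set of entities already labeled with a given class.

```python
from collections import defaultdict
from math import sqrt

def adjacency_vectors(links):
    """Build a sparse weighted adjacency vector per entity from (a, b, weight) links."""
    vectors = defaultdict(dict)
    for a, b, w in links:
        # Links are treated as undirected here (an assumption of this sketch).
        vectors[a][b] = vectors[a].get(b, 0.0) + w
        vectors[b][a] = vectors[b].get(a, 0.0) + w
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def class_vector(labels, target):
    """Indicator vector over the entities known to belong to the target class."""
    return {entity: 1.0 for entity, cls in labels.items() if cls == target}

def score(entity, vectors, labels, target):
    """Score an entity for a class by comparing its adjacency vector to the class vector."""
    return cosine(vectors.get(entity, {}), class_vector(labels, target))

# Hypothetical toy data: link weights could be, e.g., co-occurrence counts.
links = [
    ("acme", "bank_a", 3.0),
    ("acme", "bank_b", 1.0),
    ("acme", "oil_co", 1.0),
    ("new_co", "oil_co", 2.0),
]
labels = {"bank_a": "finance", "bank_b": "finance", "oil_co": "energy"}

vectors = adjacency_vectors(links)
acme_finance = score("acme", vectors, labels, "finance")
acme_energy = score("acme", vectors, labels, "energy")
new_co_energy = score("new_co", vectors, labels, "energy")
```

In this toy example, `acme` is linked mostly to entities labeled `finance`, so its finance score exceeds its energy score. A score of this kind is only informative when linked entities tend to share labels, which is the relational autocorrelation the abstract refers to.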

Keywords: Relational Data Mining, Vector-Space Models, Industry Classification

Suggested Citation

Bernstein, Abraham and Clearwater, Scott and Provost, Foster, The Relational Vector-Space Model (2003). NYU Working Paper No. 2451/14130, Available at SSRN:

Abraham Bernstein (Contact Author)

University of Zurich - Dynamic and Distributed Information Systems Group

Plattenstrasse 14

Scott Clearwater

affiliation not provided to SSRN

Foster Provost

New York University

44 West Fourth Street
New York, NY 10012
United States
