A Link Mining Algorithm for Earnings Forecast and Trading

Data Mining and Knowledge Discovery, Vol. 18, No. 3

20 Pages Posted: 17 Oct 2006 Last revised: 20 Feb 2013

Germán G. Creamer

Stevens Institute of Technology

Salvatore Stolfo

Columbia University - Computer Science Department

Date Written: June 1, 2009


The objective of this paper is to present and discuss a link mining algorithm called CorpInterlock and its application to the financial domain. This algorithm selects the largest strongly connected component of a social network and ranks its vertices using several indicators of distance and centrality. These indicators are merged with other relevant indicators in order to forecast new variables using a boosting algorithm. We applied the algorithm CorpInterlock to integrate the metrics of an extended corporate interlock (social network of directors and financial analysts) with corporate fundamental variables and analysts' predictions (consensus). CorpInterlock used these metrics to forecast the trend of the cumulative abnormal return and earnings surprise of S&P 500 companies.
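The first two steps described above (keeping the largest strongly connected component, then ranking its vertices by a centrality indicator) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the toy graph, the function names, and the choice of degree centrality as the single indicator are assumptions made for the example. The abstract itself mentions several distance and centrality indicators and does not say how each is computed.

```python
from collections import defaultdict


def strongly_connected_components(graph):
    """Kosaraju's algorithm. graph maps a node to a list of its successors."""
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    # Pass 1: iterative depth-first search recording nodes in finish order.
    visited, order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(sorted(graph.get(start, ()))))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(sorted(graph.get(nxt, ())))))
                    break
            else:
                # All successors explored: the node is finished.
                order.append(node)
                stack.pop()

    # Build the reversed graph.
    reverse = defaultdict(list)
    for u, successors in graph.items():
        for v in successors:
            reverse[v].append(u)

    # Pass 2: explore the reversed graph in decreasing finish order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = set(), [start]
        while todo:
            node = todo.pop()
            component.add(node)
            for prev in reverse.get(node, ()):
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(component)
    return components


def rank_by_degree_centrality(graph, component):
    """Rank vertices by (in-degree + out-degree) / (n - 1) inside component."""
    degree = {v: 0 for v in component}
    for u in component:
        for v in graph.get(u, ()):
            if v in component and v != u:
                degree[u] += 1  # outgoing edge of u
                degree[v] += 1  # incoming edge of v
    n = len(component)
    scale = 1.0 / (n - 1) if n > 1 else 0.0
    scores = {v: d * scale for v, d in degree.items()}
    # Highest score first; ties broken by vertex name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy network: an edge u -> v is a link between two actors.
toy_graph = {
    "A": ["B"],
    "B": ["C", "A"],
    "C": ["A", "D"],
    "D": ["E"],
    "E": ["D"],
}
components = strongly_connected_components(toy_graph)
largest = max(components, key=len)
ranking = rank_by_degree_centrality(toy_graph, largest)
```

In a pipeline like the one the abstract outlines, the scores in `ranking` would become extra columns next to the fundamental and consensus variables before a boosting classifier is fit. That merging step is not shown here.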

The rationale behind this approach is that the corporate interlock has a direct effect on future earnings and returns because these variables affect directors' and managers' compensation. Financial analysts engage in what agency theory calls the "earnings game": managers want to meet the analysts' financial forecasts, and analysts want to increase their compensation or the business of the company that they follow.

Following the CorpInterlock algorithm, we calculated a group of well-known social network metrics and integrated them with economic variables using Logitboost. We used the results of the CorpInterlock algorithm to evaluate several trading strategies. We observed an improvement in the Sharpe ratio (risk-adjusted return) when we used "long only" trading strategies with the extended corporate interlock instead of the basic corporate interlock before Regulation Fair Disclosure (FD) was adopted (1998-2001). There was no major difference among the trading strategies after 2001. Additionally, the CorpInterlock algorithm implemented with Logitboost showed a significantly lower test error than the same algorithm implemented with logistic regression. We conclude that the CorpInterlock algorithm proved to be an effective forecasting algorithm and supported profitable trading strategies.
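For readers unfamiliar with the Sharpe ratio used above to compare trading strategies, a minimal sketch of the standard definition (mean excess return divided by the standard deviation of excess returns) is given below. The function name, the annualization parameter, and the sample returns are hypothetical. The abstract does not state the risk-free rate or return frequency the authors used.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=1):
    """Mean excess return over its sample standard deviation.

    returns          -- per-period strategy returns, e.g. 0.02 for 2%
    risk_free_rate   -- per-period risk-free return (assumed 0 by default)
    periods_per_year -- set to 12 or 252 to annualize monthly or daily data
    """
    excess = [r - risk_free_rate for r in returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)


# Hypothetical per-period returns of two strategies (not data from the paper).
strategy_a = [0.02, -0.01, 0.03, 0.01]
strategy_b = [0.01, -0.02, 0.02, 0.00]

ratio_a = sharpe_ratio(strategy_a)
ratio_b = sharpe_ratio(strategy_b)
```

Both sample series have the same volatility, so the strategy with the higher mean return gets the higher ratio.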

Keywords: Link mining, link analysis, social network, machine learning, computational finance, boosting, time series, pattern analysis, data mining applications

JEL Classification: C49, C63, G14

Suggested Citation

Creamer, Germán G. and Stolfo, Salvatore, A Link Mining Algorithm for Earnings Forecast and Trading (June 1, 2009). Data Mining and Knowledge Discovery, Vol. 18, No. 3. Available at SSRN:

Germán G. Creamer (Contact Author)

Stevens Institute of Technology

1 Castle Point on Hudson
Hoboken, NJ 07030
United States
201-216-8986 (Phone)


Salvatore Stolfo

Columbia University - Computer Science Department

500 W 120 St
New York, NY 10027
United States
646-775-6043 (Phone)

