An Information-Theoretic Approach to Dimension Reduction of Financial Data

13 Pages. Posted: 4 May 2013. Last revised: 2 Jun 2020

Brian Fleming

Dimensionless Ltd

Jens Kroeske

Aberdeen Standard Investments

Date Written: June 3, 2013


The task of statistically analysing and understanding high-dimensional financial data sets is increasingly pertinent in an age of burgeoning information. With high-frequency measurements and a global investment universe of hundreds of thousands of securities, reducing the dimension of large data sets by projecting them onto a smaller set of dominant underlying factors or components is often a first step. While principal component analysis has been a standard dimension reduction tool for many decades, a theoretically sound measure of the number of components that should be retained has been lacking. Here we show that the effective rank offers a potential model-independent solution to the problem. We demonstrate that the explanatory power of the number of components indicated by the effective rank is remarkably stable for a wide range of global financial market data, while the effective rank itself can vary dramatically over time, offering a potential indicator of systemic risk. The results suggest a certain universality to the measure; we provide some theoretical results supporting this view, derive lower bounds for its explanatory power and highlight links to measures of diversification in areas ranging from ecology to quantum mechanics. Our results demonstrate that the time-varying drivers of financial markets do exhibit some persistent structure. We anticipate that our results will prompt further investigation of the effective rank in principal component analysis, given the latter's wide appeal in diverse fields of research ranging from psychology to atmospheric science. We also hope our results provide some direction for solving related dimensional problems, such as cluster analysis, where the longstanding question of how many clusters should be used remains unanswered.
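The abstract does not spell out the formula for the effective rank. A minimal sketch follows, assuming the common entropy-based definition: the exponential of the Shannon entropy of the normalised PCA eigenvalues, which is consistent with the keywords "entropy" and "effective number" but may differ in detail from the paper's own definition. The function names, the synthetic one-factor return data and the choice to round the effective rank up to an integer number of components are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def pca_spectrum(returns):
    """Normalised eigenvalues of the sample covariance, largest first."""
    cov = np.cov(returns, rowvar=False)           # columns are assets
    eigvals = np.linalg.eigvalsh(cov)[::-1]       # eigvalsh returns ascending order
    eigvals = np.clip(eigvals, 0.0, None)         # guard against tiny negative values
    return eigvals / eigvals.sum()

def effective_rank(weights):
    """Assumed definition: exp of the Shannon entropy of the weights."""
    p = weights[weights > 0]                      # treat 0 * log(0) as 0
    return float(np.exp(-np.sum(p * np.log(p))))

def explained_share(weights, k):
    """Fraction of total variance carried by the k largest components."""
    return float(np.sum(weights[:k]))

# Synthetic data (hypothetical): one common factor plus idiosyncratic noise.
rng = np.random.default_rng(0)
n_obs, n_assets = 500, 20
market = rng.normal(size=(n_obs, 1))
returns = 0.8 * market + 0.6 * rng.normal(size=(n_obs, n_assets))

weights = pca_spectrum(returns)
erank = effective_rank(weights)
k = int(np.ceil(erank))                           # rounding rule is an assumption
share = explained_share(weights, k)
print(f"effective rank: {erank:.2f}, components kept: {k}, variance explained: {share:.1%}")
```

Under this definition the effective rank equals the number of assets when all eigenvalues are equal and falls towards 1 as a single component comes to dominate, which is the sense in which a time series of the measure could serve as an indicator of concentration in market drivers.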

Keywords: Systemic risk, principal component, entropy, effective number, effective support, information theory

JEL Classification: C10, C40, C49, E66, G00

Suggested Citation

Fleming, Brian and Kroeske, Jens, An Information-Theoretic Approach to Dimension Reduction of Financial Data (June 3, 2013). Available at SSRN.

Brian Fleming (Contact Author)

Dimensionless Ltd

United Kingdom

Jens Kroeske

Aberdeen Standard Investments

1 George Street
Edinburgh, EH2 2LL
United Kingdom
