Set Identification and Estimation of Factor and Topic Models

20 Pages Posted: 4 Nov 2015

Date Written: November 2, 2015

Abstract

The paper presents sharp bounds on the identified set for classical factor models and non-parametric topic models, based on results from the non-negative matrix factorization literature. It compares the standard factor-model assumption that the factors are orthonormal (principal components analysis) with the "natural" topic-model assumptions of additivity and non-negativity. Under the former, the model is point identified when the number of factors is "small", but further restrictions, such as those presented in Bai and Ng (2013), are needed to identify larger models. Under the latter, the paper characterizes the identified set and shows that the necessary condition for point identification presented in Huang et al. (2013) is also sufficient. In the two-factor case, this condition states that for each latent factor there must be some asset whose return gives it zero weight, and there must be some time periods in which each factor's normalized return is zero. These "sparsity" conditions are characteristics of the observed data, not assumptions on the data generating process. The paper presents a "least squares" estimator in which the number of parameters to be estimated does not increase with the size of the data set. The paper shows that this estimator is consistent both when the number of time periods increases in the factor model and when the number of documents increases in the topic model. Unlike the similar estimator presented in the classical factor model literature (Stock and Watson (2002), Bai (2003)), this estimator does not rely on orthonormality.
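
The role of the two-factor "sparsity" condition can be illustrated with a small numerical sketch. It is not the paper's method. It assumes a data matrix X = W H with non-negative loadings W (assets by factors) and non-negative factor returns H (factors by time periods), ignores the additivity normalization, and only checks small shear transformations Q rather than the full identified set. The function name `admits_shear` and all example matrices are hypothetical. If some Q other than a permutation or rescaling keeps both W Q and Q^{-1} H non-negative, the same X has more than one non-negative factorization, so the model is not point identified.

```python
import numpy as np

def admits_shear(W, H, eps=1e-3, tol=1e-12):
    """Return True if a small shear Q keeps W @ Q and inv(Q) @ H non-negative.

    Illustrative only: two factors, four candidate shears, fixed step eps.
    """
    for i, j in [(0, 1), (1, 0)]:
        for sign in (1.0, -1.0):
            Q = np.eye(2)
            Q[i, j] = sign * eps          # shear: identity plus one off-diagonal entry
            W_alt = W @ Q                 # alternative loadings
            H_alt = np.linalg.inv(Q) @ H  # alternative factor returns
            # W_alt @ H_alt equals W @ H, so the observed data are unchanged.
            if W_alt.min() >= -tol and H_alt.min() >= -tol:
                return True
    return False

# Hypothetical dense case: no zeros in W or H, so a shear survives.
W_dense = np.array([[0.5, 0.5], [0.2, 0.8], [0.7, 0.3]])
H_dense = np.array([[1.0, 2.0, 3.0], [2.0, 1.0, 1.0]])

# Hypothetical sparse case: each factor has an asset with zero weight
# and a time period with zero return, which blocks all four shears.
W_sparse = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
H_sparse = np.array([[0.0, 1.0, 2.0], [1.0, 0.0, 1.0]])

print(admits_shear(W_dense, H_dense))
print(admits_shear(W_sparse, H_sparse))
```

In the dense example an alternative non-negative factorization exists, while in the sparse example each of the four shears drives some entry negative. This is consistent with one reading of the zero-weight and zero-return conditions described above, though a full check of point identification would have to consider every invertible Q, not just small shears.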

Keywords: factor model, topic model, set identification, non-negative matrix factorization

JEL Classification: C14, C23

Suggested Citation

Adams, Christopher, Set Identification and Estimation of Factor and Topic Models (November 2, 2015). Available at SSRN: https://ssrn.com/abstract=2685218 or http://dx.doi.org/10.2139/ssrn.2685218

Christopher Adams (Contact Author)

CBO

Ford House Office Building
2nd & D Streets, SW
Washington, DC 20515-6925
United States
