Build, Compute, Critique, Repeat: Data Analysis with Latent Variable Models

Posted: 7 Mar 2014

David M. Blei

Princeton University - Department of Computer Science

Date Written: January 2014

Abstract

We survey latent variable models for solving data-analysis problems. A latent variable model is a probabilistic model that encodes hidden patterns in the data. We uncover these patterns from their conditional distribution and use them to summarize data and form predictions. Latent variable models are important in many fields, including computational biology, natural language processing, and social network analysis. Our perspective is that models are developed iteratively: We build a model, use it to analyze data, assess how it succeeds and fails, revise it, and repeat. We describe how new research has transformed these essential activities. First, we describe probabilistic graphical models, a language for formulating latent variable models. Second, we describe mean field variational inference, a generic algorithm for approximating conditional distributions. Third, we describe how to use our analyses to solve problems: exploring the data, forming predictions, and pointing us in the direction of improved models.

Suggested Citation

Blei, David M., Build, Compute, Critique, Repeat: Data Analysis with Latent Variable Models (January 2014). Annual Review of Statistics and Its Application, Vol. 1, Issue 1, pp. 203-232, 2014, Available at SSRN: https://ssrn.com/abstract=2405899 or http://dx.doi.org/10.1146/annurev-statistics-022513-115657

David M. Blei (Contact Author)

Princeton University - Department of Computer Science

35 Olden Street
Princeton, NJ 08540
United States
