Unsupervised Learning: What is a Sports Car?

Rentzmann, Simon; Wuthrich, Mario V.

doi:10.2139/ssrn.3439358

54 pages. Posted: 22 Aug 2019. Last revised: 14 Oct 2019.

Simon Rentzmann

Mario V. Wuthrich

RiskLab, ETH Zurich

Date Written: October 9, 2019

Abstract

This tutorial studies unsupervised learning methods, that is, techniques that aim at reducing the dimension of data (covariables, features), clustering cases with similar features, and graphically illustrating high-dimensional data. These techniques do not consider response variables; they are based solely on the features themselves and study the similarities inherent in them, which is why they are called unsupervised. The methods studied in this tutorial comprise principal components analysis (PCA) and bottleneck neural networks (BNNs) for dimension reduction; K-means clustering, K-medoids clustering, the partitioning around medoids (PAM) algorithm and Gaussian mixture models (GMMs) for clustering; and variational autoencoders (VAEs), t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), and self-organizing maps (SOMs, also known as Kohonen maps) for visualizing high-dimensional data.
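To make the first two tasks concrete, the following is a minimal sketch of PCA for dimension reduction followed by K-means clustering, using only the features and no response variable. It is not the authors' implementation: the data are synthetic, and the function names `pca_scores` and `kmeans`, the group sizes, and the number of features are assumptions chosen purely for illustration.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project standardized features onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iteration; returns cluster labels and cluster centers."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct, randomly chosen cases.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every case to its nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its cases; keep it if the cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic stand-in for car features: two groups of 100 cases, 5 features each.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=1.0, size=(100, 5))
group_b = rng.normal(loc=4.0, scale=1.0, size=(100, 5))
X = np.vstack([group_a, group_b])

scores = pca_scores(X, n_components=2)   # 5 features reduced to 2 components
labels, centers = kmeans(scores, k=2)    # cluster the cases in the reduced space
```

Plotting `scores` colored by `labels` would give the kind of low-dimensional illustration the abstract refers to. K-means depends on its random initialization, so in practice one would typically run it from several starting points.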

Keywords: PCA, biplot, autoencoder, bottleneck neural network (BNN), K-means clustering, K-medoids clustering, PAM algorithm, EM algorithm, clustering with Gaussian mixture models (GMMs), t-SNE, UMAP, SOM, Kohonen maps

JEL Classification: C2, C38, C45, G22

Suggested Citation:

Rentzmann, Simon and Wuthrich, Mario V., Unsupervised Learning: What is a Sports Car? (October 9, 2019). Available at SSRN: https://ssrn.com/abstract=3439358 or http://dx.doi.org/10.2139/ssrn.3439358