Persistent-Homology-Based Machine Learning and Its Applications -- A Survey

42 Pages. Posted: 15 Dec 2018

Chi Seng Pun

Nanyang Technological University (NTU) - School of Physical and Mathematical Sciences

Kelin Xia

Nanyang Technological University (NTU) - School of Physical and Mathematical Sciences

Si Xian Lee

Nanyang Technological University (NTU) - School of Physical and Mathematical Sciences

Date Written: October 31, 2018

Abstract

A suitable feature representation that both preserves the intrinsic information of the data and reduces its complexity and dimensionality is key to the performance of machine learning models. Deeply rooted in algebraic topology, persistent homology (PH) balances data simplification against the characterization of intrinsic structure, and has been applied successfully in various areas. However, the combination of PH and machine learning has been greatly hindered by three challenges: the topological representation of data, PH-based distance measurements or metrics, and PH-based feature representation. With the development of topological data analysis, progress has been made on all three problems, but the results are widely scattered across the literature. In this paper, we provide a systematic review of PH and PH-based supervised and unsupervised models from a computational perspective. Our emphasis is on the recent development of mathematical models and tools, including PH software and PH-based functions, feature representations, kernels, and similarity models. Essentially, this paper can serve as a roadmap for the practical application of PH-based machine learning tools. Further, we consider different topological feature representations in different machine learning models and investigate their impact on protein secondary structure classification.

Keywords: Persistent homology, machine learning, persistent diagram, persistent barcode, kernel, feature extraction
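
As a rough illustration of the pipeline the abstract refers to (topological representation of data, then a persistent barcode, then a feature vector usable by a machine learning model), the following minimal sketch computes the 0-dimensional barcode of a small point cloud under a Vietoris-Rips filtration and summarises it with a few barcode statistics. It is not taken from the paper: the function names `h0_barcode` and `barcode_features`, the choice of summary statistics, and the toy point cloud are all assumptions made for illustration, and only connected components (dimension 0) are tracked.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return 0-dimensional bars (birth, death) of a Vietoris-Rips filtration.

    Hypothetical helper for illustration; handles connected components only.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Every pairwise distance is a candidate edge of the filtration.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge: one of them dies at this scale.
            parent[ri] = rj
            bars.append((0.0, length))
    # One component survives every scale.
    bars.append((0.0, math.inf))
    return bars


def barcode_features(bars):
    """Summarise finite bars as [count, total, max, mean] (an assumed choice)."""
    lengths = [d - b for b, d in bars if math.isfinite(d)]
    if not lengths:
        return [0.0, 0.0, 0.0, 0.0]
    total = sum(lengths)
    return [float(len(lengths)), total, max(lengths), total / len(lengths)]


# Toy data (assumed): two well-separated pairs of points on a line.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcode(cloud)
features = barcode_features(bars)
```

In this toy example the two short bars reflect points merging within each pair, while the single long bar records the gap between the pairs; a fixed-length vector such as `features` is the kind of input a standard classifier or clustering method could consume. Higher-dimensional features (loops, cavities) require a full PH library rather than this union-find shortcut.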

Suggested Citation

Pun, Chi Seng and Xia, Kelin and Lee, Si Xian, Persistent-Homology-Based Machine Learning and Its Applications -- A Survey (October 31, 2018). Available at SSRN: https://ssrn.com/abstract=3275996 or http://dx.doi.org/10.2139/ssrn.3275996

Chi Seng Pun (Contact Author)

Nanyang Technological University (NTU) - School of Physical and Mathematical Sciences

SPMS-MAS-05-22
21 Nanyang Link
Singapore, 637371
Singapore
(+65) 6513 7468 (Phone)

HOME PAGE: http://www.ntu.edu.sg/home/cspun/

Kelin Xia

Nanyang Technological University (NTU) - School of Physical and Mathematical Sciences

S3 B2-A28 Nanyang Avenue
Singapore, 639798
Singapore

Si Xian Lee

Nanyang Technological University (NTU) - School of Physical and Mathematical Sciences

S3 B2-A28 Nanyang Avenue
Singapore, 639798
Singapore
