Factor Models for Cancer Signatures

Physica A 462 (2016) 527-559

70 pages. Posted: 1 May 2016. Last revised: 10 Apr 2017.

Zura Kakushadze

Quantigic Solutions LLC; Free University of Tbilisi

Willie Yu

Duke-NUS Medical School - Centre for Computational Biology

Date Written: April 28, 2016

Abstract

We present a novel method for extracting cancer signatures by applying statistical risk models from quantitative finance (see our paper at http://ssrn.com/abstract=2732453) to cancer genome data. Using 1,389 whole-genome-sequenced samples from 14 cancers, we identify an "overall" mode of somatic mutational noise. We give a prescription for factoring out this noise, along with source code for fixing the number of signatures. We apply nonnegative matrix factorization (NMF) to genome data aggregated by cancer subtype and filtered using our method. The resultant signatures have substantially lower variability than those from unfiltered data, and the computational cost of signature extraction is cut by about a factor of 10. We find 3 novel cancer signatures, including a liver cancer dominant signature (96% contribution) and a renal cell carcinoma signature (70% contribution). Our method accelerates finding new cancer signatures and improves their overall stability. Reciprocally, the methods for extracting cancer signatures could have interesting applications in quantitative finance.
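The pipeline outlined in the abstract (filter out an "overall" mode, then apply NMF) can be illustrated with a minimal sketch. This is not the authors' source code. It assumes that "factoring out" the overall mode can be approximated by demeaning the log-counts across mutation categories within each column and re-exponentiating, and it uses a basic multiplicative-update NMF. The function names `remove_overall_mode` and `nmf`, the 96 x 14 matrix shape, and the random Poisson toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def remove_overall_mode(counts):
    """Assumed filter: demean log(1 + counts) within each column, then re-exponentiate.

    Rows are mutation categories, columns are samples (or cancer types).
    The result is strictly positive, so it can be passed to NMF.
    """
    log_counts = np.log1p(counts)
    demeaned = log_counts - log_counts.mean(axis=0, keepdims=True)
    return np.exp(demeaned)


def nmf(matrix, n_signatures, n_iter=500, seed=0, eps=1e-12):
    """Basic NMF (matrix ~ W @ H) via multiplicative updates for the Frobenius norm."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = matrix.shape
    W = rng.random((n_rows, n_signatures)) + eps  # signatures (columns of W)
    H = rng.random((n_signatures, n_cols)) + eps  # exposures per column of the data
    for _ in range(n_iter):
        # Each update keeps W and H nonnegative because every factor is nonnegative.
        H *= (W.T @ matrix) / (W.T @ W @ H + eps)
        W *= (matrix @ H.T) / (W @ H @ H.T + eps)
    return W, H


if __name__ == "__main__":
    # Toy data only: random Poisson counts, not the genome data used in the paper.
    toy_counts = np.random.default_rng(1).poisson(5.0, size=(96, 14)).astype(float)
    filtered = remove_overall_mode(toy_counts)
    W, H = nmf(filtered, n_signatures=3)
    print(W.shape, H.shape)
```

In this sketch the number of signatures is passed in by hand; the abstract states that the paper provides its own source code for fixing that number, which is not reproduced here.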

Keywords: factor models, principal components, statistical risk models, nonnegative matrix factorization, somatic mutations, cancer signatures, genome, exome, DNA, eRank, correlation, covariance, serial, cross-sectional, sample, matrix

JEL Classification: G00

Suggested Citation

Kakushadze, Zura and Yu, Willie, Factor Models for Cancer Signatures (April 28, 2016). Physica A 462 (2016) 527-559. Available at SSRN: https://ssrn.com/abstract=2772458 or http://dx.doi.org/10.2139/ssrn.2772458

Zura Kakushadze (Contact Author)

Quantigic Solutions LLC

1127 High Ridge Road #135
Stamford, CT 06905
United States
6462210440 (Phone)
6467923264 (Fax)

HOME PAGE: http://www.linkedin.com/in/zurakakushadze

Free University of Tbilisi

Business School and School of Physics
240, David Agmashenebeli Alley
Tbilisi, 0159
Georgia

Willie Yu

Duke-NUS Medical School - Centre for Computational Biology

8 College Road
Singapore, 169857
Singapore
