Weighted-Persistent-Homology-based Machine Learning for RNA Flexibility Analysis
33 Pages Posted: 24 Dec 2019
Date Written: October 28, 2019
Biomolecular flexibility plays a central role in biomolecular dynamics and function, and various experimental methods and theoretical models have been developed to study it. Experimentally, the Debye-Waller factor, also known as the B-factor, measures atomic mean-square displacement and is usually considered an important measure of flexibility. Theoretically, elastic network models, the Gaussian network model, the flexibility-rigidity model, and other computational models have been proposed for flexibility analysis; they shed light on the inner topological structure of biomolecules. Recently, a topology-based machine learning model was proposed. Using features from persistent homology, this model achieves remarkably high accuracy in protein B-factor prediction. Motivated by its success, we propose weighted-persistent-homology (WPH)-based machine learning (WPHML) models for RNA flexibility analysis. Our WPH is a newly proposed model that incorporates physical, chemical, and biological information into topological measurements using a weight function. In particular, we use local persistent homology (LPH), which does not consider the topology of the whole RNA structure but focuses on the topological information of local regions. Our WPHML model is validated on a well-established RNA dataset, and numerical experiments show that it can achieve a Pearson correlation coefficient of up to 0.5822. Comparison with previous sequence-information-based learning models shows a consistent increase in accuracy of at least 10% with our current model.
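To make the idea of local persistent homology concrete, the following is a minimal sketch and not the authors' implementation. It assumes atoms are given as 3D coordinates, that a "local region" is a fixed-radius ball around one atom, and that only 0-dimensional features are used. The function names (`local_neighborhood`, `h0_death_times`, `pearson`) and the cutoff value are hypothetical, and the weight function of WPH is omitted because the abstract does not specify it. The sketch relies on the standard fact that the death times of 0-dimensional bars in a Vietoris-Rips filtration equal the edge lengths of a minimum spanning tree.

```python
import numpy as np

def local_neighborhood(coords, center_idx, cutoff):
    """Return the points within `cutoff` of the chosen point (assumed local region)."""
    d = np.linalg.norm(coords - coords[center_idx], axis=1)
    return coords[d <= cutoff]

def h0_death_times(points):
    """Death times of the finite H0 bars of a Vietoris-Rips filtration.

    Computed as minimum-spanning-tree edge lengths using Prim's algorithm.
    """
    n = len(points)
    if n < 2:
        return np.array([])
    # Full pairwise Euclidean distance matrix.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known connection of each point to the tree
    deaths = []
    for _ in range(n - 1):
        masked = np.where(in_tree, np.inf, best)
        j = int(np.argmin(masked))
        deaths.append(masked[j])  # two components merge at this scale
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.sort(np.array(deaths))

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

# Toy example: three collinear points standing in for atom coordinates.
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
local = local_neighborhood(coords, center_idx=0, cutoff=1.5)
features = h0_death_times(local)
```

In a full pipeline, summary statistics of such local barcodes would serve as per-atom features for a regression model, and `pearson` would compare predicted and experimental B-factors; both steps are assumptions about the workflow rather than details taken from the abstract.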
Keywords: RNA chain, B-factor, Weighted persistent homology, Local persistent homology, Machine learning