Prediction of Liquid-Liquid Phase Separation Proteins Using Machine Learning

32 Pages
Posted: 21 Jan 2020
Sneak Peek Status: Under Review

Tanlin Sun

Peking University - Center for Quantitative Biology

Qian Li

University of Chinese Academy of Sciences - College of Life Sciences

Youjun Xu

Peking University - Center for Quantitative Biology

Zhuqing Zhang

University of Chinese Academy of Sciences - College of Life Sciences

Luhua Lai

Peking University - Center for Quantitative Biology

Jianfeng Pei

Peking University - Center for Quantitative Biology

Abstract

The liquid-liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both, and play critical roles in cellular function. Dysregulation of LLPS is implicated in a number of diseases. Although the LLPS of biomolecules has been investigated intensively in recent years, knowledge of the prevalence and distribution of phase separation proteins (PSPs) still lags behind. The development of computational methods to predict PSPs is therefore of great importance for a comprehensive understanding of the biological function of LLPS. Based on the PSPs collected in LLPSDB, we developed a sequence-based prediction tool for LLPS proteins (PSPredictor). This tool is the first attempt at general-purpose PSP prediction that does not depend on specific protein types. Our model achieves a 10-fold cross-validation accuracy of 94.71% and outperforms previously reported PSP prediction tools. PSPredictor identifies novel scaffold proteins for stress granules and predicts PSP candidates in the human genome for further study. We also built a user-friendly PSPredictor web server (http://www.pkumdl.cn/PSPredictor) that predicts potential PSPs.
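
To illustrate how a 10-fold cross-validation accuracy like the one quoted above is computed for a sequence-based classifier, here is a minimal Python sketch. It is not PSPredictor's implementation: the abstract does not specify the features or the model, so the amino-acid composition features, the nearest-centroid classifier, the synthetic toy sequences, and all function names below are assumptions chosen only to keep the example short and dependency-free.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard amino acid in seq (assumed feature set)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def k_fold_accuracy(sequences, labels, k=10, seed=0):
    """Mean held-out accuracy over k folds.

    The classifier is a nearest-centroid rule, a stand-in assumption
    and not the model used by the paper.
    """
    if len(sequences) < k:
        raise ValueError("need at least k samples")
    idx = list(range(len(sequences)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint test folds
    feats = [composition(s) for s in sequences]
    accuracies = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        # Fit: one centroid per class, using training samples only.
        cents = {
            lab: centroid([feats[i] for i in train if labels[i] == lab])
            for lab in {labels[i] for i in train}
        }
        # Predict: assign each held-out sample to the closest centroid.
        correct = sum(
            min(cents, key=lambda c: sq_dist(feats[i], cents[c])) == labels[i]
            for i in test
        )
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


# Synthetic toy data (NOT from LLPSDB): two classes with disjoint alphabets.
rng = random.Random(1)
positives = ["".join(rng.choice("GSYQN") for _ in range(50)) for _ in range(20)]
negatives = ["".join(rng.choice("LVIAF") for _ in range(50)) for _ in range(20)]
toy_sequences = positives + negatives
toy_labels = [1] * 20 + [0] * 20

toy_accuracy = k_fold_accuracy(toy_sequences, toy_labels, k=10)
print(f"10-fold CV accuracy on toy data: {toy_accuracy:.2%}")
```

The toy classes are trivially separable, so the score here says nothing about real proteins. The point is the protocol: each sample is held out exactly once, the model is refit on the other nine folds, and the reported figure is the mean of the ten held-out accuracies.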

Suggested Citation

Sun, Tanlin and Li, Qian and Xu, Youjun and Zhang, Zhuqing and Lai, Luhua and Pei, Jianfeng, Prediction of Liquid-Liquid Phase Separation Proteins Using Machine Learning. ISCIENCE-D-20-00003. Available at SSRN: https://ssrn.com/abstract=3515387 or http://dx.doi.org/10.2139/ssrn.3515387
This is a paper under consideration at Cell Press and has not been peer-reviewed.

Tanlin Sun

Peking University - Center for Quantitative Biology

China

Qian Li

University of Chinese Academy of Sciences - College of Life Sciences

Beijing
China

Youjun Xu

Peking University - Center for Quantitative Biology

China

Zhuqing Zhang

University of Chinese Academy of Sciences - College of Life Sciences

Beijing
China

Luhua Lai

Peking University - Center for Quantitative Biology

China

Jianfeng Pei (Contact Author)

Peking University - Center for Quantitative Biology

China
