An Ensemble of Multiple Conformations for Protein Structure Prediction

33 Pages. Posted: 9 Jul 2024

Jiaan Yang

Chinese Academy of Sciences (CAS)

Wenxiang Cheng

Chinese Academy of Sciences (CAS) - Centre for Translational Medicine Research & Development

Gang Wu

affiliation not provided to SSRN

Shi Tong Sheng

affiliation not provided to SSRN

Junjie Yang

affiliation not provided to SSRN

Qiong Shi

Shenzhen University

Suwen Zhao

ShanghaiTech University - School of Life Science and Technology

Qiyue Hu

affiliation not provided to SSRN

Wenxin Ji

Chinese Academy of Sciences (CAS) - Shanghai Advanced Research Institute

Peng Zhang

Chinese Academy of Sciences (CAS) - Centre for Translational Medicine Research & Development

Abstract

A native protein with a flexible structure is better represented by multiple conformations than by a single one. However, obtaining an ensemble of conformations for a protein is a challenging task, one that bears directly on the protein folding problem and on intrinsically disordered proteins (IDPs). Although AlphaFold, an artificial intelligence (AI) system, has achieved unprecedented accuracy in predicting protein structures, its output is limited to a single structural state and cannot provide multiple conformational states. To overcome this barrier, the protein structure fingerprint technology applies a single-sequence method to expose local folding variations of the protein in a matrix, uses an AI process to obtain multiple conformations, and then constructs protein 3D structures. First, a set of protein folding shape codes (PFSC), written as alphabetical letters, is established to completely cover the folding space of five amino acid residues. Next, the protein folding variation matrix (PFVM) is built, which assembles all possible local folding variations, expressed in PFSC, along the sequence. Then, a massive number of folding conformations for the entire protein, expressed as PFSC strings, is obtained by an AI process that combines the various local folding variations. Finally, an ensemble of multiple-conformation protein 3D structures is constructed. P53_HUMAN, a well-known protein, and LEF1_HUMAN and Q8GT36_SPIOL, two typical disordered proteins, are taken as benchmark samples to evaluate the predicted results. The results demonstrate that the protein structure fingerprint provides an algorithm, with a process that is both biologically and physically meaningful, for predicting multiple conformations of a protein structure.
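
The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the matrix `toy_pfvm`, its letters, and the function names are hypothetical, and the AI process the paper uses is replaced here by plain exhaustive enumeration. The sketch assumes only that each position along the sequence offers a small list of candidate folding-shape letters and that a whole-protein conformation string is one choice per position.

```python
from itertools import islice, product
from math import prod

# Hypothetical toy matrix: one entry per position along the sequence,
# each listing candidate folding-shape letters for that position.
# The letters are arbitrary placeholders, not real PFSC assignments.
toy_pfvm = [
    ["A", "V"],
    ["A"],
    ["B", "J", "W"],
    ["V", "A"],
]


def count_conformations(matrix):
    """Number of strings formed by picking one letter per position."""
    return prod(len(column) for column in matrix)


def enumerate_conformations(matrix, limit=None):
    """Return up to `limit` conformation strings (all of them if None)."""
    strings = ("".join(choice) for choice in product(*matrix))
    return list(islice(strings, limit))


total = count_conformations(toy_pfvm)
first_three = enumerate_conformations(toy_pfvm, limit=3)
print(total, first_three)
```

Because the count is the product of the number of options at each position, it grows quickly with sequence length, which is consistent with the abstract's mention of a massive number of conformations; a real pipeline would need to sample or rank the strings rather than list them all.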

Keywords: protein structure prediction, protein conformation, protein folding, intrinsically disordered protein, artificial intelligence

Suggested Citation

Yang, Jiaan and Cheng, Wenxiang and Wu, Gang and Sheng, Shi Tong and Yang, Junjie and Shi, Qiong and Zhao, Suwen and Hu, Qiyue and Ji, Wenxin and Zhang, Peng, An Ensemble of Multiple Conformations for Protein Structure Prediction. Available at SSRN: https://ssrn.com/abstract=4889489 or http://dx.doi.org/10.2139/ssrn.4889489

Jiaan Yang (Contact Author)

Chinese Academy of Sciences (CAS)

Chinese Academy of Sciences
Beijing, 100190
China

Wenxiang Cheng

Chinese Academy of Sciences (CAS) - Centre for Translational Medicine Research & Development

Shenzhen
China

Gang Wu

affiliation not provided to SSRN

Shi Tong Sheng

affiliation not provided to SSRN

Junjie Yang

affiliation not provided to SSRN

Qiong Shi

Shenzhen University

3688 Nanhai Road, Nanshan District
Shenzhen, 518060
China

Suwen Zhao

ShanghaiTech University - School of Life Science and Technology

100 Haike Road, Zhangjiang Hi-Tech Park, Pudong
Research Building
Shanghai, 201210
China

Qiyue Hu

affiliation not provided to SSRN

Wenxin Ji

Chinese Academy of Sciences (CAS) - Shanghai Advanced Research Institute

Peng Zhang

Chinese Academy of Sciences (CAS) - Centre for Translational Medicine Research & Development

Shenzhen
China
