An Ensemble of Multiple Conformations for Protein Structure Prediction
33 Pages Posted: 9 Jul 2024
Abstract
A native protein with a flexible structure is better represented by multiple conformations than by a single one. However, obtaining an ensemble of conformations for a protein is a challenging task that is closely tied to solving the protein folding problem and to characterizing intrinsically disordered proteins (IDPs). Although AlphaFold, an artificial intelligence (AI) system, has achieved unprecedented accuracy in predicting protein structures, its result is limited to a single structural state and it cannot provide multiple conformational states. To overcome this barrier, the protein structure fingerprint technology applies a single-sequence method to expose the local folding variations of a protein in a matrix, uses an AI process to obtain multiple conformations, and then constructs protein 3D structures. First, a set of protein folding shape codes (PFSC), represented as alphabetic letters, is established to completely cover the folding space of five amino acid residues. Next, the protein folding variation matrix (PFVM) is built, which assembles all possible local folding variations, expressed in PFSC, along the sequence. Then, a massive number of folding conformations for the entire protein, expressed as PFSC strings, is obtained by an AI process that combines the various local folding variations. Finally, an ensemble of multiple conformational protein 3D structures is constructed. The well-known protein P53_HUMAN and two typical disordered proteins, LEF1_HUMAN and Q8GT36_SPIOL, are taken as benchmark samples to evaluate the predicted results. The results demonstrate that the protein structure fingerprint provides an algorithm, with a process that is both biologically and physically meaningful, for predicting multiple conformational structures of proteins.
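The combination step can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy matrix `toy_pfvm`, its placeholder letters (not the actual PFSC alphabet), and the helper names `count_conformations` and `enumerate_conformations` are hypothetical, and plain exhaustive enumeration stands in for the AI process, which the abstract does not specify.

```python
from itertools import islice, product
from math import prod

# Hypothetical toy folding variation matrix: one entry per sequence position,
# each listing the candidate local folding letters at that position.
# The letters are placeholders, not the real PFSC alphabet.
toy_pfvm = [
    ["A"],
    ["A", "B"],
    ["H", "V", "J"],
    ["A", "P"],
]


def count_conformations(pfvm):
    """Number of distinct strings obtainable by picking one letter per position."""
    return prod(len(candidates) for candidates in pfvm)


def enumerate_conformations(pfvm, limit=None):
    """Return up to `limit` conformation strings (all of them if limit is None).

    Exhaustive enumeration is used here only as a stand-in for the
    unspecified AI selection process.
    """
    strings = ("".join(choice) for choice in product(*pfvm))
    return list(islice(strings, limit))


if __name__ == "__main__":
    print(count_conformations(toy_pfvm))
    print(enumerate_conformations(toy_pfvm, limit=3))
```

Because the count is the product of the number of candidates at each position, it grows exponentially with sequence length, which is consistent with the massive number of conformations mentioned above and suggests why a selection process is needed instead of full enumeration.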
Keywords: protein structure prediction, protein conformation, protein folding, intrinsically disordered protein, artificial intelligence