Database of Small Proteins of Fewer Than 50 Amino Acids in Different Microbial Species
3 Pages. Posted: 6 Jun 2022
Date Written: May 15, 2022
Abstract
Small proteins have traditionally not been a focus of biochemical research, given the difficulty of profiling them and the relatively more important role of larger proteins and macromolecular complexes in underpinning biological functions. However, the advent of liquid chromatography-mass spectrometry (LC-MS), together with bioinformatic interrogation of annotated sequenced genomes and proteomes, has opened the door to investigating the repertoire of small proteins in different species. This work sought to augment this effort by providing a database of small proteins of fewer than 50 amino acids in different microbial species spanning the domains Bacteria, Archaea, and Eukarya. Profiled from the annotated proteome of each species in UniProt using in-house MATLAB proteome database analysis software, each dataset in the database comprises the protein name, amino acid sequence, number of residues, molecular weight, and nucleotide sequence of each catalogued small protein. Such information should provide a firm foundation for downstream sequence analysis and offer a lens into how evolutionary forces shape the distribution of molecular weight and number of residues in the small proteins of a particular microbial species.
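The profiling step described in the abstract can be illustrated with a minimal Python sketch. This is not the author's in-house MATLAB software. It assumes the proteome is available as FASTA-formatted text, and the names `parse_fasta`, `molecular_weight`, and `small_proteins` are hypothetical. Molecular weight is approximated here as the sum of average residue masses plus one water molecule. Nucleotide sequences are omitted because a protein FASTA file does not carry them.

```python
from typing import Dict, Iterator, List, Tuple

# Approximate average residue masses in daltons (amino acid minus water).
AVG_RESIDUE_MASS: Dict[str, float] = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def molecular_weight(seq: str) -> float:
    """Approximate average molecular weight of an unmodified peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def small_proteins(fasta_text: str, max_residues: int = 50) -> List[dict]:
    """Collect proteins with fewer than max_residues residues."""
    records = []
    for header, seq in parse_fasta(fasta_text):
        seq = seq.upper()
        if not seq or len(seq) >= max_residues:
            continue
        # Assumption: entries with ambiguous or non-standard residues are skipped.
        if any(aa not in AVG_RESIDUE_MASS for aa in seq):
            continue
        records.append({
            "name": header,
            "sequence": seq,
            "n_residues": len(seq),
            "molecular_weight": round(molecular_weight(seq), 2),
        })
    return records
```

Calling `small_proteins(open("proteome.fasta").read())` on a downloaded proteome file (the file name is a placeholder) would return one record per small protein, which could then be written to a table for downstream analysis of the residue-count and molecular-weight distributions.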
Keywords: small protein database, microbial species, proteome, number of residues, molecular weight