DeepTranSeq: An Image-Based Approach for Bacterial Sigma 70 Promoter Sequence Identification Using Deep Transfer Learning
22 Pages. Posted: 14 Mar 2024
Abstract
Identifying biological sequences and their functions is one of the core tasks in bioinformatics. Identifying and classifying such sequences is essential to the study of different organisms and their continuous evolution. Escherichia coli (E. coli) is currently the best-understood organism and the preferred host for gene cloning and protein production, and sequence analysis of E. coli is carried out in most molecular biology and biotechnology laboratories. To cope with stressful environmental conditions, E. coli alters its gene expression patterns. The choice of which genes are transcribed depends heavily on particular nucleotide sequences referred to as promoters. Among the different sigma promoters of E. coli, the sigma 70 promoter is responsible for initiating the transcription of nearly all genes in growing cells. Because conventional laboratory methods are time-consuming and expensive, computational methods offer an alternative solution to the problem of identifying such bacterial promoter sequences. In this paper, we propose a new approach called DeepTranSeq, which transforms promoter sequences into image representations and, with the help of a CNN, identifies bacterial sigma 70 promoters using a deep transfer learning network-based D-LeNet model. The proposed method obtains 99.32% accuracy on a standard benchmark dataset, significantly outperforming other state-of-the-art methods.
Keywords: CNN, Deep Transfer Learning, Promoter Identification, Sigma 70 Promoter, Image-based Approach