Deep Learning for the Genomic Identification of Multidrug-Resistant Profile Mycobacterium Tuberculosis Isolates

27 Pages. Posted: 28 Apr 2025

Ricardo Perea-Jacobo

Universidad Autónoma de Baja California (UABC) - Escuela Ciencias de la Salud; Universidad Autónoma de Baja California (UABC) - Laboratorio de Epidemiología y Ecología Molecular

Guillermo Paredes-Gutiérrez

affiliation not provided to SSRN

Héctor Gabriel Acosta Mesa

Universidad Veracruzana

Efrén Mezura-Montes

Universidad Veracruzana - Instituto de Investigaciones en Inteligencia Artificial

José Luis Morales-Reyes

Universidad Veracruzana

Roberto Zenteno-Cuevas

Universidad Veracruzana

Miguel Ángel Guerrero-Chevannier

affiliation not provided to SSRN

Dora-Luz Flores

affiliation not provided to SSRN

Raquel Muñiz-Salazar

Universidad Autónoma de Baja California (UABC) - Laboratorio de Epidemiología y Ecología Molecular

Abstract

Introduction: Tuberculosis continues to be a major global health challenge, largely because of the difficulty of early diagnosis and effective treatment. Rapid and accurate detection of Mycobacterium tuberculosis clinical isolates that are resistant to both first- and second-line medications is essential for implementing appropriate therapies, minimizing complications, and lowering mortality rates.

Objective: To develop and evaluate deep learning (DL) models that predict drug resistance from the whole-genome sequences of clinical M. tuberculosis isolates.

Methods: We implemented convolutional neural networks (CNN) and three traditional machine learning (ML) approaches: random forests (RF), support vector machines (SVM), and logistic regression (LR). The genomic data were formatted to optimize compatibility with these methods and thereby enhance predictive performance.

Results: All models achieved accuracy and recall above 90% in differentiating resistant from susceptible isolates. The RF approach exhibited the best performance, while the CNN models produced encouraging results. Nevertheless, the CNN methodologies require further optimization, particularly regarding the representation of genomic variants.

Conclusion: These findings highlight the potential of ML-based techniques, especially RF and CNN, to improve the accuracy and efficiency of drug-resistance prediction.
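The two metrics reported in the Results can be made concrete with a short sketch. It assumes a binary labelling (1 = resistant, 0 = susceptible) and uses made-up labels for ten hypothetical isolates; the function name `accuracy_and_recall` and the data are illustrative assumptions, not the authors' code or results.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall), treating label 1 (resistant) as the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # resistant, predicted resistant
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # susceptible, predicted susceptible
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # resistant isolates that were missed

    accuracy = (tp + tn) / len(pairs)
    # Recall is undefined when there are no resistant isolates in y_true.
    recall = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    return accuracy, recall


# Hypothetical labels for ten isolates (assumption, for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

Recall is the more clinically sensitive of the two here: a low value means resistant isolates are being classified as susceptible, even when overall accuracy looks high.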

Note:
Funding declaration: This article was funded by the Universidad Autónoma de Baja California (UABC) and CONACYT (RPJ: 2020-000026-02NACF-07298, GPG: 2021-000018-02NACF-15002, MAGC: 2022-000018-02NACF-15485). The sponsors had no role in the study design, the collection and analysis of data, the decision to publish, or the preparation of the manuscript. This research was also funded by the UNAM-Huawei Innovation Space under the project "Identification and Prediction of Drug Resistance in the Pangenome of Mycobacterium tuberculosis Using Machine Learning Methods."

Conflict of Interests: The authors declare no conflict of interest.

Keywords: Tuberculosis, Genome, Artificial Intelligence, Convolutional Neural Network, Deep Learning, Drug Resistance

Suggested Citation

Perea-Jacobo, Ricardo and Paredes-Gutiérrez, Guillermo and Acosta Mesa, Héctor Gabriel and Mezura-Montes, Efrén and Morales-Reyes, José Luis and Zenteno-Cuevas, Roberto and Guerrero-Chevannier, Miguel Ángel and Flores, Dora-Luz and Muñiz-Salazar, Raquel, Deep Learning for the Genomic Identification of Multidrug-Resistant Profile Mycobacterium Tuberculosis Isolates. Available at SSRN: https://ssrn.com/abstract=5228996 or http://dx.doi.org/10.2139/ssrn.5228996

Ricardo Perea-Jacobo

Universidad Autónoma de Baja California (UABC) - Escuela Ciencias de la Salud

Ensenada, Baja California
Mexico

Universidad Autónoma de Baja California (UABC) - Laboratorio de Epidemiología y Ecología Molecular

Ensenada, Baja California
Mexico

Guillermo Paredes-Gutiérrez

affiliation not provided to SSRN

Héctor Gabriel Acosta Mesa

Universidad Veracruzana

Av. Xalapa
S/n
Xalapa Veracruz, 9100
Mexico

Efrén Mezura-Montes

Universidad Veracruzana - Instituto de Investigaciones en Inteligencia Artificial

José Luis Morales-Reyes

Universidad Veracruzana

Av. Xalapa
S/n
Xalapa Veracruz, 9100
Mexico

Roberto Zenteno-Cuevas

Universidad Veracruzana

Av. Xalapa
S/n
Xalapa Veracruz, 9100
Mexico

Miguel Ángel Guerrero-Chevannier

affiliation not provided to SSRN

Dora-Luz Flores

affiliation not provided to SSRN

Raquel Muñiz-Salazar (Contact Author)

Universidad Autónoma de Baja California (UABC) - Laboratorio de Epidemiología y Ecología Molecular

Ensenada, Baja California
Mexico
