Deep Learning for the Genomic Identification of Multidrug-Resistant Profile Mycobacterium Tuberculosis Isolates
27 Pages Posted: 28 Apr 2025
Abstract
Introduction: Tuberculosis continues to be a major global health challenge, largely due to the difficulties in early diagnosis and effective treatment. Rapid and accurate detection of Mycobacterium tuberculosis clinical isolates that are resistant to both first- and second-line medications is essential for implementing appropriate therapies, minimizing complications, and lowering mortality rates.
Objective: To develop and evaluate deep learning (DL) models that predict drug resistance from whole-genome sequences of clinical M. tuberculosis isolates.
Methods: We implemented convolutional neural networks (CNNs) and three traditional machine learning (ML) approaches: random forests (RF), support vector machines (SVM), and logistic regression (LR). The genomic data were formatted to optimize compatibility with these methods and thereby enhance predictive performance.
Results: All models achieved accuracy and recall above 90% in differentiating resistant from susceptible isolates. The RF approach performed best, while the CNN models produced encouraging results. Nevertheless, the CNN methodologies require further optimization, particularly in how genomic variants are represented.
Conclusion: These findings highlight the potential of ML-based techniques, especially RF and CNN, to improve the accuracy and efficiency of drug-resistance prediction.
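To make the two reported metrics and the idea of "formatting" genomic data concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each isolate is reduced to a set of variant labels and encoded as a binary presence/absence vector, which is one common representation and is not confirmed by the abstract. The function names `encode_variants` and `accuracy_and_recall`, the isolate names, and the toy labels are all hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def encode_variants(
    isolates: Dict[str, Set[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Turn per-isolate variant sets into a binary presence/absence matrix.

    Assumed representation (not taken from the paper): one row per isolate,
    one column per variant, 1 if the isolate carries the variant, else 0.
    """
    names = sorted(isolates)
    # Column order: every variant seen in any isolate, sorted for reproducibility.
    variants = sorted(set().union(*isolates.values()))
    matrix = [[1 if v in isolates[n] else 0 for v in variants] for n in names]
    return names, variants, matrix


def accuracy_and_recall(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (accuracy, recall), with 1 = resistant and 0 = susceptible."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("label sequences must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = correct / len(pairs)
    # Recall: fraction of truly resistant isolates that were flagged resistant.
    resistant = true_pos + false_neg
    recall = true_pos / resistant if resistant else 0.0
    return accuracy, recall


# Illustrative data only: isolate names and variant calls are made up.
toy_isolates = {
    "iso1": {"rpoB_S450L", "katG_S315T"},
    "iso2": {"katG_S315T"},
    "iso3": set(),
}
names, variants, X = encode_variants(toy_isolates)
print(variants)
print(X)
print(accuracy_and_recall([1, 1, 0, 0], [1, 0, 0, 0]))
```

Recall is worth reporting next to accuracy here because a resistant isolate misclassified as susceptible (a false negative) is the error with the most direct clinical cost, and accuracy alone can hide it when susceptible isolates dominate the data.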
Note:
Funding declaration: This article was funded by the Universidad Autónoma de Baja California (UABC) and CONACYT (RPJ: 2020-000026-02NACF-07298, GPG: 2021-000018-02NACF-15002, MAGC: 2022-000018-02NACF-15485). The sponsors had no role in the study's design, the collection and analysis of data, the decision to publish, or the preparation of the manuscript. This research was also funded by the UNAM-Huawei Innovation Space under the project "Identification and Prediction of Drug Resistance in the Pangenome of Mycobacterium tuberculosis Using Machine Learning Methods."
Conflict of Interests: The authors declare no conflict of interest.
Keywords: tuberculosis, genome, artificial intelligence, convolutional neural network, deep learning, drug resistance