Global Research Trends in Machine Learning and Scoring Systems for Drug-Resistant Tuberculosis Outcome Prediction: A Bibliometric Analysis

Farida Murtiani; Mondastri Korib Sudaryo; Evi Martha; Diah Handayani; Helwiah Umniyati; Annisa Ayu Lestari

Murtiani F, Sudaryo M. K, Martha E, Handayani D, Umniyati H, Lestari A. A, Febriyani B, Fatimah F, Marisa A. Global Research Trends in Machine Learning and Scoring Systems for Drug-Resistant Tuberculosis Outcome Prediction: A Bibliometric Analysis. Biomed Pharmacol J 2026;19(1).

Manuscript received on: 14-02-2026
Manuscript accepted on: 11-03-2026
Published online on: 19-03-2026

Plagiarism Check: Yes
Reviewed by: Dr. Arnaw Kishore
Second Review by: Dr. Dunya Abd AL.malik Mohammed Salih
Final Approval by: Dr. Prabhishek Singh

Farida Murtiani^1,2*, Mondastri Korib Sudaryo³, Evi Martha⁴, Diah Handayani^5,6, Helwiah Umniyati⁷, Annisa Ayu Lestari³, Bada Febriyani³, Fatimah Fatimah³ and Amelia Marisa³

¹Doctoral Program of Epidemiology, Faculty of Public Health, Universitas Indonesia, Indonesia.

²Department of Research, Sulianti Saroso Infectious Disease Hospital, Indonesia.

³Department of Epidemiology, Faculty of Public Health, Universitas Indonesia, Indonesia.

⁴Department of Health Education and Behavioral Sciences, Faculty of Public Health, Universitas Indonesia, Indonesia.

⁵Department of Pulmonology and Respiratory Medicine, Faculty of Medicine, Universitas Indonesia, Indonesia.

⁶Universitas Indonesia Hospital, Indonesia.

⁷Department of Dental Public Health, Faculty of Dentistry, YARSI University, Indonesia.

Corresponding Author E-mail: idoel_fh@yahoo.com

Abstract

Drug-Resistant Tuberculosis (DR-TB) continues to be a serious worldwide health concern with a relatively low cure rate, which makes timely detection of individuals likely to experience unfavorable treatment outcomes crucial. This study aims to map the global research landscape on the use of machine learning (ML) models and traditional scoring systems (SS) in predicting DR-TB treatment outcomes, focusing on research trends, intellectual structures, and collaboration networks. A systematic and quantitative bibliometric analysis was conducted on 37 eligible studies retrieved from the Scopus and PubMed databases, covering publications from 2015 to 2025. Visualization of publication trends, keyword co-occurrence, and collaboration patterns among authors, institutions, and countries was performed using VOSviewer (version 1.6.20). The findings show that publication output was limited prior to 2021 but increased substantially from 2022 onward. Scoring system-based studies accounted for the largest proportion (57%), followed by ML-based approaches (40%), while hybrid ML-SS models were relatively rare (3%). Highly cited studies were predominantly produced by research groups based in the United Kingdom, United States, and China, frequently focusing on radiomics, deep learning, and drug exposure-response modeling. Keyword and temporal overlay analyses indicate a shift from conventional risk-factor and scoring-based epidemiological models toward data-driven predictive approaches. Collaboration network analysis further demonstrates strong intra-regional partnerships but relatively limited cross-cluster integration.
These findings suggest that although machine learning model development is concentrated in high-resource settings, scoring models remain essential for practical implementation in high-burden, resource-limited regions, and the limited number of hybrid approaches highlights the need for integrative models that balance predictive performance with feasibility.

Keywords

Bibliometrics; Machine Learning; Models, Statistical; Predictive Value of Tests; Treatment Outcome; Tuberculosis, Multidrug-Resistant

Copy the following to cite this URL:

Murtiani F, Sudaryo M. K, Martha E, Handayani D, Umniyati H, Lestari A. A, Febriyani B, Fatimah F, Marisa A. Global Research Trends in Machine Learning and Scoring Systems for Drug-Resistant Tuberculosis Outcome Prediction: A Bibliometric Analysis. Biomed Pharmacol J 2026;19(1). Available from: https://bit.ly/4lCsPAQ

Introduction

Tuberculosis (TB) persists as a critical global health threat, remaining the primary cause of death from infectious diseases worldwide.¹ The World Health Organization’s 2024 report indicates that approximately 10.8 million incident TB cases and 1.25 million fatalities occurred globally during the previous year.² Although considerable advances have been achieved in treating drug-susceptible tuberculosis, Drug-Resistant Tuberculosis (DR-TB) continues to represent the most pressing clinical and epidemiological challenge.

In 2023, about 400,000 people developed multidrug-resistant or rifampicin-resistant TB (MDR/RR-TB), yet only 44% were diagnosed and treated.¹ In Low- and Middle-Income Countries (LMICs), DR-TB diagnostic gaps arise from multifaceted challenges, including inadequate laboratory infrastructure for advanced molecular testing, high costs of reagents and equipment maintenance, unreliable electricity supply, and insufficient trained personnel to operate rapid diagnostics and interpret their results.^3,4 The global treatment success rate for MDR/RR-TB is 68%, but it falls below 60% in many resource-limited settings (RLS).^1,3 A minority of patients develop Extensively Drug-Resistant TB (XDR-TB), with particularly poor outcomes.²

DR-TB treatment is characterized by prolonged and potentially toxic regimens that require strict patient adherence. Therefore, the ability to predict Unfavorable Treatment Outcomes (UTO), defined as treatment failure, loss to follow-up, or death, is a fundamental component of clinical management.⁵ In healthcare services, specifically family medicine and community health, prediction models play an important role as a triage tool for the allocation of limited resources. The early identification of patients at high risk of unfavorable outcomes enables healthcare providers, especially at the primary level, to promptly implement targeted interventions, such as intensive community-based adherence monitoring or psychosocial support, which are crucial over the long course of DR-TB treatment.⁶

Two methodological approaches exist for UTO prediction: traditional clinical scoring systems (SS) and Machine Learning (ML) models. Scoring systems rely on accessible clinical and laboratory variables, making them practical in RLS, and have demonstrated good performance, with Area Under the Curve (AUC) values up to 0.887.^7–9 ML models, supported by advances in computing and multimodal data, can detect complex non-linear patterns that may not be captured by conventional statistical methods, and they have consistently shown strong discriminative ability.^5,10 Approaches including Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) show high predictive accuracy, including AUCs of 0.90 for culture conversion and strong performance in imaging-based drug resistance detection.^9,11 CNNs in particular excel in imaging analysis, improving diagnosis and drug resistance detection even in RLS.^12,13 Despite these advances, scoring systems remain widely used due to their clinical implementability, creating a methodological divide between high-accuracy but resource-dependent ML approaches and simpler, more accessible scoring tools.
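The AUC figures quoted above can be read as the probability that a randomly chosen patient with an unfavorable outcome receives a higher risk score than a randomly chosen patient with a favorable one. The following is a minimal sketch of that pairwise definition. The point scores and outcomes are invented toy values for illustration only and are not taken from any of the cited studies.

```python
from itertools import product


def auc_from_scores(scores, outcomes):
    """Pairwise (Mann-Whitney) AUC: share of unfavorable/favorable pairs
    in which the unfavorable patient has the higher score; ties count 0.5."""
    unfavorable = [s for s, y in zip(scores, outcomes) if y == 1]
    favorable = [s for s, y in zip(scores, outcomes) if y == 0]
    if not unfavorable or not favorable:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for u, f in product(unfavorable, favorable):
        if u > f:
            wins += 1.0
        elif u == f:
            wins += 0.5
    return wins / (len(unfavorable) * len(favorable))


# Hypothetical risk points for eight patients (1 = unfavorable outcome).
toy_scores = [7, 5, 6, 2, 3, 5, 1, 4]
toy_outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
toy_auc = auc_from_scores(toy_scores, toy_outcomes)
print(f"AUC = {toy_auc:.3f}")
```

The same function applies to a point-based score or to the predicted probabilities of an ML model, which is why AUC is a convenient common metric when the two families of models are compared.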

Although systematic reviews on ML in TB diagnosis¹³ and DR-TB prediction models exist,¹¹ there has been no comprehensive bibliometric analysis that explicitly maps and compares the parallel and convergent research landscapes of predictive modeling approaches combining machine learning and clinical scoring systems for DR-TB treatment outcomes. Therefore, this study aims to systematically map and compare the research landscape of machine learning and scoring system-based prediction models for DR-TB treatment outcomes using a bibliometric approach.

Materials and Methods

Study Design, Scope, and Database Selection

This study adopts a descriptive quantitative design based on bibliometric analysis.¹⁴ The primary objective of this approach is to systematically map and evaluate the development of scientific research related to predictive modeling of Drug-Resistant Tuberculosis (DR-TB) outcomes. The primary databases used for data acquisition were Scopus and PubMed. The selection of these databases was based on their reputation as leading academic platforms providing comprehensive coverage of high-impact scientific journals in medicine, computer science, and public health.¹⁵ Scopus was selected for its comprehensive citation indexing and bibliometric export features, while PubMed was included to ensure broader biomedical coverage of peer-reviewed literature related to tuberculosis and predictive modeling. The search was conducted on 8 February 2026. Publications indexed between 1 January 2015 and 31 December 2025 were eligible for inclusion. The final publication period analyzed was therefore 2015–2025, representing complete publication years to avoid incomplete-year bias and to capture the evolution of predictive modeling approaches, from early clinical scoring systems to more recent Machine Learning (ML) advancements.

This bibliometric study was conducted to: (1) analyze global publication output and temporal trends in predictive models for DR-TB outcomes; (2) identify influential journals, authors, and collaborative networks; (3) explore the intellectual structure and emerging research themes through keyword co-occurrence analysis; and (4) examine methodological and geographical patterns in ML- and scoring system–based research.

Search Strategy and Data Acquisition

The literature search was specifically developed to identify studies combining DR-TB outcome prediction through artificial intelligence methods and conventional clinical scoring approaches. The search strategy combined controlled vocabulary terms (Medical Subject Headings [MeSH]) with relevant free-text terms, linked using the Boolean operators AND and OR.

The following PubMed search string (Advanced) was applied:

(("Machine Learning"[Mesh] OR "Deep Learning"[Title/Abstract] OR "Random Forest"[Title/Abstract] OR "Neural Networks, Computer"[Mesh] OR "Support Vector Machine"[Title/Abstract] OR "artificial intelligence"[Title/Abstract]) AND ("Tuberculosis, Multidrug-Resistant"[Mesh] OR "Tuberculosis, Rifampin-Resistant"[Title/Abstract]) AND ("Treatment Outcome"[Mesh] OR "Predictive Value of Tests"[Mesh] OR "Risk Assessment"[Mesh] OR "nomogram*"[Title/Abstract] OR "scoring system*"[Title/Abstract])) AND ("2015/01/01"[PDAT] : "2025/12/31"[PDAT]) AND (english[la])

The following Scopus search string was applied:

TITLE-ABS-KEY (("machine learning" OR "deep learning" OR "random forest" OR "neural network" OR "convolutional neural network" OR "support vector machine" OR "predictive analysis" OR "artificial intelligence") AND ("scoring system" OR "predictive model" OR "risk assessment" OR "clinical prediction rule" OR "nomogram") AND ("outcome prediction" OR "treatment outcome" OR "treatment success" OR "culture conversion" OR "prognos*" OR "unfavorable outcome" OR "unsuccessful")) AND PUBYEAR > 2014 AND (LIMIT-TO (DOCTYPE, "ar") OR LIMIT-TO (DOCTYPE, "re")) AND (LIMIT-TO (LANGUAGE, "English"))
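The Boolean structure of the PubMed string (synonyms joined with OR inside each concept, concepts joined with AND, filters appended last) can be sketched programmatically. This is only an illustration of how such a string might be assembled and checked for balanced parentheses. The helper names `or_block`, `build_query`, and `is_balanced` are hypothetical and are not part of any PubMed or Scopus tooling.

```python
def or_block(terms):
    """Join the synonyms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(concept_blocks, filters=()):
    """AND the concept blocks together, then AND on any date/language filters."""
    core = "(" + " AND ".join(or_block(block) for block in concept_blocks) + ")"
    return " AND ".join([core, *filters])


def is_balanced(query):
    """Return True if every opening parenthesis has a matching close."""
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


# Term lists copied from the PubMed string reported in the text.
ml_terms = [
    '"Machine Learning"[Mesh]',
    '"Deep Learning"[Title/Abstract]',
    '"Random Forest"[Title/Abstract]',
    '"Neural Networks, Computer"[Mesh]',
    '"Support Vector Machine"[Title/Abstract]',
    '"artificial intelligence"[Title/Abstract]',
]
tb_terms = [
    '"Tuberculosis, Multidrug-Resistant"[Mesh]',
    '"Tuberculosis, Rifampin-Resistant"[Title/Abstract]',
]
outcome_terms = [
    '"Treatment Outcome"[Mesh]',
    '"Predictive Value of Tests"[Mesh]',
    '"Risk Assessment"[Mesh]',
    '"nomogram*"[Title/Abstract]',
    '"scoring system*"[Title/Abstract]',
]
filters = ['("2015/01/01"[PDAT] : "2025/12/31"[PDAT])', "(english[la])"]

pubmed_query = build_query([ml_terms, tb_terms, outcome_terms], filters)
print(pubmed_query)
print("balanced:", is_balanced(pubmed_query))
```

Keeping each concept as a separate list makes it easier to see which concept a term belongs to and to confirm that every database string covers the same concepts before the search is run.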

Study Eligibility Criteria

For the purpose of this bibliometric analysis, we applied explicit operational criteria to classify each study into ML, SS, or hybrid categories. Studies were coded as ML if they developed or validated prediction models using machine learning algorithms (e.g., random forest, gradient boosting, support vector machines, neural networks, convolutional neural networks) with data-driven parameter optimization. Studies were coded as SS if they proposed or validated point-based clinical scores or nomograms derived from conventional regression modeling, in which the final tool is a fixed scoring rule that can be applied without computational infrastructure. Studies were coded as hybrid if they (1) used ML algorithms to generate a model that was subsequently translated into a simplified scoring system or nomogram, or (2) embedded an existing SS as an input variable within an ML model. When classification was unclear, two reviewers independently assessed the methodology and resolved discrepancies through discussion.
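The coding rules above can be expressed as a small decision function. The sketch below is one possible reading of those rules, not the procedure the reviewers actually ran. The four boolean flags and the class name `StudyFlags` are hypothetical labels that a reviewer would fill in after reading a study's methods. Checking the hybrid conditions first is a design choice made here so that a study meeting either hybrid criterion is not absorbed into the ML or SS category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyFlags:
    """Reviewer-assigned flags (hypothetical names, for illustration only)."""
    uses_ml_algorithm: bool             # e.g. random forest, boosting, SVM, neural network
    regression_based_score: bool        # point score or nomogram from conventional regression
    ml_model_translated_to_score: bool  # hybrid criterion (1)
    existing_score_fed_into_ml: bool    # hybrid criterion (2)


def classify(flags):
    """Return 'hybrid', 'ML', 'SS', or 'unclear' following the stated criteria."""
    if flags.ml_model_translated_to_score or flags.existing_score_fed_into_ml:
        return "hybrid"
    if flags.uses_ml_algorithm and not flags.regression_based_score:
        return "ML"
    if flags.regression_based_score and not flags.uses_ml_algorithm:
        return "SS"
    # Anything else goes to two reviewers for independent assessment.
    return "unclear"


print(classify(StudyFlags(True, False, False, False)))   # ML
print(classify(StudyFlags(False, True, False, False)))   # SS
print(classify(StudyFlags(True, False, True, False)))    # hybrid
print(classify(StudyFlags(True, True, False, False)))    # unclear
```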

The retrieved records underwent a systematic and structured screening process to ensure relevance and quality. The inclusion criteria were as follows: (1) original research papers and review articles focusing on the development or validation of DR-TB outcome prediction models, including machine learning (ML), scoring system (SS), or hybrid approaches; (2) publications dated between 1 January 2015 and 31 December 2025; and (3) articles published in English, consistent with standard practice in international bibliometric research.

The exclusion criteria included: (1) irrelevant document types such as meeting abstracts, editorials, letters, book chapters, and non–peer-reviewed conference proceedings; and (2) publications with a purely descriptive focus or those addressing drug discovery, latent TB diagnosis, or drug-sensitive TB without an explicit DR-TB outcome prediction component. Title and abstract screening, followed by full-text assessment, were conducted independently by two reviewers. Discrepancies were resolved through discussion and consensus.

The screening process was documented using a flowchart describing the total number of records identified, filtered by time and language, and excluded based on document type and topical relevance, ultimately resulting in the final dataset for quantitative analysis.

Data Extraction and Cleaning

For each publication included in the final dataset, the following metadata were extracted: title, author names, author affiliations/countries, publication year, journal name, journal thematic category, citation counts, and author-provided keywords. A critical step in data preparation was data normalization. Institutional affiliations and country names were standardized to prevent fragmentation during collaboration mapping (for example, merging variations of the same institutional name into a single standardized form). Keywords were similarly harmonized (e.g., “MDR-TB” and “multidrug-resistant TB”) to ensure accurate and cohesive co-occurrence mapping. This normalization process is essential for reliable intellectual structure analysis.
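A minimal sketch of the keyword harmonization step is shown below. It assumes a hand-built synonym table. Only the "MDR-TB" and "multidrug-resistant TB" pair comes from the text, while the canonical label, the function names, and the toy records are illustrative assumptions.

```python
from collections import Counter

# Variant -> canonical form. Only the MDR-TB pair is taken from the text;
# the canonical label itself is an assumption made for this example.
SYNONYMS = {
    "mdr-tb": "multidrug-resistant tuberculosis",
    "multidrug-resistant tb": "multidrug-resistant tuberculosis",
}


def normalize_keyword(raw):
    """Lowercase, collapse whitespace, then map known variants to one label."""
    key = " ".join(raw.strip().lower().split())
    return SYNONYMS.get(key, key)


def keyword_counts(records):
    """Count in how many records each normalized keyword appears."""
    counts = Counter()
    for keywords in records:
        # A set ensures a keyword is counted once per record even if
        # two of its variants appear in the same record.
        counts.update({normalize_keyword(k) for k in keywords})
    return counts


# Hypothetical keyword lists for three records.
toy_records = [
    ["MDR-TB", "Machine Learning"],
    ["multidrug-resistant TB", "nomogram"],
    ["MDR-TB ", "mdr-tb", "Nomogram"],
]
counts = keyword_counts(toy_records)
print(counts.most_common())
```

Without the synonym table the three spellings would be counted as separate keywords, which splits one concept across several small nodes in a co-occurrence map.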

Bibliometric Mapping and Visualization

Bibliometric analysis was performed using dedicated visualization software. VOSviewer (version 1.6.20) was used as the primary tool for constructing and visualizing bibliometric networks.¹³ The software was applied to analyze relationships among items such as authors, keywords, and journals, including distance patterns and cluster density within the network.

Results

Search Strategy and Data Selection

Following the search and screening procedures described in the Methods section, a total of 299 records were initially identified (223 from Scopus and 76 from PubMed) on 8 February 2026. After applying the publication year filter (2015–2025), 103 records were removed, leaving 196 records for title and abstract screening. Of these, 11 records were excluded due to irrelevant document types (editorials, letters, conference papers, notes, and books), resulting in 185 reports sought for retrieval. One non-English publication was subsequently excluded, leaving 184 articles assessed for eligibility through full-text review. During eligibility assessment, 147 articles were excluded (108 did not develop a machine learning or scoring-based prediction model for DR-TB outcomes and 39 were duplicates). Ultimately, 37 publications met all inclusion criteria and were incorporated into the final bibliometric analysis (Figure 1).
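The selection counts can be checked step by step. The short sketch below starts from the per-database counts and subtracts each exclusion in the order reported, so that the intermediate totals (196, 185, 184) and the final 37 can be verified. The function name `run_flow` and the step labels are assumptions made for this example.

```python
identified = {"Scopus": 223, "PubMed": 76}

# (reason, records removed), in the order given in the text.
exclusions = [
    ("outside 2015-2025 publication window", 103),
    ("irrelevant document type", 11),
    ("non-English", 1),
    ("no ML or scoring-based DR-TB outcome model", 108),
    ("duplicate", 39),
]


def run_flow(identified, exclusions):
    """Return [(stage, records remaining)] after each exclusion step."""
    remaining = sum(identified.values())
    trail = [("identified", remaining)]
    for reason, removed in exclusions:
        if removed > remaining:
            raise ValueError(f"cannot remove {removed} records at step: {reason}")
        remaining -= removed
        trail.append((reason, remaining))
    return trail


trail = run_flow(identified, exclusions)
for stage, remaining in trail:
    print(f"{remaining:>4}  after: {stage}")
```

Running the sketch gives 299 records identified and 37 records remaining after the last step, with 108 + 39 = 147 exclusions at the eligibility stage.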

Figure 1: PRISMA 2020 flow diagram illustrating the article selection process for bibliometric analysis (2015-2025).

Global Publication Output and Temporal Trends

Between 2015 and 2025, 37 publications addressing predictive models for DR-TB outcomes were identified. As shown in Figure 2, publication output remained low from 2016 to 2020, followed by a steady increase beginning in 2021 and a marked rise in 2022, reaching a peak in 2024. A slight decrease was observed in 2025. Overall, the upward trend demonstrates sustained interest in predictive modeling, reflecting the growing incorporation of digital health data, machine learning, and precision medicine in TB research.

Figure 3a illustrates that scoring system–based studies predominated during the earlier period, whereas ML-based studies increased more rapidly after 2021. From 2022 onward, ML publications experienced a marked increase, reducing the disparity with conventional scoring system studies. Studies integrating both ML and scoring systems began to emerge in 2024, although they remain limited in number. As shown in Figure 3b, conventional scoring systems remain the most frequently applied approach overall (57%), followed by ML-based models (40%), while only 3% of studies adopted hybrid approaches. These findings indicate a gradual methodological shift toward AI-driven DR-TB outcome prediction research, although traditional scoring systems continue to play a central role.

Figure 2: Overall publication trends (2015-2025)

Figure 3: Comparative Trend (a) and Distribution (b) of Machine Learning and Scoring System Publications on DR-TB Outcome Prediction, 2015-2025

Influential Authors and Collaboration Networks

Most Influential Authors

Table 1 presents the top 10 most highly cited publications in the field of machine learning and scoring systems for DR-TB outcome prediction between 2015 and 2025. Citation impact was led by Gao et al. from the United Kingdom,¹⁶ with 58 citations, followed by Heyckendorf et al.,¹⁷ whose transcriptomic model for predicting treatment duration received 45 citations. Highly cited studies originated predominantly from the United Kingdom, the United States, and China, reflecting strong research contributions from these countries. The most influential publications primarily focused on deep learning and radiomics applied to pulmonary imaging, as well as biomarker- and drug exposure–based predictive models, underscoring the central role of advanced analytical approaches and biologically informed modeling in advancing DR-TB outcome prediction.

Table 1: Top 10 most cited publications on machine learning and scoring systems for DR-TB outcome prediction (ranked by total citations, 2015–2025).

Rank	Author	Year	Country	Total Citations	Journal
1	Gao, et al.¹⁶	2018	United Kingdom	58	Molecular Pharmaceutics
2	Heyckendorf, et al.¹⁷	2021	Germany	45	European Respiratory Journal
3	Modongo, et al.¹⁸	2016	United States	40	Antimicrobial Agents and Chemotherapy
4	Li, et al.¹⁹	2023	China	36	European Radiology
5	Zheng, et al.²⁰	2022	China	32	European Respiratory Journal
6	Clemens, et al.²¹	2019	United States	24	PLoS ONE
7	Abdelbary, et al.²²	2017	United States	24	Epidemiology and Infection
8	Tola, et al.²³	2021	Iran	15	BMJ Open
9	Nijiati, et al.²⁴	2023	China	14	European Journal of Radiology
10	Arroyo, et al.²⁵	2019	Brazil	14	Revista de Saude Publica

Collaborative Networks

Figure 4 visualizes the co-authorship network of studies on machine learning and scoring system approaches for DR-TB outcome prediction. Authors are grouped into multiple collaboration clusters, reflecting distinct research teams and institutional collaborations. Node size corresponds to publication output by author, whereas edge thickness represents collaboration intensity. Overall, the network remains fragmented, with several clusters showing strong internal collaboration but limited cross-cluster and inter-regional linkages, suggesting that DR-TB outcome prediction research is largely driven by localized research teams and would benefit from stronger international collaboration.
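The quantities behind such a map can be derived directly from author lists. In the sketch below, node weight is the number of documents per author and edge weight is the number of documents a pair of authors share. VOSviewer applies its own layout and clustering algorithms. The connected-components step used here is a deliberately simpler stand-in that only shows whether groups of authors are linked at all. The author labels are invented placeholders.

```python
from collections import Counter
from itertools import combinations


def coauthorship_network(author_lists):
    """Return (documents per author, shared documents per author pair)."""
    node_weight, edge_weight = Counter(), Counter()
    for authors in author_lists:
        unique = sorted(set(authors))          # one count per document
        node_weight.update(unique)
        edge_weight.update(combinations(unique, 2))
    return node_weight, edge_weight


def connected_components(nodes, edges):
    """Group authors that are linked directly or indirectly by co-authorship."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical author lists for three documents.
toy_documents = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author D", "Author E"],
]
node_weight, edge_weight = coauthorship_network(toy_documents)
groups = connected_components(node_weight, edge_weight)
print(node_weight)
print(edge_weight)
print(len(groups), "disconnected groups")
```

In this toy example the two groups share no author, which is the kind of fragmentation the figure describes: collaboration is strong within each team but absent between them.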

Figure 4: Author co-authorship network in machine learning and scoring system studies on DR-TB outcome prediction (VOSviewer, ≥1 document).

Figures 5 and 6 show the international and institutional collaboration networks in DR-TB outcome prediction studies. At the international level (Figure 5), collaboration is most prominent among China, the United States, Ethiopia, and Iran, highlighting linkages between high-burden and technologically advanced countries. At the institutional level (Figure 6), collaborations are concentrated in dense clusters around a limited number of leading institutions, integrating multidisciplinary expertise in biostatistics, biochemistry, radiology, epidemiology, and public health, which reflects the interdisciplinary nature of predictive modeling research in DR-TB outcomes. These partnerships indicate increasing global synergy in predictive innovation and model validation, despite collaboration being concentrated among a few leading institutions.

Figure 5: Author co-authorship network in machine learning and scoring system studies on DR-TB outcome prediction (VOSviewer, ≥1 document).

Figure 6: International collaboration network in machine learning and scoring system studies on drug-resistant tuberculosis outcome prediction (VOSviewer, ≥1 document).

Intellectual Structure and Thematic Clusters

Figures 7 and 8 illustrate the intellectual structure of research on machine learning and scoring systems for DR-TB outcome prediction through keyword co-occurrence network and density visualizations. The network analysis (Figure 7) identifies 77 keywords clustered into four major thematic groups. Cluster 1 (red) represents clinical and observational research, emphasizing patient characteristics and study designs, with keywords such as adult, male, female, retrospective study, follow-up, comorbidity, logistic regression analysis, and prediction. Cluster 2 (green) represents methodological and machine learning–oriented themes, including machine learning, random forest, sensitivity and specificity, area under the curve, cohort analysis, and controlled study. Cluster 3 (blue) reflects pharmacological and microbiological dimensions of DR-TB research, incorporating antitubercular agents, drug therapy, microbiology, rifampicin, isoniazid, amikacin, and treatment failure. Cluster 4 (yellow) highlights drug-specific and microbiological treatment components, including Mycobacterium tuberculosis and second-line agents such as clofazimine, ethambutol, cycloserine, and protionamide. The density visualization (Figure 8) further demonstrates that core terms such as “multidrug-resistant tuberculosis,” “human,” “treatment outcome,” and “prediction” are highly interconnected and centrally positioned, underscoring their integrative role in bridging clinical, epidemiological, and computational domains within DR-TB outcome prediction research.

Figure 7: Keyword co-occurrence network in machine learning and scoring system studies on drug-resistant tuberculosis outcome prediction (VOSviewer, ≥5 occurrences).

Figure 8: Density visualization of keyword co-occurrence in machine learning and scoring system studies on drug-resistant tuberculosis outcome prediction (VOSviewer, ≥5 occurrences).

Figure 9 presents the temporal overlay mapping of keyword co-occurrences, depicting research theme progression over time. Earlier studies (blue tones) primarily emphasized clinical outcomes, epidemiological factors, and traditional scoring-based approaches, with frequent focus on comorbidities, mortality, risk factors, and treatment outcomes. More recent studies increasingly emphasize methodological and predictive modeling concepts, including “machine learning”, “random forest”, “prediction”, “sensitivity and specificity”, and “area under the curve”. These terms indicate growing attention to advanced analytical techniques and performance evaluation metrics in DR-TB outcome prediction. Scoring systems remain present across periods, reflecting their continued use as baseline or comparative methods, while the overall trend indicates a gradual shift toward AI-driven, data-intensive predictive modeling integrating clinical, imaging, and laboratory data.

Figure 9: Overlay visualization of keyword co-occurrence over time in machine learning and scoring system studies on drug-resistant tuberculosis outcomes (VOSviewer, ≥5 occurrences).

Discussion

Research on predictive modeling for DR-TB stagnated from 2016 to 2020, followed by a marked increase after 2021, peaking in 2024 and remaining elevated through 2025. This surge has been driven by the persistently low global cure rate of DR-TB, underscoring an urgent need for early identification of patients at high risk of unfavorable outcomes. In response, a methodological shift has occurred: Machine Learning (ML) models have expanded rapidly since 2022 and narrowed the gap with conventional Scoring Systems (SS), although SS still account for the largest share of studies (57% versus 40%). The growth of ML has been enabled by the availability of multimodal data and advances in computational capacity facilitating complex data analysis.¹² Only a few studies have integrated both approaches, revealing substantial opportunities for the development of hybrid models. Hybrid models that combine the predictive accuracy of ML with the practicality of SS are expected to offer more reliable and feasible solutions,^26,27 particularly in resource-limited settings.

ML research has been primarily driven by advanced approaches such as Deep Learning (DL) and radiomics derived from CT imaging,¹⁶ where algorithms like Convolutional Neural Networks (CNN) have been shown to be highly effective for diagnostic analysis, capturing complex, non-visible patterns and often surpassing the performance of traditional ML models.¹³ In contrast, Scoring Systems (SS) focus on variables that are more readily accessible in everyday clinical practice, including routine clinical parameters, simple biomarkers, and demographic data.^28,29 The shifting research trend from conventional SS approaches toward algorithmic ML methods is also reflected in keyword analyses, which indicate a transition in focus from conventional clinical determinants such as treatment failure and HIV infection to computational terms such as random forest and area under the curve.

The difference in feature types creates a dilemma between accuracy and accessibility. On one hand, ML models with complex features offer superior predictive accuracy, yet their implementation is constrained by costly infrastructural requirements, such as CT/MRI imaging equipment, high-performance computing servers, and stringent data standards, which make their adoption challenging in resource-limited settings.¹³ On the other hand, SS that rely on routine clinical data remain more practical and transparent triage tools, although their maximal accuracy may be comparatively limited. Moreover, there is a significant risk that predictive models overly focused on algorithmic data may overlook Social Determinants of Health (SDOH), such as economic and geographic factors, which have been shown to be critical predictors of unfavorable treatment outcomes, including treatment default, often driven by socioeconomic constraints.^30,31 The integration of social and economic data into prediction models is an urgent necessity, given that social determinants of health directly influence treatment success rates and patient retention in TB programs, particularly in low- and middle-income countries.

Although they promise high accuracy, ML models predicting DR-TB outcomes have demonstrated only moderate accuracy in practice, particularly when compared with diagnostic models. Their greatest challenge is the generalization crisis, whereby model performance, both diagnostic and predictive, often declines significantly when validated on data from different locations.¹³ This occurs because models developed on data from a single center tend to overfit to local population characteristics and protocols, rendering them unreliable when applied across diverse clinical settings, especially in resource-limited areas.^31,32 To overcome these limitations, extensive multicenter validation is necessary. Conversely, SS that utilize more universal clinical variables have shown greater performance stability despite lower peak accuracy. Therefore, a prospective strategy is the development of hybrid models that combine the stability of SS with the analytical capabilities of ML, sacrificing some peak accuracy to achieve substantially improved generalizability.³³

This analysis reveals a pronounced geographic disparity, where advanced methodological innovations such as ML, DL, and radiomics are primarily concentrated in countries with well-established computational and research infrastructures, including the United States, China, and the United Kingdom. In contrast, countries bearing a high burden of DR-TB, such as Ethiopia, Brazil, Iran, and Mexico, are more prominently involved in the development of SS and predictive cohort studies, positioning them as critically important sites for clinical validation processes.

The integration of ML and clinical SS is crucial for optimizing resource allocation among high-risk DR-TB patients. For use in resource-limited settings, advanced ML models should be translated into practical, lightweight tools such as nomograms or mobile applications suitable for primary care decision-making. Future research should emphasize robust external validation and hybrid models that balance ML predictive performance with the simplicity of scoring systems. Concurrently, policy efforts should strengthen standardized data infrastructures and promote collaboration between technology centers and national TB programs in high-burden low- and middle-income countries to ensure AI tools are relevant, ethical, and equitable.

Study Limitations

This bibliometric study has several limitations. It includes only English-language publications indexed in Scopus and PubMed, potentially excluding relevant scoring system studies from high-burden regions. Bibliometric methods also cannot assess methodological quality or clinical validity of models. In addition, indexing bias may favor ML/AI studies, as computer science journals are generally better indexed than regional clinical journals.

Conclusion

Research on predicting drug-resistant tuberculosis (DR-TB) treatment outcomes has increased markedly since 2022, reflecting growing efforts to improve early risk identification. This bibliometric analysis shows that traditional scoring systems remain widely used due to their practicality and feasibility in resource-limited settings, while machine learning (ML) approaches have expanded rapidly, indicating a shift toward data-driven prediction. However, hybrid models integrating ML and scoring systems remain scarce, representing an important methodological gap. Future research should prioritize externally validated hybrid approaches and stronger cross-regional collaboration to improve generalizability and clinical applicability.

Acknowledgment

The authors thank the librarians at Universitas Indonesia Faculty of Public Health for their assistance with Scopus and PubMed database searches.

Funding Source

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Conflict of Interest

The author(s) do not have any conflict of interest.

Data Availability Statement

This statement does not apply to this article.

Ethics Statement

This research did not involve human participants, animal subjects, or any material that requires ethical approval.

Informed Consent Statement

This study did not involve human participants, and therefore informed consent was not required.

Clinical Trial Registration

This research does not involve any clinical trials.

Permission to reproduce material from other sources

Not applicable

Authors’ Contribution

Conceptualization: Farida Murtiani, Mondastri Korib Sudaryo, Evi Martha, Diah Handayani.
Data curation: Farida Murtiani, Annisa Ayu Lestari, Amelia Marisa.
Formal analysis: Farida Murtiani, Amelia Marisa, Annisa Ayu Lestari.
Methodology: Farida Murtiani, Mondastri Korib Sudaryo.
Visualization: Farida Murtiani, Annisa Ayu Lestari.
Writing – original draft: Farida Murtiani, Fatimah Fatimah, Helwiah Umniyati, Bada Febriyani.
Writing – review & editing: Farida Murtiani, Mondastri Korib Sudaryo, Evi Martha, Diah Handayani, Helwiah Umniyati, Annisa Ayu Lestari, Fatimah Fatimah, Bada Febriyani.

References

1. PAHO. Tuberculosis resurges as top infectious disease killer. Pan American Health Organization. November 1, 2024. Accessed October 18, 2025. https://www.paho.org/en/news/1-11-2024-tuberculosis-resurges-top-infectious-disease-killer
2. World Health Organization. Global Tuberculosis Report. 2024. https://www.who.int/publications/i/item/9789240101531
3. Estaji F, Kamali A, Keikha M. Strengthening the global Response to Tuberculosis: Insights from the 2024 WHO global TB report. J Clin Tuberc Other Mycobact Dis. 2025;39:100522. doi:10.1016/J.JCTUBE.2025.100522
4. Ntinginya NE, Kuchaka D, Orina F, et al. Unlocking the health system barriers to maximise the uptake and utilisation of molecular diagnostics in low-income and middle-income country setting. BMJ Glob Health. 2021;6(8):5357. doi:10.1136/BMJGH-2021-005357
5. Hosu MC, Faye LM, Apalata T. Predicting Treatment Outcomes in Patients with Drug-Resistant Tuberculosis and Human Immunodeficiency Virus Coinfection, Using Supervised Machine Learning Algorithm. Pathogens. 2024;13(11):923. doi:10.3390/PATHOGENS13110923
6. Oh AL, Makmor-Bakry M, Islahudin F, Ting CY, Chan SK, Tie ST. Development and validation of a predictive scoring model for risk stratification of tuberculosis treatment interruption. Res Social Adm Pharm. 2024;20(12 Pt A):1102-1109. doi:10.1016/J.SAPHARM.2024.08.091
7. Yan J, Luo H, Nie Q, Hu S, Yu Q, Wang X. A Scoring System Based on Laboratory Parameters and Clinical Features to Predict Unfavorable Treatment Outcomes in Multidrug- and Rifampicin-Resistant Tuberculosis Patients. Infect Drug Resist. 2023;16:225-237. doi:10.2147/IDR.S397304
8. Baik Y, Rickman HM, Hanrahan CF, et al. A clinical score for identifying active tuberculosis while awaiting microbiological results: Development and validation of a multivariable prediction model in sub-Saharan Africa. PLoS Med. 2020;17(11):e1003420. doi:10.1371/JOURNAL.PMED.1003420
9. Jain E, Kukreja V, Choudhary S. Automated Tuberculosis Detection Using Convolutional Neural Networks on Chest X-Ray Images: A High-Accuracy Diagnostic Approach. 2024 International Conference on Augmented Reality, Intelligent Systems, and Industrial Automation, ARIIA 2024. Published online 2024. doi:10.1109/ARIIA63345.2024.11051457
10. Lu B, Shi Y, Wang M, et al. Development of a clinical prediction model for poor treatment outcomes in the intensive phase in patients with initial treatment of pulmonary tuberculosis. Front Med (Lausanne). 2025;12:1472295. doi:10.3389/FMED.2025.1472295
11. Zhang F, Yang Z, Geng X, et al. Using Machine Learning Methods to Predict Early Treatment Outcomes for Multidrug-Resistant or Rifampicin-Resistant Tuberculosis to Enhance Patient Cure Rates: Development and Validation of Multiple Models. J Med Internet Res. 2025;27(1). doi:10.2196/69998
12. Karki M, Kantipudi K, Haghigh B, et al. Training Data for Machine Learning to Enhance Patient-Centered Outcomes Research (PCOR) Data Infrastructure- A Case Study in Tuberculosis Drug Resistance. ASPE. 2023. Accessed October 18, 2025. https://aspe.hhs.gov/reports/training-data-pcor-nlm
13. Pongsuwun K, Puwarawuttipanit W, Nguantad S, et al. A Systematic Review of the Accuracy of Machine Learning Models for Diagnosing Pulmonary Tuberculosis: Implications for Nursing Practice and Implementation. Nurs Health Sci. 2025;27(1). doi:10.1111/NHS.70077
14. Marcelin JR, Goel S, Niehaus WN, Messersmith RC, Cawcutt KA. Which Topics Drive Dissemination? Alternative Bibliometrics Analysis of the Highest-Ranking Articles in 3 Infectious Diseases Journals Before COVID-19. Open Forum Infect Dis. 2024;11(3). doi:10.1093/OFID/OFAE116
15. Ganti L, Persaud NA, Stead TS. Bibliometric analysis methods for the medical literature. Academic Medicine & Surgery. Published online January 30, 2025. doi:10.62186/001C.129134
16. Gao XW, Qian Y. Prediction of Multidrug-Resistant TB from CT Pulmonary Images Based on Deep Learning Techniques. Mol Pharm. 2018;15(10):4326-4335. doi:10.1021/ACS.MOLPHARMACEUT.7B00875
17. Heyckendorf J, Marwitz S, Reimann M, et al. Prediction of anti-tuberculosis treatment duration based on a 22-gene transcriptomic model. Eur Respir J. 2021;58(3):744. doi:10.1183/13993003.03492-2020
18. Modongo C, Pasipanodya JG, Magazi BT, et al. Artificial intelligence and amikacin exposures predictive of outcomes in multidrug-resistant tuberculosis patients. Antimicrob Agents Chemother. 2016;60(10):5928-5932. doi:10.1128/AAC.00962-16
19. Li Y, Wang B, Wen L, et al. Machine learning and radiomics for the prediction of multidrug resistance in cavitary pulmonary tuberculosis: a multicentre study. Eur Radiol. 2023;33(1):391-400. doi:10.1007/S00330-022-08997-9
20. Zheng X, Forsman LD, Bao Z, et al. Drug exposure and susceptibility of second-line drugs correlate with treatment response in patients with multidrug-resistant tuberculosis: a multicentre prospective cohort study in China. Eur Respir J. 2022;59(3). doi:10.1183/13993003.01925-2021
21. Clemens DL, Lee BY, Silva A, et al. Artificial intelligence enabled parabolic response surface platform identifies ultra-rapid near-universal TB drug treatment regimens comprising approved drugs. PLoS One. 2019;14(5). doi:10.1371/JOURNAL.PONE.0215607
22. Abdelbary BE, Garcia-Viveros M, Ramirez-Oropesa H, Rahbar MH, Restrepo BI. Predicting treatment failure, death and drug resistance using a computed risk score among newly diagnosed TB patients in Tamaulipas, Mexico. Epidemiol Infect. 2017;145(14):3020-3034. doi:10.1017/S0950268817001911
23. Tola H, Holakouie-Naieni K, Mansournia MA, et al. National treatment outcome and predictors of death and treatment failure in multidrug-resistant tuberculosis in Ethiopia: A 10-year retrospective cohort study. BMJ Open. 2021;11(8). doi:10.1136/BMJOPEN-2020-040862
24. Nijiati M, Guo L, Abulizi A, et al. Deep learning and radiomics of longitudinal CT scans for early prediction of tuberculosis treatment outcomes. Eur J Radiol. 2023;169. doi:10.1016/J.EJRAD.2023.111180
25. Arroyo LH, Ramos ACV, Yamamura M, et al. Predictive model of unfavorable outcomes for multidrug-resistant tuberculosis. Rev Saude Publica. 2019;53. doi:10.11606/S1518-8787.2019053001151
26. Wang Q, Gu J, Gabrielian A, et al. Analysis of a Large Patient-Level Dataset to Predict Outcome of Treatment for Drug-Resistant Tuberculosis. medRxiv. Published online September 17, 2022:2022.09.14.22279738. doi:10.1101/2022.09.14.22279738
27. Yang Y, Chen J, Liu L, et al. Applying a Combined Model to Evaluate the Risk of Poor Treatment Outcomes in Rifampicin Resistant Tuberculosis Patients: A Multicenter Retrospective Study. Infect Drug Resist. 2024;17:5287-5298. doi:10.2147/IDR.S491910
28. Yan J, Luo H, Nie Q, Hu S, Yu Q, Wang X. A Scoring System Based on Laboratory Parameters and Clinical Features to Predict Unfavorable Treatment Outcomes in Multidrug- and Rifampicin-Resistant Tuberculosis Patients. Infect Drug Resist. 2023;16:225-237. doi:10.2147/IDR.S397304
29. Abdelbary BE, Garcia-Viveros M, Ramirez-Oropesa H, Rahbar MH, Restrepo BI. Predicting treatment failure, death and drug resistance using a computed risk score among newly diagnosed TB patients in Tamaulipas, Mexico. Epidemiol Infect. 2017;145(14):3020-3034. doi:10.1017/S0950268817001911
30. Anley DT, Akalu TY, Dessie AM, et al. Prognostication of treatment non-compliance among patients with multidrug-resistant tuberculosis in the course of their follow-up: a logistic regression–based machine learning algorithm. Front Digit Health. 2023;5. doi:10.3389/FDGTH.2023.1165222
31. CDC. Health Disparities in Tuberculosis. Centers for Disease Control and Prevention. 2025. Accessed November 7, 2025. https://www.cdc.gov/tb/health-equity/index.html
32. Zhang F, Yang Z, Geng X, et al. Using Machine Learning Methods to Predict Early Treatment Outcomes for Multidrug-Resistant or Rifampicin-Resistant Tuberculosis to Enhance Patient Cure Rates: Development and Validation of Multiple Models. J Med Internet Res. 2025;27(1):e69998. doi:10.2196/69998
33. Anley DT, Akalu TY, Merid MW, Tsegaye T. Development and Validation of a Nomogram for the Prediction of Unfavorable Treatment Outcome Among Multidrug Resistant Tuberculosis Patients in North West Ethiopia: An Application of Prediction Modelling. Infect Drug Resist. 2022;15:3887-3904. doi:10.2147/IDR.S372351
