Metabolomics-based prediction model for diabetes: A comprehensive analysis of biomarkers and machine learning approaches

Doaa  Farid; Farhana  Saleh; Tareq  Mohammed; Atika  Nigar; Abderrahmane  Maaradji

doi:10.11134/btp.3s.2025.14

Authors

Abstract

Aims: To develop a prediction model for diabetes using metabolomics data and to evaluate various machine learning approaches and identify the most effective framework for disease prediction.

Methods: A comprehensive analysis was conducted on the Qatar Biobank dataset comprising metabolomics profiles, instrument measurements, and clinical diagnoses from 450 Qatari nationals. Targeted metabolites were selected based on correlation strength with diabetes status. Five machine learning models (Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, and Neural Network) were evaluated for their predictive performance using metrics including accuracy, precision, recall, F1 score, and ROC AUC.

Results: Among 450 individuals, 9.33 % (n = 42) were diagnosed with diabetes. Correlation analysis identified 140 metabolites significantly associated with diabetes status (p < 0.05). The most strongly correlated metabolites included glucose (r = 0.281, p < 0.0001), mannose (r = 0.247, p < 0.0001), and 1,5-anhydroglucitol (r = −0.297, p < 0.0001). Logistic Regression demonstrated superior performance with the highest accuracy (93.3 %), F1 score (0.625), and ROC AUC (0.941) compared to other models.

Conclusion: Metabolomics data can effectively predict diabetes status, with logistic regression providing the optimal balance of performance and interpretability. The identified metabolites offer potential biomarkers for early diabetes detection and monitoring. This model could serve as a valuable tool for clinical risk assessment and personalized preventive interventions.

Keywords: Diabetes mellitus, Metabolomics, Machine learning, Biomarkers, Prediction model, Logistic regression

PDF

Published: Nov 14, 2025

DOI

https://doi.org/10.11134/btp.3s.2025.14

Issue

No. 3s (2025)

Section

Articles

How to Cite

Farid, D. ., Saleh, F. ., Mohammed, T. ., Nigar, A. ., & Maaradji, A. . (2025). Metabolomics-based prediction model for diabetes: A comprehensive analysis of biomarkers and machine learning approaches. Eurasian Journal of Applied Biotechnology, (3s). https://doi.org/10.11134/btp.3s.2025.14

Main Article Content

Authors

Abstract

Article Details

Article Sidebar