International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal


A Review of Explainable Machine Learning Frameworks for Early Diabetes Prediction Using Gradient Boosting and SHAP Analysis

Author(s) Mr. Abhishek Sharma, Dr. Atul Barve
Country India
Abstract Diabetes mellitus is one of the fastest-growing chronic metabolic conditions worldwide and, because of its rising complication rates and treatment costs, a major public health concern. A growing body of research indicates that diabetes prevalence has risen significantly over the last two to three decades, particularly in low- and middle-income countries [1], [2]. Early detection is therefore important because it creates an opportunity for preventive intervention and for slowing disease progression. Traditional diagnosis relies on laboratory-based clinical tests that are typically conducted only after metabolic abnormalities have appeared [17], [18].

Recently, researchers have applied machine learning (ML) to diabetes prediction on structured clinical data and reported better predictive performance than traditional statistical models [27], [28]. However, many high-performing ML models operate as "black boxes" and do not clearly explain which factors contributed to their predictions. This lack of transparency is a primary concern for clinical application: it limits clinicians' trust in the models and, ultimately, their widespread use in clinical decision-making [5], [40].

This paper presents a comprehensive and systematic review of machine learning-based approaches for early prediction of diabetes onset, with a focus on ensemble learning techniques and explainable AI (XAI). It surveys the datasets, data preprocessing strategies, class-balancing methods, predictive modeling techniques, and evaluation metrics reported in prior studies. The review gives particular attention to gradient boosting-based models, because of their high predictive performance on structured clinical data [8], [9], and to SHAP (SHapley Additive exPlanations), a widely accepted XAI method based on cooperative game theory [12].
Keywords Diabetes Mellitus, Machine Learning, Explainable Artificial Intelligence, Gradient Boosting, SHAP, Healthcare Analytics, Clinical Decision Support Systems.
Field Computer > Artificial Intelligence / Simulation / Virtual Reality
Published In Volume 8, Issue 3, May-June 2026
Published On 2026-05-22
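
The abstract describes SHAP as an XAI method grounded in cooperative game theory. As a rough illustration of that idea only, the sketch below computes exact Shapley values by brute force for a tiny scoring function. Everything in it is an assumption made for illustration: toy_risk_score is a hypothetical stand-in for a trained model (it is not a gradient boosting model and is not taken from the reviewed studies), and the feature names, coefficients, and patient and baseline values are invented. Practical SHAP implementations rely on much faster, model-specific algorithms; this version enumerates every feature coalition and is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def toy_risk_score(x):
    """Hypothetical stand-in for a trained model (illustrative only)."""
    score = 0.02 * x["glucose"] + 0.01 * x["bmi"]
    # Interaction term: extra risk only when both values are elevated.
    if x["glucose"] > 125 and x["bmi"] > 30:
        score += 0.5
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline).

    Features inside a coalition take the instance's value; features
    outside it take the baseline value.
    """
    features = list(instance)
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = {
                    g: instance[g] if g in subset else baseline[g]
                    for g in features
                }
                with_f = dict(without_f)
                with_f[f] = instance[f]
                # Marginal contribution of f to this coalition.
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi


# Invented example values, not clinical reference ranges.
instance = {"glucose": 150, "bmi": 35, "age": 50}
baseline = {"glucose": 100, "bmi": 25, "age": 40}

phi = shapley_values(toy_risk_score, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

Two properties are visible in this toy case: the attributions sum to the difference between the score for the instance and the score for the baseline, and a feature the function never reads (age here) receives an attribution of zero. The interaction bonus is split evenly between glucose and bmi because neither triggers it alone.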
