Bridging Healthcare and AI: An Interpretable and Robust Framework for Early Diabetes Prediction

Girisha Arora; Keerti Mamgain; Namami Khantwal; Manisha Sharma; Yatu Rani

doi:10.36948/ijfmr.2026.v08i03.76897

Author(s)	Ms. Girisha Arora, Ms. Keerti Mamgain, Ms. Namami Khantwal, Ms. Manisha Sharma, Dr. Yatu Rani
Country	India
Abstract	The rapid worldwide increase in diabetes cases underscores the urgent need for diagnostic tools that support early detection and prompt treatment. This study presents an iterative and interpretable machine learning framework for predicting diabetes risk using the PIMA Indian Diabetes Dataset. A major challenge with medical datasets is the presence of noisy data and missing values. To address this, we built a data preprocessing pipeline that preserves the biological meaning of the data. For example, physiologically implausible zero values in important medical attributes such as glucose level and diastolic blood pressure were treated as missing data and filled using K-Nearest Neighbours (KNN) imputation. This method estimates each missing value from similar patient records rather than from a simple average, which helps preserve important patterns within the dataset. In addition to data cleaning, feature scaling was applied to keep the variables on a consistent scale, and new interaction features were created to better capture complex relationships between different health indicators. Class imbalance was another major problem, because diabetic cases are fewer than non-diabetic cases; synthetic oversampling techniques were used to balance the dataset. This helps the model learn the patterns of the diabetic class and improves its ability to correctly identify patients at higher risk of diabetes. Rather than relying on a single algorithm, this research uses an ensemble learning approach that combines multiple models, such as kernel-based methods and boosting techniques. The predictions from these models are refined through a probability calibration step to make the outputs more reliable and accurate in a clinical context.
An uncertainty estimation mechanism is included to identify predictions where the model is less confident, so that such cases can be reviewed by medical professionals. To make the model more trustworthy, SHAP (SHapley Additive exPlanations) is used to explain how each feature contributes to the final prediction. This makes it possible to understand how factors such as glucose level, BMI, or age affect the diabetes risk score of each individual patient. The model was evaluated using k-fold cross-validation to ensure robustness and reliability. Special attention was given to improving recall, so that the chance of missing actual diabetic patients is minimized. The conceptual model therefore acts as an interpretable and reliable decision-support system that not only predicts diabetes risk but also provides a meaningful explanation for each prediction. Such a system can support healthcare professionals in making informed, evidence-based clinical decisions.
Keywords	Diabetes Prediction, Machine Learning, Ensemble Learning, Data Preprocessing, Class Imbalance, SMOTE, ADASYN, KNN Imputation, Explainable AI, SHAP, K-fold Cross-Validation, Healthcare Analytics.
Field	Engineering
Published In	Volume 8, Issue 3, May-June 2026
Published On	2026-05-03
DOI	https://doi.org/10.36948/ijfmr.2026.v08i03.76897
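
The KNN imputation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which this page does not give. The function name knn_impute, the three-column layout (glucose, diastolic blood pressure, BMI), the sample values, the choice k=2, and the distance rule (Euclidean distance over the features two records both have, rescaled by the share of usable features) are all assumptions made for illustration. In practice a library routine such as scikit-learn's KNNImputer would typically be used, and features would be scaled before distances are computed.

```python
import math


def knn_impute(rows, missing_cols, k=2):
    """Fill implausible zeros in `missing_cols` with the mean of the k nearest records.

    Hypothetical helper for illustration only. Values that have no usable
    neighbour are left as None.
    """
    # Step 1: treat zeros in the chosen columns as missing (None).
    data = [
        [None if (j in missing_cols and v == 0) else float(v) for j, v in enumerate(row)]
        for row in rows
    ]

    def distance(a, b):
        # Euclidean distance over features observed in both records,
        # rescaled by the fraction of features that could be compared.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * len(a) / len(shared))

    # Step 2: fill each missing value from the k most similar records
    # that actually have a value in that column.
    imputed = [row[:] for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (distance(row, other), other[j])
                for idx, other in enumerate(data)
                if idx != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d != math.inf]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Made-up records: [glucose, diastolic blood pressure, BMI].
records = [
    [148, 72, 33.6],
    [85, 66, 26.6],
    [0, 64, 23.3],   # glucose recorded as 0, treated as missing
    [89, 66, 28.1],
    [137, 0, 43.1],  # blood pressure recorded as 0, treated as missing
]
filled = knn_impute(records, missing_cols={0, 1}, k=2)
print(filled[2][0], filled[4][1])
```

Here the missing glucose value is taken from the two records whose blood pressure and BMI are closest, rather than from the column mean, which is the behaviour the abstract contrasts with simple averaging.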

E-ISSN 2582-2160

All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.

International Journal For Multidisciplinary Research

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
