International Journal For Multidisciplinary Research
E-ISSN: 2582-2160
Impact Factor: 9.24
A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
Bridging Healthcare and AI: An Interpretable and Robust Framework for Early Diabetes Prediction
| Author(s) | Ms. Girisha Arora, Ms. Keerti Mamgain, Ms. Namami Khantwal, Ms. Manisha Sharma, Dr. Yatu Rani |
|---|---|
| Country | India |
| Abstract | The rapid worldwide increase in diabetes cases underscores the urgent need for diagnostic tools that support early detection and prompt treatment. This study presents an iterative, interpretable machine learning framework for predicting diabetes risk using the PIMA Indian Diabetes Dataset. A major challenge with medical datasets is the presence of noisy data and missing values. To address this, we designed a data preprocessing pipeline that preserves the biological plausibility of the data. For example, unrealistic zero values in important medical attributes such as glucose level and diastolic blood pressure were treated as missing data and filled in using K-Nearest Neighbours (KNN) imputation. This approach estimates missing values from similar patient records instead of simple averages, which helps preserve important patterns within the dataset. In addition to data cleaning, feature scaling was applied to keep variables on a consistent scale, and new interaction features were created to better capture complex relationships between health indicators. Class imbalance was a major problem, since diabetic cases are fewer than non-diabetic cases; synthetic oversampling techniques were used to balance the dataset. This helps the model learn the patterns of the diabetic class and improves its ability to correctly identify patients at higher risk of diabetes. Rather than relying on a single algorithm, this research uses an ensemble learning approach that combines multiple models, such as kernel-based methods and boosting techniques. The predictions from these models are refined through a probability calibration step to make the outputs more reliable in a clinical context. An uncertainty estimation mechanism is included to identify predictions where the model is less confident, allowing such cases to be reviewed by medical professionals. To make the model more trustworthy, SHAP (Shapley Additive Explanations) is used to explain how each feature contributes to the final prediction. This makes it possible to understand how factors such as glucose level, BMI, or age affect the diabetes risk score for each individual patient. The model was evaluated using k-fold cross-validation to ensure robustness and reliability. Special attention was given to recall, so that the chance of missing actual diabetic patients is minimized. The proposed framework therefore acts as an interpretable and reliable decision-support system that not only predicts diabetes risk but also provides meaningful explanations behind each prediction. Such a system can support healthcare professionals in making informed and research-based clinical decisions. |
| Keywords | Diabetes Prediction, Machine Learning, Ensemble Learning, Data Preprocessing, Class Imbalance, SMOTE, ADASYN, KNN Imputation, Explainable AI, SHAP, K-fold Cross-Validation, Healthcare Analytics. |
| Field | Engineering |
| Published In | Volume 8, Issue 3, May-June 2026 |
| Published On | 2026-05-03 |
| DOI | https://doi.org/10.36948/ijfmr.2026.v08i03.76897 |
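
The imputation step described in the abstract (treating implausible zeros as missing, then filling them from similar patient records) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the column names, the toy patient records, the choice of `k=2`, and the helper functions `mark_missing`, `distance`, and `knn_impute` are all assumptions made for illustration. The distance rescales for missing coordinates, which is one common convention for KNN imputation.

```python
import math

# Columns in which a zero is treated as physiologically implausible
# (hypothetical names chosen for this sketch).
ZERO_AS_MISSING = ("glucose", "blood_pressure")


def mark_missing(records, columns=ZERO_AS_MISSING):
    """Return copies of the records with zeros in `columns` replaced by None."""
    return [
        {k: (None if k in columns and v == 0 else v) for k, v in rec.items()}
        for rec in records
    ]


def distance(a, b):
    """Euclidean distance over features present in both records,
    scaled up to compensate for the features that had to be skipped."""
    shared = [k for k in a if a[k] is not None and b[k] is not None]
    if not shared:
        return math.inf
    squared = sum((a[k] - b[k]) ** 2 for k in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_impute(records, k=2):
    """Fill each None with the mean of that feature over the k nearest
    records that have the feature observed."""
    imputed = []
    for i, rec in enumerate(records):
        filled = dict(rec)
        for col, val in rec.items():
            if val is not None:
                continue
            donors = [
                r for j, r in enumerate(records)
                if j != i and r[col] is not None
            ]
            donors.sort(key=lambda r: distance(rec, r))
            nearest = donors[:k]
            if nearest:
                filled[col] = sum(r[col] for r in nearest) / len(nearest)
        imputed.append(filled)
    return imputed


# Toy records invented for this example; they are not taken from the dataset.
patients = [
    {"glucose": 148, "blood_pressure": 72, "bmi": 33.6, "age": 50},
    {"glucose": 0,   "blood_pressure": 70, "bmi": 33.0, "age": 49},
    {"glucose": 85,  "blood_pressure": 66, "bmi": 26.6, "age": 31},
    {"glucose": 150, "blood_pressure": 74, "bmi": 34.0, "age": 52},
    {"glucose": 89,  "blood_pressure": 0,  "bmi": 28.1, "age": 21},
]

filled = knn_impute(mark_missing(patients), k=2)
print(filled[1]["glucose"], filled[4]["blood_pressure"])
```

In this sketch the features are left unscaled, so age and glucose dominate the distance; in practice one would likely scale features before imputing, and a library routine such as scikit-learn's `KNNImputer` is the more usual choice than hand-written code.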
CrossRef DOI is assigned to each research paper published in our journal.
IJFMR DOI prefix is 10.36948/ijfmr.
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.
Powered by Sky Research Publication and Journals