An Ensembled Tabnet-Based Model Approach for Diabetes Disease Classification

Duncan Ogindo Obunge
Lawrence Muriira
Vincent Mbandu

Abstract

Despite advances in machine learning (ML) for classification tasks, accurately classifying diseases from limited-feature medical datasets remains challenging. Traditional ML models struggle with interpretability, which motivates the exploration of novel techniques. This research developed and evaluated a novel TabNet-based ensemble model for diabetes classification and compared its performance against Extreme Gradient Boosting (XGBoost), Random Forest, and base TabNet models. The study used the PIMA Indian Diabetes dataset from a public ML repository, which contains 768 records (8 features and 1 outcome variable). The TabNet-based ensemble model was built using a weighted averaging strategy. For comparative analysis, baseline models (XGBoost, Random Forest, and a standalone TabNet model) were also implemented and optimized. Model performance was assessed using balanced accuracy, precision and recall (class 1), F1 score, and Receiver Operating Characteristic-Area Under the Curve (ROC-AUC). The ensembled TabNet-based model achieved the highest balanced accuracy, precision, recall, and F1 score: balanced accuracy of 83%, precision of 84%, recall of 89%, F1 score of 84%, and ROC-AUC of 90.4%, compared to XGBoost (accuracy 81%, precision 79%, recall 86%, F1 score 81%, ROC-AUC 88.6%), Random Forest (accuracy 81%, precision 78%, recall 87%, F1 score 81%, ROC-AUC 91.6%), and base TabNet (accuracy 81%, precision 80%, recall 82%, F1 score 81%, ROC-AUC 86.7%). On ROC-AUC alone, Random Forest (91.6%) scored higher than the ensemble (90.4%). The study recommends that healthcare institutions adopt the validated ensemble TabNet-based architecture as a standardized framework for clinical decision support systems across multiple diseases. Further, researchers should establish this methodology as the preferred approach for limited-feature medical datasets, extending beyond diabetes to cardiovascular disease, hypertension, and cancer screening applications.
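
The weighted averaging strategy and the balanced accuracy metric named in the abstract can be illustrated with a minimal sketch. The abstract does not state which models form the ensemble, what weights were used, or what decision threshold was applied. The three member probability lists, the weights, the 0.5 threshold, and the helper names `weighted_average_proba` and `balanced_accuracy` are therefore assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def weighted_average_proba(
    model_probas: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-model positive-class probabilities with normalized weights."""
    if len(model_probas) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(model_probas[0])
    return [
        sum(w * probas[i] for w, probas in zip(weights, model_probas)) / total
        for i in range(n_samples)
    ]


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity (class 1 recall) and specificity (class 0 recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


# Hypothetical positive-class probabilities from three ensemble members
# (made-up values, not outputs of the models evaluated in the study).
member_probas = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.3, 0.4, 0.1],
    [0.7, 0.1, 0.7, 0.3],
]
member_weights = [0.5, 0.3, 0.2]  # illustrative only, not from the study

ensemble_proba = weighted_average_proba(member_probas, member_weights)
ensemble_pred = [1 if p >= 0.5 else 0 for p in ensemble_proba]  # assumed threshold
y_true = [1, 0, 1, 1]  # made-up labels
score = balanced_accuracy(y_true, ensemble_pred)
```

In practice the weights would typically be tuned on a validation split, for example in proportion to each member's validation score. Balanced accuracy is a natural headline metric when the two outcome classes are unequal in size, because it weights the recall of each class equally.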

How to Cite
Obunge, D. O., Muriira, L., & Mbandu, V. (2025). An Ensembled Tabnet-Based Model Approach for Diabetes Disease Classification. International Journal of Professional Practice, 13(4), 16–24. https://doi.org/10.71274/ijpp.v13i4.559
