Chronic Kidney Disease Prediction with Stacked Ensemble-Based Model
Year 2023, Volume: 1 Issue: 1, 50 - 61, 31.12.2023
Erkan Akkur, Ahmet Cankat Öztürk
Abstract
Chronic kidney disease (CKD) is a significant health problem worldwide, and treating it early is crucial to prevent it from causing further problems. In recent years, researchers have applied a range of machine learning approaches to predict the disease. This paper presents a stacked ensemble model for CKD prediction. The proposed model is applied to an open-access CKD dataset, which is first made suitable for classification through several pre-processing steps. The model comprises two phases: first, predictions are made with base classifiers; then, a stacked ensemble combines these base classifiers. The recursive feature elimination technique is used to select the most discriminative features, and the optimal hyperparameters of the classification algorithms are determined with a hyperparameter optimization technique. In comparison with the base classifiers, the proposed stacked model achieves 100% accuracy. The proposed model is also compared with various approaches in the literature and achieves a high classification rate.
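The two-phase design described in the abstract can be illustrated with a minimal sketch of stacked generalization. The abstract does not name the base classifiers, the meta-learner, or the tooling, so everything below is an assumption made for illustration: `CentroidClassifier`, `LogisticMeta`, `fit_stack`, and `predict_stack` are hypothetical names, the data are synthetic rather than the CKD dataset, and pre-processing, recursive feature elimination, and hyperparameter optimization are left out. The sketch uses only the Python standard library.

```python
import math
import random


def make_toy_data(n=200, seed=0):
    """Synthetic two-class data with 3 features; a stand-in, NOT the CKD dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        shift = 2.0 * label  # features 0 and 1 are informative, feature 2 is noise
        X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


def sigmoid(s):
    # Clamp the argument so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-50.0, min(50.0, s))))


class CentroidClassifier:
    """Hypothetical base learner: nearest class centroid on a subset of features."""

    def __init__(self, features):
        self.features = features

    def fit(self, X, y):
        self.centroids = []
        for c in (0, 1):
            rows = [x for x, t in zip(X, y) if t == c]
            self.centroids.append([sum(r[f] for r in rows) / len(rows) for f in self.features])
        return self

    def predict_proba(self, x):
        d0, d1 = (sum((x[f] - m) ** 2 for f, m in zip(self.features, cen))
                  for cen in self.centroids)
        return sigmoid((d0 - d1) / 2.0)  # closer to centroid 1 -> score near 1


class LogisticMeta:
    """Hypothetical meta-learner: logistic regression fitted by batch gradient descent."""

    def __init__(self, lr=0.5, epochs=300):
        self.lr, self.epochs = lr, epochs

    def fit(self, Z, y):
        self.w, self.b = [0.0] * len(Z[0]), 0.0
        for _ in range(self.epochs):
            errs = [self.predict_proba(z) - t for z, t in zip(Z, y)]
            for j in range(len(self.w)):
                self.w[j] -= self.lr * sum(e * z[j] for e, z in zip(errs, Z)) / len(Z)
            self.b -= self.lr * sum(errs) / len(Z)
        return self

    def predict_proba(self, z):
        return sigmoid(self.b + sum(w * v for w, v in zip(self.w, z)))


def fit_stack(X, y, factories, k=5):
    """Phase 1: out-of-fold base predictions. Phase 2: meta-learner trained on them."""
    Z = [[0.0] * len(factories) for _ in X]
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        held_out = [i for i in range(len(X)) if i % k == fold]
        for j, make in enumerate(factories):
            model = make().fit([X[i] for i in train], [y[i] for i in train])
            for i in held_out:
                Z[i][j] = model.predict_proba(X[i])
    meta = LogisticMeta().fit(Z, y)
    bases = [make().fit(X, y) for make in factories]  # refit on all training rows
    return bases, meta


def predict_stack(bases, meta, x):
    return int(meta.predict_proba([b.predict_proba(x) for b in bases]) >= 0.5)


X, y = make_toy_data()
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
factories = [lambda: CentroidClassifier([0]),
             lambda: CentroidClassifier([1]),
             lambda: CentroidClassifier([0, 1, 2])]
bases, meta = fit_stack(X_train, y_train, factories)
accuracy = sum(predict_stack(bases, meta, x) == t for x, t in zip(X_test, y_test)) / len(X_test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for rows the base learners have already seen, is the usual way to keep the second phase from simply rewarding overfitted base models. The accuracy printed here describes the toy data only and says nothing about the results reported in the paper.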
Ethical Statement
Because the dataset used in this study is publicly available, ethics committee approval was not required.
Chronic Kidney Disease Prediction with a Stacked Ensemble-Based Model (Turkish title: Yığılmış Topluluk Tabanlı Model ile Kronik Böbrek Hastalığı Tahmini)
Abstract (translated from Turkish)
Chronic kidney disease (CKD) is regarded as an important health problem worldwide. Treating this disease at an early stage is very important to prevent it from causing further problems. In recent years, researchers have used different machine learning-based approaches to predict this disease. The focus of this paper is a stacked ensemble model that can be used to predict CKD. The proposed model was applied to an open-access CKD dataset. The dataset was made suitable for classification through several pre-processing steps. The proposed ensemble model consists of two phases. First, prediction was performed using base classifiers. Then, the stacked ensemble model is used to combine these base classifiers in the best way. The recursive feature elimination technique was used to select the most discriminative features. The optimal hyperparameters for the classification algorithms were determined using the hyperparameter optimization technique. Compared with the base classifiers, the proposed stacked model achieves 100% accuracy. In addition, the proposed model was evaluated against different approaches in the literature and reached a high classification rate.