Performance of Machine Learning Methods in Determining Stroke Risk: A Comparative Study
Year 2021, Volume: IDAP-2021: 5th International Artificial Intelligence and Data Processing Symposium, Issue: Special, pp. 274-287, 20.10.2021
Özer Oğuz, Suat Bayır, Hasan Badem
Abstract
Stroke is a sudden crisis that occurs when blood flow to a certain region of the brain or heart is reduced or interrupted. Stroke, one of the most common causes of death worldwide, is also known to cause permanent disability. Determining stroke risk in advance is therefore very important for reducing the risk of death or permanent disability. In this study, 13 different machine learning methods were applied to the early diagnosis and risk classification of stroke, and experimental results were obtained. These results were evaluated against several comparison criteria to identify the most successful model. The Random Forest Classifier was found to be the most successful method, with an accuracy of 99.425%.
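As a rough illustration of how classifiers can be compared on several criteria, the sketch below computes accuracy, precision, recall, and F1 from lists of true and predicted labels and then ranks the methods by accuracy. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract names only accuracy, so the other three criteria are assumed here, and the labels, the method names `method_a` and `method_b`, and the helper functions `evaluate` and `rank_by_accuracy` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute common binary-classification criteria from two label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with `positive` as the class of interest.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Scores(accuracy, precision, recall, f1)


def rank_by_accuracy(results):
    """Sort {method name: Scores} from highest to lowest accuracy."""
    return sorted(results.items(), key=lambda item: item[1].accuracy, reverse=True)


# Toy labels (1 = stroke, 0 = no stroke), invented purely for illustration.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
predictions = {
    "method_a": [1, 0, 0, 1, 0, 0, 0, 0],
    "method_b": [1, 1, 0, 0, 0, 1, 1, 0],
}
results = {name: evaluate(y_true, y_pred) for name, y_pred in predictions.items()}
for name, scores in rank_by_accuracy(results):
    print(name, scores)
```

In practice, the predictions for each method would come from a trained model evaluated on held-out data, and the same scoring function would be applied to every method so that the comparison is like for like.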