Research Article

Enhancing Multi-Disease Prediction with Machine Learning: A Comparative Analysis and Hyperparameter Optimization Approach

Year 2025, Volume: 13, Issue: 1, pp. 367–381, 24.03.2025
https://doi.org/10.29109/gujsc.1489959

Abstract

Although traditional methods based on statistical parameters remain important in healthcare, machine learning (ML) algorithms offer promising results for analyzing health data. This study therefore evaluates the performance of several supervised ML models with hyperparameter optimization (HPO) for predicting multiple diseases: diabetes, heart disease, Parkinson's disease, and breast cancer.
We evaluated seven distinct algorithms: Logistic Regression (LR), Gradient Boosting (GB), k-Nearest Neighbors (k-NN), Extreme Gradient Boosting (XGB), Support Vector Machines (SVM), Random Forests (RF), and a simple neural network used as a nonlinear mapping technique. Each algorithm was trained and compared in isolation for each targeted health condition, and performance was assessed with standard metrics: accuracy, precision, recall, and F1-score. Hyperparameter optimization was then applied to each algorithm and its effect on performance was observed. The results show the potential of ML for multi-disease prediction, with individual models achieving high accuracy for specific diseases: SVM reached 100% accuracy for heart disease, Gradient Boosting 90% for diabetes, a simple neural network 99% for breast cancer, and Random Forest 100% for Parkinson's disease. These results emphasize the importance of selecting an appropriate model for each disease prediction task.
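
To make the described workflow concrete, here is a minimal sketch of such a per-disease comparison pipeline, assuming scikit-learn and using its built-in breast-cancer dataset as a stand-in for the study's datasets. The candidate models, hyperparameter grids, and split settings below are illustrative assumptions, not the authors' published configuration.

```python
# Hedged sketch: per-disease model comparison with grid-search HPO.
# Not the authors' code; the dataset, grids, and split are stand-ins.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Candidate models with illustrative (assumed) hyperparameter grids.
candidates = {
    "LR":  (LogisticRegression(max_iter=5000),
            {"clf__C": [0.01, 0.1, 1, 10]}),
    "SVM": (SVC(),
            {"clf__C": [0.1, 1, 10], "clf__kernel": ["rbf", "linear"]}),
    "RF":  (RandomForestClassifier(random_state=42),
            {"clf__n_estimators": [100, 300], "clf__max_depth": [None, 10]}),
    "GB":  (GradientBoostingClassifier(random_state=42),
            {"clf__learning_rate": [0.05, 0.1], "clf__n_estimators": [100, 200]}),
}

for name, (model, grid) in candidates.items():
    # Scale features, then tune each classifier with 5-fold CV on the train split.
    pipe = Pipeline([("scale", StandardScaler()), ("clf", model)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="f1")
    search.fit(X_tr, y_tr)
    pred = search.predict(X_te)
    print(f"{name}: acc={accuracy_score(y_te, pred):.3f} "
          f"prec={precision_score(y_te, pred):.3f} "
          f"rec={recall_score(y_te, pred):.3f} "
          f"f1={f1_score(y_te, pred):.3f} best={search.best_params_}")
```

Running the same loop once per disease dataset, with and without the search step, would reproduce the kind of before/after HPO comparison the abstract reports.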
A web-based application was also developed so that users can select a disease, provide the relevant inputs, and receive a prediction from the chosen model. In conclusion, this study highlights the potential of machine learning and hyperparameter optimization for multi-disease prediction and underlines the importance of model selection.
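
Below is a minimal sketch of how such a web front end could route a request to a per-disease model, assuming Flask and joblib; the model filenames, route, and JSON schema are hypothetical illustrations, not the application the authors built.

```python
# Hedged sketch of a disease-prediction endpoint. The model files and the
# request/response schema are assumptions for illustration only.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical per-disease models saved by the training step above.
MODELS = {name: joblib.load(f"{name}_model.joblib")
          for name in ("diabetes", "heart", "breast_cancer", "parkinsons")}

@app.route("/predict/<disease>", methods=["POST"])
def predict(disease):
    model = MODELS.get(disease)
    if model is None:
        return jsonify(error=f"unknown disease '{disease}'"), 404
    # Expect a JSON body like {"features": [5.1, 116.0, ...]}.
    features = request.get_json()["features"]
    label = int(model.predict([features])[0])
    return jsonify(disease=disease, prediction=label)

if __name__ == "__main__":
    app.run(debug=True)
```

A POST to `/predict/diabetes` with a feature vector would then return the chosen model's predicted class label as JSON.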

Ethical Statement

The authors declare that there is no conflict of interest.

Supporting Institution

Karabuk University Scientific Research Projects Coordination Department

Project Number

KBÜBAP-24-YL-065

Acknowledgements

This study was supported by the Karabuk University Scientific Research Projects Coordination Department under project code KBÜBAP-24-YL-065.

References

  • [1] N. Aydın Atasoy and F. Çakmak, “Web Tabanlı Sürücü Davranışları Analiz Uygulaması,” Gazi Journal of Engineering Sciences, vol. 7, no. 3, pp. 264–276, Dec. 2021, doi: 10.30855/gmbd.2021.03.09.
  • [2] E. Dikbıyık, Ö. Demir, and B. Doğan, “Derin Öğrenme Yöntemleri İle Konuşmadan Duygu Tanıma Üzerine Bir Literatür Araştırması,” Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji, vol. 10, no. 4, pp. 765–791, Dec. 2022, doi: 10.29109/gujsc.1111884.
  • [3] Ö. Tonkal and H. Polat, “Traffic Classification and Comparative Analysis with Machine Learning Algorithms in Software Defined Networks,” Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji, vol. 9, no. 1, pp. 71–83, Mar. 2021, doi: 10.29109/gujsc.869418.
  • [4] M. B. Er, “Akciğer Seslerinin Derin Öğrenme İle Sınıflandırılması,” Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji, vol. 8, no. 4, pp. 830–844, Dec. 2020, doi: 10.29109/gujsc.758325.
  • [5] R. Alanazi, “Identification and Prediction of Chronic Diseases Using Machine Learning Approach,” J Healthc Eng, vol. 2022, 2022, doi: 10.1155/2022/2826127.
  • [6] I. D. Mienye, Y. Sun, and Z. Wang, “An improved ensemble learning approach for the prediction of heart disease risk,” Inform Med Unlocked, vol. 20, Jan. 2020, doi: 10.1016/j.imu.2020.100402.
  • [7] S. Dhabarde, R. Mahajan, S. Mishra, S. Chaudhari, S. Manelu, and N. S. Shelke, “Disease Prediction Using Machine Learning Algorithms,” [Online]. Available: www.irjmets.com
  • [8] S. Vilas and A. M. S. Scholar, “Diseases Prediction Model using Machine Learning Technique”, doi: 10.32628/IJSRST.
  • [9] A. Mujumdar and V. Vaidehi, “Diabetes Prediction using Machine Learning Algorithms,” in Procedia Computer Science, Elsevier B.V., 2019, pp. 292–299. doi: 10.1016/j.procs.2020.01.047.
  • [10] T. H. H. Aldhyani, A. S. Alshebami, and M. Y. Alzahrani, “Soft Clustering for Enhancing the Diagnosis of Chronic Diseases over Machine Learning Algorithms,” J Healthc Eng, vol. 2020, 2020, doi: 10.1155/2020/4984967.
  • [11] S. F. Weng, J. Reps, J. Kai, J. M. Garibaldi, and N. Qureshi, “Can Machine-learning improve cardiovascular risk prediction using routine clinical data?,” PLoS One, vol. 12, no. 4, Apr. 2017, doi: 10.1371/JOURNAL.PONE.0174944.
  • [12] S. Nusinovici et al., “Logistic regression was as good as machine learning for predicting major chronic diseases,” J Clin Epidemiol, vol. 122, pp. 56–69, Jun. 2020, doi: 10.1016/J.JCLINEPI.2020.03.002.
  • [13] J. Al Nahian, A. K. M. Masum, S. Abujar, and M. J. Mia, “Common human diseases prediction using machine learning based on survey data,” Bulletin of Electrical Engineering and Informatics, vol. 11, no. 6, pp. 3498–3508, Dec. 2022, doi: 10.11591/eei.v11i6.3405.
  • [14] N. Aydin Atasoy and A. Faris Abdulla Al Rahhawi, “Examining the classification performance of pre-trained capsule networks on imbalanced bone marrow cell dataset,” International Journal of Imaging Systems and Technology , vol. 34, no. 3, May 2024, doi: 10.1002/ima.23067.
  • [15] J. Bergstra and Y. Bengio, “Random Search for Hyper-Parameter Optimization,” Journal of Machine Learning Research, vol. 13, pp. 281–305, 2012.
  • [16] M. Claesen and B. De Moor, “Hyperparameter Search in Machine Learning,” Feb. 2015, [Online]. Available: http://arxiv.org/abs/1502.02127
  • [17] Y. A. Ali, E. M. Awwad, M. Al-Razgan, and A. Maarouf, “Hyperparameter Search for Machine Learning Algorithms for Optimizing the Computational Complexity,” Processes, vol. 11, no. 2, Feb. 2023, doi: 10.3390/pr11020349.
  • [18] A. E. W. Johnson et al., “MIMIC-III, a freely accessible critical care database,” Sci Data, vol. 3, May 2016, doi: 10.1038/sdata.2016.35.
  • [19] J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian Optimization of Machine Learning Algorithms,” in Advances in Neural Information Processing Systems, vol. 25, 2012.
  • [20] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic Minority Over-sampling Technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
  • [21] M. ÇOLAK, T. TÜMER SİVRİ, N. PERVAN AKMAN, A. BERKOL, and Y. EKİCİ, “Disease prognosis using machine learning algorithms based on new clinical dataset,” Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering, vol. 65, no. 1, pp. 52–68, Jun. 2023, doi: 10.33769/aupse.1215962.
  • [22] F. A. Latifah, I. Slamet, and Sugiyanto, “Comparison of heart disease classification with logistic regression algorithm and random forest algorithm,” AIP Conf Proc, vol. 2296, Nov. 2020, doi: 10.1063/5.0030579.
  • [23] R. Valarmathi and T. Sheela, “Heart disease prediction using hyper parameter optimization (HPO) tuning,” Biomed Signal Process Control, vol. 70, p. 103033, Sep. 2021, doi: 10.1016/J.BSPC.2021.103033.
  • [24] M. Feurer and F. Hutter, “Hyperparameter Optimization,” in Automated Machine Learning, 2019, pp. 3–33. doi: 10.1007/978-3-030-05318-5_1.
  • [25] B. Bischl, J. Richter, J. Bossek, D. Horn, J. Thomas, and M. Lang, “mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions,” Mar. 2017, [Online]. Available: http://arxiv.org/abs/1703.03373
  • [26] G. Luo, “A review of automatic selection methods for machine learning algorithms and hyper-parameter values,” Network Modeling Analysis in Health Informatics and Bioinformatics, vol. 5, no. 1, Dec. 2016, doi: 10.1007/s13721-016-0125-6.
  • [27] P. Probst and B. Bischl, “Tunability: Importance of Hyperparameters of Machine Learning Algorithms,” 2019. [Online]. Available: http://jmlr.org/papers/v20/18-444.html.
  • [28] L. Yang and A. Shami, “On hyperparameter optimization of machine learning algorithms: Theory and practice,” Neurocomputing, vol. 415, pp. 295–316, Nov. 2020, doi: 10.1016/j.neucom.2020.07.061.
  • [29] D. J. Hand, “Measuring classifier performance: A coherent alternative to the area under the ROC curve,” Mach Learn, vol. 77, no. 1, pp. 103–123, Oct. 2009, doi: 10.1007/s10994-009-5119-5.
  • [30] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. (Springer Series in Statistics). New York: Springer, 2009.
  • [31] A. Y. Ng, “Feature selection, L 1 vs. L 2 regularization, and rotational invariance,” in Twenty-first international conference on Machine learning - ICML ’04, New York, New York, USA: ACM Press, 2004, p. 78. doi: 10.1145/1015330.1015435.
  • [32] H. Zou and T. Hastie, “Regularization and variable selection via the elastic net,” J R Stat Soc Series B Stat Methodol, vol. 67, no. 2, pp. 301–320, 2005, doi: 10.1111/j.1467-9868.2005.00503.x.
  • [33] F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
  • [34] G. C. Cawley and N. L. C. Talbot, “On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation,” Journal of Machine Learning Research, vol. 11, pp. 2079–2107, 2010.
  • [35] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
  • [36] L. Breiman, “Random Forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001, doi: 10.1023/A:1010933404324.
  • [37] P. Probst, M. N. Wright, and A. L. Boulesteix, “Hyperparameters and tuning strategies for random forest,” in Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 9, no. 3, Wiley-Blackwell, 2019. doi: 10.1002/widm.1301.
  • [38] T. M. Oshiro, P. S. Perez, and J. A. Baranauskas, “How Many Trees in a Random Forest?,” 2012, pp. 154–168. doi: 10.1007/978-3-642-31537-4_13.
  • [39] G. Biau and E. Scornet, “A random forest guided tour,” Test, vol. 25, no. 2, pp. 197–227, Jun. 2016, doi: 10.1007/s11749-016-0481-7.
  • [40] G. Louppe, “Understanding Random Forests: From Theory to Practice,” Jul. 2014, [Online]. Available: http://arxiv.org/abs/1407.7502
  • [41] C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995, doi: 10.1007/BF00994018.
  • [42] B. Schölkopf and A. J. Smola, “Kernel Methods,” in Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2001, pp. 405–406.
  • [43] C.-C. Chang and C.-J. Lin, “LIBSVM: A Library for Support Vector Machines,” 2001. [Online]. Available: www.csie.ntu.edu.tw/
  • [44] M. M. Deza and E. Deza, Encyclopedia of distances. Springer Berlin Heidelberg, 2009. doi: 10.1007/978-3-642-00234-2.
  • [45] S. A. Dudani, “The Distance-Weighted k-Nearest-Neighbor Rule,” IEEE Trans Syst Man Cybern, vol. SMC-6, no. 4, pp. 325–327, 1976, doi: 10.1109/TSMC.1976.5408784.
  • [46] R. J. Samworth, “Optimal weighted nearest neighbour classifiers,” Ann Stat, vol. 40, no. 5, pp. 2733–2763, Oct. 2012, doi: 10.1214/12-AOS1049.
  • [47] J. H. Friedman, “Stochastic gradient boosting,” Computational Statistics & Data Analysis, vol. 38, no. 4, pp. 367–378, 2002.
  • [48] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug. 2016, pp. 785–794, doi: 10.1145/2939672.2939785.
  • [49] T. Chen and T. He, “xgboost: eXtreme Gradient Boosting,” R package, 2024.
  • [50] A. Natekin and A. Knoll, “Gradient boosting machines, a tutorial,” Front Neurorobot, vol. 7, no. DEC, 2013, doi: 10.3389/fnbot.2013.00021.
  • [51] Proceedings of the International Conference on Computing, Communication and Automation (ICCCA 2016), Galgotias University, Greater Noida, India, Apr. 29–30, 2016.
  • [52] S. F. Weng, J. Reps, J. Kai, J. M. Garibaldi, and N. Qureshi, “Can Machine-learning improve cardiovascular risk prediction using routine clinical data?,” PLoS One, vol. 12, no. 4, Apr. 2017, doi: 10.1371/JOURNAL.PONE.0174944.
  • [53] S. Nusinovici et al., “Logistic regression was as good as machine learning for predicting major chronic diseases,” J Clin Epidemiol, vol. 122, pp. 56–69, Jun. 2020, doi: 10.1016/J.JCLINEPI.2020.03.002.
  • [54] W. Wang, J. Lee, F. Harrou, and Y. Sun, “Early Detection of Parkinson’s Disease Using Deep Learning and Machine Learning,” IEEE Access, vol. 8, pp. 147635–147646, 2020, doi: 10.1109/ACCESS.2020.3016062.
  • [55] E. Kabir Hashi and M. Shahid Uz Zaman, “Developing a Hyperparameter Tuning Based Machine Learning Approach of Heart Disease Prediction,” Journal of Applied Science & Process Engineering, vol. 7, no. 2, 2020.
  • [56] D. Hamid, S. S. Ullah, J. Iqbal, S. Hussain, C. A. U. Hassan, and F. Umar, “A Machine Learning in Binary and Multiclassification Results on Imbalanced Heart Disease Data Stream,” J Sens, vol. 2022, 2022, doi: 10.1155/2022/8400622.
  • [57] M. Wang, Z. Wei, M. Jia, L. Chen, and H. Ji, “Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records,” BMC Med Inform Decis Mak, vol. 22, no. 1, Dec. 2022, doi: 10.1186/s12911-022-01776-y.
  • [58] C. Leys, C. Ley, O. Klein, P. Bernard, and L. Licata, “Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median,” J Exp Soc Psychol, vol. 49, no. 4, pp. 764–766, Jul. 2013, doi: 10.1016/j.jesp.2013.03.013.
  • [59] A. Geron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, 2nd ed. O’Reilly Media, Inc., 2019.
  • [60] P. Refaeilzadeh, L. Tang, and H. Liu, “Cross-Validation,” Encyclopedia of Database Systems, pp. 532–538, 2009, doi: 10.1007/978-0-387-39940-9_565.
  • [61] “View of SMOTE: Synthetic Minority Over-sampling Technique.” Accessed: Feb. 05, 2025. [Online]. Available: https://www.jair.org/index.php/jair/article/view/10302/24590
There are 61 references in total.

Details

Primary Language: English
Subjects: Biomedical Diagnosis
Section: Tasarım ve Teknoloji (Design and Technology)
Authors

Mariam Kili Bechir 0000-0001-8053-3875

Ferhat Atasoy 0000-0002-1672-0593

Project Number: KBÜBAP-24-YL-065
Early View Date: February 20, 2025
Publication Date: March 24, 2025
Submission Date: May 26, 2024
Acceptance Date: February 9, 2025
Published Issue: Year 2025, Volume: 13, Issue: 1

Cite

APA Bechir, M. K., & Atasoy, F. (2025). Enhancing Multi-Disease Prediction with Machine Learning: A Comparative Analysis and Hyperparameter Optimization Approach. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji, 13(1), 367-381. https://doi.org/10.29109/gujsc.1489959



e-ISSN: 2147-9526