Research Article

Çok Sınıflı ve Dengesiz Eğitimsel Veri Kümesiyle Yükseköğretim Planlama ve Karar Destek Sistemi: Teknoloji Fakültesi Örneği

Year 2023, Volume: 9 Issue: 1, 63 - 78, 30.04.2023



Higher Education Planning and Decision Support System with Multi-Class and Imbalanced Educational Dataset: A Case Of Technology Faculty


Abstract

Studies on academic performance prediction, a sub-branch of Educational Data Mining, have increased in recent years. In real-world settings, educational datasets usually exhibit class imbalance and have a multi-class target variable, yet few studies work with such datasets. In this study, conducted under ethics approval no. 23.05.2022-286783, data on students of the Marmara University (MU) Faculty of Technology (TF) were used to predict student graduation status from a multi-class, imbalanced educational dataset in order to identify at-risk students. Data preprocessing and feature selection (FS) yielded 1394 samples and 11 features. Data on 153 students from 2016 were set aside for a robustness check. Using 7 different FS methods, 3 datasets containing 11, 7, and 5 features were created. With 9 sampling methods and 16 machine learning algorithms, 750 distinct models were built, and the models were checked for robustness. The F1 score with repeated stratified 5×5-fold cross-validation (CV) was used as the performance measure, and hyperparameters were tuned with GridSearchCV. Although RandomOverSampler+RandomForest (ROS+RF) achieved the highest F1 score, 0.9935, the most successful and most consistent models were the 7-feature None+ExtraTrees (ET), None+MLP, None+Bagging_DecisionTree (Bagging_DT), and None+RandomForest (RF) models. A decision support system web application was developed with these models and made available to MU TF faculty members.
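The abstract names random oversampling (ROS), the F1 score, and repeated stratified 5×5-fold cross-validation, but gives no implementation details. As a rough illustration only, the standard-library Python sketch below shows what each of these three building blocks does. It is not the authors' code, which the abstract suggests relied on library tools such as GridSearchCV. The function names, the class labels in the demo, and the choice of macro averaging for the F1 score are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class is as large as the largest one (random oversampling)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores.
    Macro averaging is an assumption; the abstract only says 'F1 Score'."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def repeated_stratified_folds(y, n_splits=5, n_repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs. Each repeat reshuffles the indices
    within every class, then deals them round-robin into the folds so each
    fold keeps roughly the original class proportions."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        by_class = defaultdict(list)
        for i, label in enumerate(y):
            by_class[label].append(i)
        folds = [[] for _ in range(n_splits)]
        for idxs in by_class.values():
            rng.shuffle(idxs)
            for pos, i in enumerate(idxs):
                folds[pos % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, test


if __name__ == "__main__":
    # Hypothetical, deliberately imbalanced three-class labels.
    y = ["graduated"] * 8 + ["extended"] * 3 + ["dropped"] * 1
    X = [[i] for i in range(len(y))]
    X_bal, y_bal = random_oversample(X, y)
    print(Counter(y), "->", Counter(y_bal))
    print(sum(1 for _ in repeated_stratified_folds(y)), "train/test splits")
```

When resampling is combined with cross-validation, oversampling is normally applied only to the training indices of each split, so that duplicated rows cannot leak into the corresponding test fold.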


Details

Primary Language: Turkish
Subjects: Computer Software
Journal Section: Research Articles

Authors:
Esra Yılmaz (ORCID 0000-0003-2411-4937)
Zehra Aysun Altıkardeş (ORCID 0000-0003-3875-1793)
Hasan Erdal (ORCID 0000-0001-8296-0694)

Publication Date: April 30, 2023
Submission Date: June 21, 2022
Acceptance Date: March 3, 2023
Published in Issue: Year 2023, Volume 9, Issue 1

Cite

IEEE E. Yılmaz, Z. A. Altıkardeş, and H. Erdal, “Çok Sınıflı ve Dengesiz Eğitimsel Veri Kümesiyle Yükseköğretim Planlama ve Karar Destek Sistemi: Teknoloji Fakültesi Örneği”, GJES, vol. 9, no. 1, pp. 63–78, 2023.

Gazi Journal of Engineering Sciences (GJES) publishes open access articles under a Creative Commons Attribution 4.0 International License (CC BY).