Modeling Students’ Academic Performance Based on Their Interactions in an Online Learning Environment
Year 2015,
Volume: 14 Issue: 3, 815 - 824, 10.01.2015
Gökhan Akçapınar, Arif Altun, Petek Aşkar
Abstract
The aim of this study is to model students' academic performance based on their interactions with an online learning environment designed by the researchers. The dataset includes 10 input attributes extracted from students' learning activity logs. The output variable (class) was the final grade that students obtained in a Computer Hardware course. The predictive performance of three classification algorithms (Naïve Bayes, Classification Tree, and CN2 rules) was tested on the dataset and compared in terms of Classification Accuracy (CA) and Area Under the ROC Curve (AUC). All analyses were performed with the Orange data mining tool, and the models were evaluated using ten-fold cross-validation. The results of the analyses are presented as a confusion matrix, a decision tree, and IF-THEN rules. The experimental results indicate that the Naïve Bayes algorithm outperforms the other classification algorithms in terms of both CA and AUC. On the other hand, the models generated by the Classification Tree and CN2 algorithms are easy to understand for users who are not data mining experts.
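The evaluation setup named in the abstract (CA, AUC, and ten-fold cross-validation) can be illustrated with a minimal sketch. This is not the Orange implementation used in the study. The function names `classification_accuracy`, `auc`, and `k_fold_indices` are hypothetical helpers, AUC is shown only for the binary case, and the toy labels and scores are invented for illustration and are not the study's data.

```python
from typing import List, Sequence, Tuple


def classification_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of instances whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve for binary labels (1 = positive).

    Computed as the probability that a randomly chosen positive instance
    receives a higher score than a randomly chosen negative one;
    ties count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs.

    Every index appears in exactly one test fold, so each instance is
    predicted once by a model that did not see it during training.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


# Toy example (invented values): 1 = passed, 0 = failed.
toy_true = [1, 1, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]

print("CA :", classification_accuracy(toy_true, toy_pred))
print("AUC:", auc(toy_true, toy_scores))
print("folds:", len(k_fold_indices(len(toy_true), k=4)))
```

In a full comparison, each learner would be trained on the `train` indices of every fold and scored on the matching `test` indices, and CA and AUC would then be computed over the pooled out-of-fold predictions.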
Modeling Students' Academic Performance According to Their Interaction Data in an Online Learning Environment
Abstract
The aim of this study is to model students' academic performance in the Computer Hardware course based on their interaction data in an online learning environment. The dataset used in the study contains 10 variables obtained from students' log data in the online learning environment and, as the class (prediction) variable, the end-of-term grades that reflect students' academic performance. The analyses compared the classification performance of three data mining algorithms (Naïve Bayes, Decision Tree, and CN2). Classification Accuracy (CA) and Area Under the ROC Curve (AUC) were used to compare the predictive performance of the resulting models. All analyses were carried out with the Orange data mining software, and 10-fold cross-validation was used to generalize the resulting models. The results of the analyses are presented as a cross table, a decision tree, and a set of if-then rules.