Research Article

The Effects of Item Pool Characteristics on Test Length and Classification Accuracy in Computerized Adaptive Classification Testing

Year 2018, Volume: 33 Issue: 4, 888 - 896, 31.10.2018

Abstract

In this study, the effects of item pool distribution and size on average test length and average classification accuracy in computerized adaptive classification testing (CACT) were investigated. For that purpose, the random item selection method (RISM), Maximum Fisher Information (MFI), and Kullback-Leibler Information (KLI) were studied in broad and peaked item pools of 50, 100, 200, and 300 items. Thetas were drawn from N(0, 1). In the peaked item pools, items were simulated with a parameters from U[0.5, 2.0], b parameters from N(1, 0.4), and c parameters from N(0.15, 0.05); in the broad item pools, with a parameters from U[0.5, 2.0], b parameters from N(1, 1.5), and c parameters from N(0.15, 0.05). The simulation study was performed in R. The results show that RISM yielded the longest average test length, while MFI and KLI performed similarly. The larger the item pool, the shorter the average test length and the lower the classification accuracy; nevertheless, classification accuracy remained above 90% in all conditions. In addition, in the peaked item pools the average test lengths were shorter and test effectiveness was higher, while the classification accuracies did not change. In conclusion, with peaked item pools containing more items, CACT provides shorter tests with high classification accuracy.
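As a concrete illustration of the pool-generation design described above, the R sketch below draws peaked and broad 3PL item pools with the stated distributions. This is a minimal sketch, not the authors' code: the seed, the pool size, the reading of the second normal parameter as a standard deviation, and the truncation of negative c draws at 0 are all assumptions.

set.seed(42)  # illustrative seed

simulate_pool <- function(n_items, pool = c("peaked", "broad")) {
  pool <- match.arg(pool)
  # b ~ N(1, 0.4) for peaked pools, N(1, 1.5) for broad pools
  b_sd <- if (pool == "peaked") 0.4 else 1.5
  data.frame(
    a = runif(n_items, min = 0.5, max = 2.0),      # discrimination ~ U[0.5, 2.0]
    b = rnorm(n_items, mean = 1, sd = b_sd),       # difficulty
    c = pmax(rnorm(n_items, 0.15, 0.05), 0)        # pseudo-guessing ~ N(0.15, 0.05),
  )                                                #   negative draws truncated at 0
}

theta <- rnorm(1000)  # abilities ~ N(0, 1)

peaked_pool <- simulate_pool(300, "peaked")
broad_pool  <- simulate_pool(300, "broad")

Because the peaked pools concentrate difficulty near b = 1, they offer more information around a cut score in that region, which is one way to read the shorter tests reported above.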


The Effect of Item Pool Characteristics on Test Length and Classification Accuracy in Computerized Adaptive Classification Tests

Year 2018, Volume: 33 Issue: 4, 888 - 896, 31.10.2018

Abstract

In this study, the effect of two item pool characteristics, distribution and size, on average test length and average classification accuracy in computerized adaptive classification testing (CACT) was examined. For this purpose, the random item selection method (RISM), Maximum Fisher Information (MFI), and Kullback-Leibler Information (KLI) methods were studied in peaked and broad item pools of 50, 100, 200, and 300 items. Ability parameters for 1,000 examinees were generated from N(0, 1) within the range [-3, 3]. Items in the peaked pools were generated with a parameters from the interval U[0.5, 2.0], b parameters from N(1, 0.4), and c parameters from N(0.15, 0.05); items in the broad pools with a parameters from U[0.5, 2.0], b parameters from N(1, 1.5), and c parameters from N(0.15, 0.05). The simulation, carried out in R, showed that RISM produced the highest average test length in all item pools, and that MFI and KLI performed very similarly to each other. As item pool size increased, test lengths became shorter and classification accuracies decreased, yet classification accuracy above 0.90 was obtained in all conditions. In addition, test lengths were shorter and test effectiveness was higher in the peaked item pools, while classification accuracies did not change. Considering these results, it can be said that in CACT, shorter tests with high classification accuracy can be constructed from peaked item pools consisting of a large number of items.
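The item selection comparison above rests on two standard indices. The R sketch below, a minimal illustration rather than the authors' implementation, computes 3PL Fisher information at an assumed cut score (the MFI criterion) and the pointwise Kullback-Leibler index between two abilities bracketing the cut (the KLI criterion); the cut score of 0 and the offset of 0.5 are illustrative assumptions, as the abstract does not report them.

# 3PL probability of a correct response
p3pl <- function(theta, a, b, c) c + (1 - c) / (1 + exp(-a * (theta - b)))

# Fisher information of a 3PL item at ability theta
fisher_info <- function(theta, a, b, c) {
  p <- p3pl(theta, a, b, c)
  a^2 * ((1 - p) / p) * ((p - c) / (1 - c))^2
}

# Pointwise Kullback-Leibler information of an item between two
# ability points bracketing the classification cut score
kl_info <- function(theta0, theta1, a, b, c) {
  p0 <- p3pl(theta0, a, b, c)
  p1 <- p3pl(theta1, a, b, c)
  p0 * log(p0 / p1) + (1 - p0) * log((1 - p0) / (1 - p1))
}

# A small peaked pool as in the earlier sketch (parameters from the abstract)
pool <- data.frame(a = runif(100, 0.5, 2.0),
                   b = rnorm(100, 1, 0.4),
                   c = pmax(rnorm(100, 0.15, 0.05), 0))

cut   <- 0.0  # assumed classification cut score
delta <- 0.5  # assumed offset around the cut for the KL index

# MFI selects the item with maximum information at the cut score;
# KLI selects the item with maximum KL divergence across the cut.
mfi_pick <- which.max(fisher_info(cut, pool$a, pool$b, pool$c))
kli_pick <- which.max(kl_info(cut - delta, cut + delta, pool$a, pool$b, pool$c))

Random selection (RISM), by contrast, simply draws an unused item at random, which is consistent with the finding above that it yields the longest average tests.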


Details

Primary Language Turkish
Journal Section Articles
Authors

Ceylan Gündeğer (ORCID: 0000-0003-3572-1708)

Nuri Doğan (ORCID: 0000-0001-6274-2016)

Publication Date October 31, 2018
Published in Issue Year 2018 Volume: 33 Issue: 4

Cite

APA Gündeğer, C., & Doğan, N. (2018). Bireyselleştirilmiş Bilgisayarlı Sınıflama Testlerinde Madde Havuzu Özelliklerinin Test Uzunluğu ve Sınıflama Doğruluğu Üzerindeki Etkisi. Hacettepe Üniversitesi Eğitim Fakültesi Dergisi, 33(4), 888-896.