Research Article

Investigation of Measurement Precision and Test Lengths in Computerized Adaptive Tests in Different Conditions

Year 2024, Volume: 15 Issue: 1, 5 - 17, 31.03.2024
https://doi.org/10.21031/epod.1068572

Abstract

This study examines how ability estimation method, content balancing, item exposure rate, and termination rule affect test length and measurement precision in computerized adaptive testing (CAT). Two ability estimation methods (EAP and MLE) were crossed with three content balancing patterns (1, 2, and 4 groups), three exposure rates (0.50, 0.75, and 1.00), and four termination rules (standard error thresholds of 0.35 and 0.40, and fixed test lengths of 15 and 30 items), giving 72 conditions in total. The conditions were compared in terms of correlation, bias, RMSE, and test length. The data were generated and analyzed in R. The best measurement performance was obtained with the 30-item fixed test length and the 0.35 standard error rule, in the 1-group pattern, where content balancing imposes no group restriction, and with exposure rates between 0.75 and 1.00. Across ability estimation methods, content balancing patterns, and exposure rates, the number of items administered ranged from 22 to 25; with respect to the termination rule, the fewest items were needed under the 0.40 standard error rule, and the standard-error-based rules required up to 39 items.
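
As a rough illustration of the design described above, the following Python sketch enumerates the crossed conditions and defines the three accuracy measures named in the abstract. It is not the authors' code (the study was carried out in R). The function names and the exact formulas are assumptions: bias is taken as the mean signed difference between estimated and true ability, RMSE as the root of the mean squared difference, and correlation as the Pearson coefficient.

```python
from itertools import product
from math import sqrt

# Factor levels as listed in the abstract.
ESTIMATORS = ["EAP", "MLE"]
CONTENT_GROUPS = [1, 2, 4]
EXPOSURE_RATES = [0.50, 0.75, 1.00]
# Two standard-error thresholds and two fixed test lengths.
TERMINATION_RULES = [("se", 0.35), ("se", 0.40), ("length", 15), ("length", 30)]


def design_grid():
    """Return every combination of the four crossed factors."""
    return list(product(ESTIMATORS, CONTENT_GROUPS, EXPOSURE_RATES, TERMINATION_RULES))


def bias(true_theta, est_theta):
    """Mean signed difference (estimate minus true value); assumed definition."""
    return sum(e - t for t, e in zip(true_theta, est_theta)) / len(true_theta)


def rmse(true_theta, est_theta):
    """Root mean squared difference between estimates and true values."""
    squared = sum((e - t) ** 2 for t, e in zip(true_theta, est_theta))
    return sqrt(squared / len(true_theta))


def correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # 2 estimators x 3 patterns x 3 exposure rates x 4 rules
    print(len(design_grid()))
```

In a simulation of this kind, each of the three measures would be computed once per condition from the simulated examinees' true and estimated abilities, alongside the average number of items administered.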


Details

Primary Language: English
Journal Section: Articles
Authors

Hüseyin Yıldız 0000-0003-2387-263X

Ceren Tunaboylu 0000-0001-8090-8913

Süleyman Ülkü 0000-0003-1965-0671

Gamze Giray 0000-0002-5795-4521

Hülya Kelecioğlu 0000-0002-0741-9934

Publication Date: March 31, 2024
Acceptance Date: June 1, 2023
Published in Issue: Year 2024, Volume: 15, Issue: 1

Cite

APA: Yıldız, H., Tunaboylu, C., Ülkü, S., Giray, G., & Kelecioğlu, H. (2024). Investigation of Measurement Precision and Test Lengths in Computerized Adaptive Tests in Different Conditions. Journal of Measurement and Evaluation in Education and Psychology, 15(1), 5-17. https://doi.org/10.21031/epod.1068572