Research Article

Comparison of CAT Procedures at Low Ability Levels: A Simulation Study and Analysis in the Context of Students with Disabilities

Year 2024, Volume: 13 Issue: 3, 547 - 559, 31.07.2024

Abstract

In computerized adaptive testing (CAT), ability estimates at the extremes of the scale are more biased and less accurate than those in the intermediate range. This conflicts with the premise of CAT, which is designed to measure all ability levels well. This study compares the performance of various CAT procedures to identify those that perform better at low ability levels while remaining consistent across the rest of the ability scale. In addition, using data from a large-scale test, it examines whether the identified procedures perform similarly for students with disabilities, a group that disproportionately falls at extreme ability levels and for whom CAT offers advantages in many respects. A pool of 1,000 items and 1,000 examinees with a standard normal ability distribution were simulated using Monte Carlo methods, and the CAT performance of 36 conditions combining different item selection methods, ability estimation methods, and termination rules was compared. The results show that the precision-criterion termination rule used together with the maximum likelihood ability estimation method and the Kullback-Leibler information item selection rule, as well as the precision criterion with a test-length limit of 20 items, performed better and more consistently across ability levels. These procedures also showed high performance at the ability levels of students with disabilities in the real data.
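The simulation design described above (a calibrated item pool, iterative item selection, provisional ability estimation, and a stopping rule) follows the standard CAT cycle. The following is a minimal Python sketch of that cycle, not the study's actual code: it assumes a 2PL item response model, maximum Fisher information item selection, grid-search maximum likelihood estimation, and a precision criterion (standard error ≤ 0.3) combined with a 20-item cap, matching one of the termination rules compared in the study. All parameter choices (discrimination and difficulty distributions, grid bounds, seed) are illustrative assumptions.

```python
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def mle_theta(a, b, u, grid=np.linspace(-4, 4, 801)):
    """Grid-search maximum likelihood ability estimate (clamped to [-4, 4])."""
    p = p_2pl(grid[:, None], a[None, :], b[None, :])
    loglik = (u * np.log(p) + (1 - u) * np.log(1 - p)).sum(axis=1)
    return grid[np.argmax(loglik)]

def run_cat(theta_true, a, b, rng, se_target=0.3, max_items=20):
    """One adaptive test: returns (theta estimate, test length, final SE)."""
    administered, responses = [], []
    theta_hat = 0.0                         # conventional starting estimate
    available = np.ones(len(a), dtype=bool)
    while True:
        # Select the unused item with maximum information at the current estimate.
        info = fisher_info(theta_hat, a, b)
        info[~available] = -np.inf
        j = int(np.argmax(info))
        available[j] = False
        # Simulate the examinee's response from the true ability.
        u = rng.random() < p_2pl(theta_true, a[j], b[j])
        administered.append(j)
        responses.append(int(u))
        aa, bb = a[administered], b[administered]
        uu = np.array(responses)
        # MLE is undefined for all-correct/all-incorrect patterns;
        # keep the provisional estimate in that case.
        if 0 < uu.sum() < len(uu):
            theta_hat = mle_theta(aa, bb, uu)
        se = 1.0 / np.sqrt(fisher_info(theta_hat, aa, bb).sum())
        if se <= se_target or len(administered) >= max_items:
            return theta_hat, len(administered), se

# Illustrative pool: 1,000 items, lognormal discriminations, N(0, 1) difficulties.
rng = np.random.default_rng(42)
a = rng.lognormal(0.0, 0.25, 1000)
b = rng.normal(0.0, 1.0, 1000)
# A low-ability examinee, the case of interest in the study.
theta_hat, n_items, se = run_cat(theta_true=-2.0, a=a, b=b, rng=rng)
```

Repeating `run_cat` over many simulated examinees at fixed ability points and comparing bias and RMSE of `theta_hat` across points is the kind of comparison the study performs across its 36 procedure combinations.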

References

  • Babcock, B., & Weiss, D. J. (2009). Termination criteria in computerized adaptive tests: Variable-length CATs are not biased. Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing.
  • Barrada, J. R., Olea, J., Ponsoda, V., & Abad, F. J. (2009). Item selection rules in computerized adaptive testing: Accuracy and security. Methodology, 5(1), 7–17. https://doi.org/10.1027/1614-2241.5.1.7
  • Belov, D. I., & Armstrong, R. D. (2009). Direct and inverse problems of item pool design for computerized adaptive testing. Educational and Psychological Measurement, 69(4), 533–547. https://doi.org/10.1177/0013164409332224
  • Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6(4), 431–444. https://doi.org/10.1177/014662168200600405
  • Chen, S. K., Hou, L., Fitzpatrick, S. J., & Dodd, B. G. (1997). The effect of population distribution and method of theta estimation on computerized adaptive testing (CAT) using the rating scale model. Educational and Psychological Measurement, 57(3), 422–439. https://doi.org/10.1177/0013164497057003004
  • Choi, S. W., Grady, M. W., & Dodd, B. G. (2011). A new stopping rule for computerized adaptive testing. Educational and Psychological Measurement, 71(1), 37–53. https://doi.org/10.1177/0013164410387338
  • Embretson, S., & Reise, S. P. (2000). Item Response Theory for psychologists. Lawrence Erlbaum Associates.
  • Gibbons, R. D., Weiss, D. J., Frank, E., & Kupfer, D. (2016). Computerized adaptive diagnosis and testing of mental health disorders. Annual Review of Clinical Psychology, 12, 83–104. https://doi.org/10.1146/annurev-clinpsy-021815-093634
  • Gibbons, R. D., Weiss, D. J., Pilkonis, P. A., Frank, E., Moore, T., Kim, J. B., & Kupfer, D. J. (2014). Development of a computerized adaptive test for anxiety. American Journal of Psychiatry, 171(2), 187–194. https://doi.org/10.1176/appi.ajp.2013.13020178
  • Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Springer Science+Business Media.
  • He, W., Diao, Q., & Hauser, C. (2013). A comparison of four item-selection methods for severely constrained CATs. NCME Annual Meeting, 1–26.
  • Kezer, F., & Koç, N. (2014). A comparison of computerized adaptive testing strategies. Eğitim Bilimleri Araştırmaları Dergisi, 4(1), 145–174. https://doi.org/10.12973/jesr.2014.41.8
  • Linacre, J. M. (2000). Computer-adaptive testing: A methodology whose time has come. Komesa Press.
  • Lord, F. M. (1980). Applications of Item Response Theory to practical testing problems. Routledge.
  • Magis, D., Yan, D., & von Davier, A. A. (2018). Computerized adaptive and multistage testing with R: Using packages catR and mstR. In Measurement: Interdisciplinary Research and Perspectives (Vol. 16, Issue 4). https://doi.org/10.1080/15366367.2018.1520560
  • Maurelli, V., & Weiss, D. J. (1981). Factors influencing the psychometric characteristics of an adaptive testing strategy for test batteries.
  • Ministry of National Education. (2018). Sınavla öğrenci̇ alacak ortaöğreti̇m kurumlarına i̇li̇şki̇n merkezî sınav başvuru ve uygulama klavuzu [Application guide of central examination for secondary education institutions].
  • Mislevy, R. J., & Bock, R. D. (1982). Biweight estimates of latent ability. Educational and Psychological Measurement, 42(3), 725–737. https://doi.org/10.1177/001316448204200302
  • Reckase, M. D. (2010). Designing item pools to optimize the functioning of a computerized adaptive test. Psychological Test and Assessment Modeling, 52(2), 127–141. https://psycnet.apa.org/record/2010-17096-001
  • Riley, B. B., Conrad, K. J., Bezruczko, N., & Dennis, M. L. (2007). Relative precision, efficiency and construct validity of different starting and stopping rules for a computerized adaptive test: The GAIN substance problem scale. Journal of Applied Measurement, 8(1), 48–64.
  • Sahin, A., & Ozbasi, D. (2017). Effects of content balancing and item selection method on ability estimation in computerized adaptive tests. Eurasian Journal of Educational Research, 69, 21–36. https://doi.org/10.14689/ejer.2017.69.2
  • Şahin, A., & Weiss, D. J. (2015). Effects of calibration sample size and item bank size on ability estimation in computerized adaptive testing. Educational Sciences: Theory & Practice, 15(6), 1585–1595. https://doi.org/10.12738/estp.2015.6.0102
  • Segall, D. O. (2004). Computerized adaptive testing. Encyclopedia of Social Measurement, 429–438. https://doi.org/10.1016/B0-12-369398-5/00444-8
  • Şenel, S., & Kutlu, Ö. (2018a). Computerized adaptive testing design for students with visual impairment. Egitim ve Bilim, 43(194), 261–284. https://doi.org/10.15390/EB.2018.7515
  • Şenel, S., & Kutlu, Ö. (2018b). Comparison of two test methods for VIS: paper-pencil test and CAT. European Journal of Special Needs Education, 33(5), 631–645. https://doi.org/10.1080/08856257.2017.1391014
  • Şenel, S., & Şenel, H. C. (2018). Bilgisayar tabanlı testlerde evrensel tasarım: Özel gereksinimli öğrenciler için düzenlemeler [Universal design in computer-based testing: Test Accommodations for students with special needs]. In S. Dinçer (Ed.), Değişen dünyada eğitim (1st ed., pp. 113–124). Pegem Akademi. https://doi.org/10.14527/9786052412480.08
  • Seo, D. G., & Choi, J. (2018). Post-hoc simulation study of computerized adaptive testing for the Korean Medical Licensing Examination. Journal of Educational Evaluation for Health Professions, 15, 14. https://doi.org/10.3352/jeehp.2018.15.14
  • Seo, D. G., & Weiss, D. J. (2015). Best design for multidimensional computerized adaptive testing with the bifactor model. Educational and Psychological Measurement, 75(6), 954–978. https://doi.org/10.1177/0013164415575147
  • Stone, E., & Davey, T. (2011). Computer-adaptive testing for students with disabilities: A review of the literature. ETS Research Report Series, 2011(2), i–24. https://doi.org/10.1002/j.2333-8504.2011.tb02268.x
  • van der Linden, W. J., Ariel, A., & Veldkamp, B. P. (2006). Assembling a computerized adaptive testing item pool as a set of linear tests. Journal of Educational and Behavioral Statistics, 31(1), 81–99. https://doi.org/10.3102/10769986031001081
  • van der Linden, W. J., & Glas, C. A. W. (2010). Elements of adaptive testing. Springer.
  • Wainer, H., Dorans, N. J., Flaugher, R., Green, B. F., & Mislevy, R. J. (2000a). Computerized adaptive testing: A primer. Routledge.
  • Wainer, H., Dorans, N. J., Flaugher, R., Green, B. F., & Mislevy, R. J. (2000b). Computerized adaptive testing: A primer (2nd ed.). Lawrence Erlbaum Associates.
  • Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54(3), 427–450. https://doi.org/10.1007/BF02294627
  • Weiss, D. J. (1973). The stratified adaptive computerized ability test.
  • Weiss, D. J. (2011). Better data from better measurements using computerized adaptive testing. Journal of Methods and Measurement in the Social Sciences, 2(1), 1. https://doi.org/10.2458/jmm.v2i1.12351
  • Yao, L. (2013). Comparing the performance of five multidimensional CAT selection procedures with different stopping rules. Applied Psychological Measurement, 37(1), 3–23. https://doi.org/10.1177/0146621612455687
There are 37 citations in total.

Details

Primary Language English
Subjects Studies on Education
Journal Section Articles
Authors

Selma Şenel 0000-0002-5803-0793

Early Pub Date July 18, 2024
Publication Date July 31, 2024
Published in Issue Year 2024 Volume: 13 Issue: 3

Cite

APA Şenel, S. (2024). Comparison of CAT Procedures at Low Ability Levels: A Simulation Study and Analysis in the Context of Students with Disabilities. Bartın University Journal of Faculty of Education, 13(3), 547-559.

All articles published in the journal are open access and distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.



Bartın University Journal of Faculty of Education