The Effect of Aberrant Responses on Ability Estimation in Computer Adaptive Tests
Year 2022,
Volume: 13 Issue: 3, 256 - 268, 30.09.2022
Sebahat Gören, Hakan Kara, Başak Erdem Kara, Hülya Kelecioğlu
Abstract
In computer adaptive testing (CAT), aberrant responses caused by factors such as lucky guesses and careless errors may introduce significant bias into ability estimation. Correct responses resulting from lucky guesses and incorrect responses resulting from carelessness or anxiety are aberrant because they do not reflect the examinee’s actual knowledge, and they may therefore lead to an erroneous estimate of the examinee’s actual ability. In this study, the performances of the 3PL and 4PL IRT models regarding ability estimation were compared in CAT simulations in the presence of aberrant responses. Twelve CAT simulations were conducted under different conditions, with 10 replications for each condition. Correlation, root mean square error (RMSE), bias, and mean absolute error (MAE) values were calculated and interpreted for each condition. Results generally indicated that the 4PL IRT model provided more efficient and robust ability estimation than the 3PL IRT model and that the 4PL model increased the precision and effectiveness of the CAT applications.
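As a rough illustration of the quantities named in the abstract, the sketch below writes out the standard 4PL item response function (which reduces to the 3PL when the upper asymptote is 1) and one common way to compute bias, RMSE, MAE, and correlation between true and estimated abilities. The function names `p_4pl` and `recovery_metrics`, the parameter names, and the sign convention for bias (estimate minus true value) are assumptions made for this example; the abstract does not state the formulas or software the authors used.

```python
import math


def p_4pl(theta, a, b, c=0.0, d=1.0):
    """Probability of a correct response under the 4PL model.

    theta: ability, a: discrimination, b: difficulty,
    c: lower asymptote (guessing), d: upper asymptote.
    With d = 1.0 this is the 3PL model.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


def recovery_metrics(true_theta, est_theta):
    """Bias, RMSE, MAE and Pearson correlation of ability estimates.

    Assumes bias is defined as mean(estimate - true); this convention
    is an assumption of the sketch, not taken from the article.
    """
    n = len(true_theta)
    errors = [e - t for t, e in zip(true_theta, est_theta)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(x * x for x in errors) / n)
    mae = sum(abs(x) for x in errors) / n

    # Pearson correlation computed from centered sums of products.
    mean_t = sum(true_theta) / n
    mean_e = sum(est_theta) / n
    cov = sum((t - mean_t) * (e - mean_e)
              for t, e in zip(true_theta, est_theta))
    var_t = sum((t - mean_t) ** 2 for t in true_theta)
    var_e = sum((e - mean_e) ** 2 for e in est_theta)
    corr = cov / math.sqrt(var_t * var_e)

    return {"bias": bias, "rmse": rmse, "mae": mae, "corr": corr}


if __name__ == "__main__":
    # Illustrative values only; not data from the study.
    print(p_4pl(theta=0.0, a=1.0, b=0.0, c=0.2, d=1.0))   # 3PL case
    print(p_4pl(theta=0.0, a=1.0, b=0.0, c=0.2, d=0.98))  # 4PL case
    print(recovery_metrics([0.0, 1.0, 2.0], [0.5, 1.5, 2.5]))
```

Setting `d` below 1 means that even a very able examinee has a nonzero probability of missing an easy item, which is the mechanism by which a 4PL model can be less sensitive to careless errors than a 3PL model.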