Research Article

The Effect of ratio of items indicating differential item functioning on computer adaptive and multi-stage tests

Year 2022, Volume: 9, Issue: 3, 682–696, 30.09.2022
https://doi.org/10.21449/ijate.1105769

Abstract

Recently, adaptive testing approaches have become a viable alternative to traditional fixed-item tests. Their main advantage is that they reach the desired measurement precision with fewer items. However, fewer items also mean that each item has a greater influence on ability estimation, so a flaw in any single item can have more consequential effects. Items showing differential item functioning (DIF) may therefore play an important role in examinees' test scores. This study aimed to investigate the effect of DIF items on the performance of computerized adaptive and multi-stage tests. For this purpose, different test designs were examined under different test lengths and ratios of DIF items using a Monte Carlo simulation. The computerized adaptive test (CAT) design showed the best measurement precision across all conditions. Among the multi-stage test (MST) panel designs, the 1-3-3 design had higher measurement precision in most conditions; however, the findings were not sufficient to conclude that it outperformed the 1-2-4 design. Furthermore, CAT was the design least affected by an increase in the ratio of DIF items, whereas the MST designs were affected by that increase, especially at the 10-item test length.
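To make the simulation conditions described above concrete, the sketch below shows one way a single CAT condition of this kind could be set up: a 2PL item bank in which a chosen ratio of items carries uniform DIF (a difficulty shift for the focal group), administered as a fixed-length, maximum-information CAT scored with the calibrated parameters, and evaluated by RMSE of the ability estimates. This is a minimal illustration only; the bank size, DIF magnitude and ratio, test length, and EAP grid scoring are assumptions and do not reproduce the authors' actual study design.

```python
# Minimal, hypothetical sketch (not the authors' simulation code) of a CAT
# Monte Carlo condition with a fixed ratio of uniform-DIF items, assuming a
# 2PL model, maximum-information item selection, and EAP scoring on a grid.
import numpy as np

rng = np.random.default_rng(1)

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def simulate_cat(theta_true, a, b_admin, b_true, test_length, rng):
    """Administer a fixed-length CAT and return the final ability estimate.

    Item selection and scoring use the calibrated parameters (b_admin);
    responses are generated from the examinee's true parameters (b_true),
    which differ on DIF items for the focal group.
    """
    used, responses = [], []
    theta_hat = 0.0
    grid = np.linspace(-4.0, 4.0, 161)
    prior = np.exp(-0.5 * grid ** 2)          # N(0, 1) prior, unnormalised
    for _ in range(test_length):
        # Pick the unused item with maximum Fisher information at theta_hat.
        p = p_correct(theta_hat, a, b_admin)
        info = a ** 2 * p * (1.0 - p)
        info[used] = -np.inf
        j = int(np.argmax(info))
        used.append(j)
        responses.append(rng.random() < p_correct(theta_true, a[j], b_true[j]))
        # EAP update of theta on the grid.
        like = prior.copy()
        for k, r in zip(used, responses):
            pk = p_correct(grid, a[k], b_admin[k])
            like *= pk if r else (1.0 - pk)
        theta_hat = float(np.sum(grid * like) / np.sum(like))
    return theta_hat

# Illustrative condition: 300-item bank, 20% uniform DIF, 20-item test.
n_items, test_length, n_examinees = 300, 20, 500
dif_ratio, dif_shift = 0.20, 0.6
a = rng.uniform(0.8, 2.0, n_items)
b = rng.normal(0.0, 1.0, n_items)
dif_items = rng.choice(n_items, int(dif_ratio * n_items), replace=False)

for group in ("reference", "focal"):
    b_true = b.copy()
    if group == "focal":
        b_true[dif_items] += dif_shift       # DIF items are harder for the focal group
    thetas = rng.normal(0.0, 1.0, n_examinees)
    estimates = np.array(
        [simulate_cat(t, a, b, b_true, test_length, rng) for t in thetas]
    )
    rmse = np.sqrt(np.mean((estimates - thetas) ** 2))
    print(f"{group}: RMSE = {rmse:.3f}")
```

In a fuller study of the kind the abstract describes, the same item bank and DIF conditions would also be routed through MST panel structures (for example, 1-3-3 and 1-2-4 designs) and the resulting RMSE and bias compared across test lengths and DIF ratios.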

Details

Primary Language: English
Subjects: Field Education
Section: Articles
Authors

Başak Erdem Kara (ORCID: 0000-0003-3066-2892)

Nuri Doğan (ORCID: 0000-0001-6274-2016)

Early View Date: August 31, 2022
Publication Date: September 30, 2022
Submission Date: April 19, 2022
Published in Issue: Year 2022, Volume: 9, Issue: 3

How to Cite

APA Erdem Kara, B., & Doğan, N. (2022). The Effect of ratio of items indicating differential item functioning on computer adaptive and multi-stage tests. International Journal of Assessment Tools in Education, 9(3), 682-696. https://doi.org/10.21449/ijate.1105769
