Research Article

Type I error and power rates: A comparative analysis of techniques in differential item functioning

Year 2023, Volume 10, Issue 4, 781-795, 23.12.2023
https://doi.org/10.21449/ijate.1368341

Abstract

The main purpose of this study is to examine the Type I error and statistical power rates of Differential Item Functioning (DIF) detection techniques based on different measurement theories under varying conditions. For this purpose, a simulation study was conducted using the Mantel-Haenszel (MH), Logistic Regression (LR), Lord’s χ2, and Raju’s Areas Measures techniques. In the simulation design, the two-parameter item response model, the groups’ ability distributions, and the DIF type were fixed conditions, while sample size (1800, 3000), ratio of group sample sizes (0.50, 1), test length (20, 80), and rate of DIF-containing items (0, 0.05, 0.10) were manipulated. The total number of conditions is 24 (2×2×2×3), and the statistical analyses were performed in R. The study found that the Type I error rates in all conditions were higher than the nominal error level. MH had the highest Type I error rate, while Raju’s Areas Measures had the lowest; MH also produced the highest statistical power rates. The Type I error and statistical power findings showed that the techniques based on both theories performed better with the sample size of 1800, and that increasing the sample size affected the techniques based on Classical Test Theory (CTT) more than those based on Item Response Theory (IRT). In addition, the techniques’ Type I error rates were lower and their statistical power rates higher under conditions where the test length was 80 and the group sample sizes were unequal.
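As an illustration of how such an analysis can be set up (the article itself provides no code), the sketch below simulates one replication of a single simulation cell: dichotomous responses are generated under a two-parameter logistic model with uniform DIF injected into one item, and the four techniques are applied via the difR package (Magis et al., 2018), whose IRT-based methods estimate item parameters through ltm (Rizopoulos, 2018). The item parameters, DIF size, group labels, and the helper sim_2pl() are illustrative assumptions, not the authors’ actual conditions.

```r
# A minimal sketch, assuming a 2PL generating model and uniform DIF in item 1.
library(difR)  # IRT-based methods use ltm internally for parameter estimation

set.seed(2023)
n_ref <- 900; n_foc <- 900     # e.g., total N = 1800 with equal groups (ratio = 1)
n_items <- 20                  # test length condition
a <- rlnorm(n_items, 0, 0.3)   # discrimination parameters (assumed)
b <- rnorm(n_items)            # difficulty parameters (assumed)

# Generate 0/1 responses under the 2PL: P(X = 1) = logistic(a * (theta - b))
sim_2pl <- function(n, a, b) {
  theta <- rnorm(n)
  p <- plogis(outer(theta, b, "-") * matrix(a, n, length(a), byrow = TRUE))
  (matrix(runif(n * length(a)), n) < p) * 1
}

# Inject uniform DIF into item 1 for the focal group (assumed DIF size 0.6)
b_foc <- b; b_foc[1] <- b[1] + 0.6
resp  <- rbind(sim_2pl(n_ref, a, b), sim_2pl(n_foc, a, b_foc))
group <- c(rep("ref", n_ref), rep("foc", n_foc))

mh   <- difMH(resp, group, focal.name = "foc")                       # Mantel-Haenszel
lr   <- difLogistic(resp, group, focal.name = "foc", type = "udif")  # Logistic Regression
lord <- difLord(resp, group, focal.name = "foc", model = "2PL")      # Lord's chi-square
raju <- difRaju(resp, group, focal.name = "foc", model = "2PL")      # Raju's area measure
```

In a full simulation, each cell would be replicated many times; the Type I error rate is then the proportion of replications in which a DIF-free item is flagged, and the power rate is the proportion in which a DIF item is flagged.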

References

  • Ankenmann, R.D., Witt, E.A., & Dunbar, S.B. (1999). An investigation of the power of the likelihood ratio goodness-of-fit statistics in detecting differential item functioning. Journal of Educational Measurement, 36(4), 277–300. https://doi.org/10.1111/j.1745-3984.1999.tb00558.x
  • Atalay Kabasakal, K., Arsan, N., Gök, B., & Kelecioğlu, H. (2014). Değişen madde fonksiyonunun belirlenmesinde MTK olabilirlik oranı, SIBTEST ve Mantel-Haenszel yöntemlerinin performanslarının (I. Tip hata ve güç) karşılaştırılması [Comparison of the performance (Type I error and power) of the IRT likelihood ratio, SIBTEST, and Mantel-Haenszel techniques in determining differential item functioning]. Educational Sciences: Theory & Practice, 14(6), 2175-2193.
  • Atar, B. (2007). Differential item functioning analyses for mixed response data using IRT likelihood-ratio test, logistic regression, and GLLAMM procedures (FSU_migr_etd-0248) [Doctoral dissertation, Florida State University]. http://purl.flvc.org/fsu/fd/FSU_migr_etd-0248.
  • Atar, B., & Kamata, A. (2011). Comparison of IRT likelihood ratio test and logistic regression DIF detection procedures. Hacettepe University Journal of Education, (41), 36–47.
  • Basman, M. (2023). A comparison of the efficacies of differential item functioning detection methods. International Journal of Assessment Tools in Education, 10(1), 145-159. https://doi.org/10.21449/ijate.1135368
  • Bradley, J.V. (1978). Robustness. British Journal of Mathematical and Statistical Psychology, 31(2), 144–152. http://dx.doi.org/10.1111/j.2044-8317.1978.tb00581.x
  • Camilli, G., & Shepard, L.A. (1994). Methods for identifying biased test items. Sage Publications.
  • Clauser, B.E., & Mazor, K.M. (1998). Using statistical procedures to identify differentially functioning test items. Educational Measurement: Issues and Practice, 17(1), 31-44. https://doi.org/10.1111/j.1745-3992.1998.tb00619.x
  • Cohen, A.S., Kim, S.H., & Wollack, J.A. (1996). An investigation of the likelihood ratio test for detection of differential item functioning. Applied Psychological Measurement, 20(1), 15–26.
  • Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. CBS College Publishing.
  • Dainis, A.M. (2008). Methods for identifying differential item and test functioning: An investigation of type I error rates and power (3323367) [Doctoral dissertation, James Madison University]. ProQuest.
  • DeMars, C.E. (2009). Modification of the Mantel-Haenszel and logistic regression DIF procedures to incorporate the SIBTEST regression correction. Journal of Educational and Behavioral Statistics, 34(2), 149-170.
  • Desa, Z.N. (2012). Bi-factor multidimensional item response theory modeling for subscores estimation, reliability, and classification (3523517) [Doctoral thesis, University of Kansas]. ProQuest.
  • Dodeen, H. (2004). The relationship between item parameters and item fit. Journal of Educational Measurement, 41(3), 261-270.
  • Dooley, K. (2002). Simulation research methods. In J. Baum (Ed.), Companion to organizations (pp. 829-848). Blackwell.
  • Dorans, N.J., & Holland, P.W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In P.W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 35-66). Lawrence Erlbaum.
  • Ellis, B.B., & Raju, N.S. (2003). Test and item bias: What they are, what they aren’t, and how to detect them. Educational Resources Information Center (ERIC).
  • Erdem Keklik, D. (2012). İki kategorili maddelerde tek biçimli değişen madde fonksiyonu belirleme tekniklerinin karşılaştırılması: Bir simülasyon çalışması [Comparison of techniques in detecting uniform differential item functioning in dichotomous items: A simulation study] (311744) [Doctoral thesis, Ankara University]. YÖK, Ulusal Tez Merkezi.
  • Fidalgo, A.M., Mellenbergh, G.J., & Muñiz, J. (2000). Effects of amount of DIF, test length, and purification type on robustness and power of Mantel-Haenszel procedures. Methods of Psychological Research Online, 5(3), 43-53.
  • Finch, W.H., & French, B.F. (2007). Detection of crossing differential item functioning: A comparison of four methods. Educational and Psychological Measurement, 67(4), 565-582. https://doi.org/10.1177/0013164406296975
  • Gierl, M.J., Jodoin, M.G., & Ackerman, T.A. (2000). Performance of Mantel-Haenszel, simultaneous item bias test, and logistic regression when the proportion of DIF items is large. Paper presented at the annual meeting of the American Educational Research Association.
  • Hambleton, R.K., Swaminathan, H., & Rogers, H.J. (1991). Fundamentals of item response theory. Sage Publications.
  • Harwell, M., Stone, C.A., Hsu, T.C., & Kirisci, L. (1996). Monte Carlo studies in item response theory. Applied Psychological Measurement, 20(2), 101-125. https://doi.org/10.1177/0146621696020002
  • Hauck Filho, N., Machado, W.D.L., & Damásio, B.F. (2014). Effects of statistical models and items difficulties on making trait-level inferences: A simulation study. Psicologia: Reflexão e Crítica, 27(4), 670-678. https://doi.org/10.1590/1678-7153.201427407
  • Hidalgo, M.D., & Lopez-Pina, J.A. (2004). Differential item functioning detection and effect size: A comparison between logistic regression and Mantel-Haenszel procedures. Educational and Psychological Measurement, 64(6), 903-915. https://doi.org/10.1177/0013164403261769
  • Holland, P.W., & Thayer, D.T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H.I. Braun (Eds.), Test validity (pp. 129-145). Erlbaum.
  • Jodoin, M.G., & Gierl, M.J. (2001). Evaluating type I error and power rates using an effect size measure with the logistic regression procedure for DIF detection. Applied Measurement in Education, 14(4), 329-349. https://doi.org/10.1207/S15324818AME1404_2
  • Kan, A., Sünbül, Ö., & Ömür, S. (2013). 6. - 8. sınıf seviye belirleme sınavları alt testlerinin çeşitli yöntemlere göre değişen madde fonksiyonlarının incelenmesi [Analysis of 6th - 8th grade placement exams subtests' differential item functioning by various methods]. Mersin University Journal of the Faculty of Education, 9(2), 207-222.
  • Karasar, N. (2010). Bilimsel araştırma yöntemleri [Research methods]. Nobel Publication.
  • Kim, J. (2010). Controlling type I error rate in evaluating differential item functioning for four DIF methods: Use of three procedures for adjustment of multiple item testing [Doctoral thesis, Georgia State University]. https://doi.org/10.57709/1642363
  • Koğar, H. (2018). An examination of parametric and nonparametric dimensionality assessment methods with exploratory and confirmatory models. Journal of Education and Learning, 7(3), 148-158. https://doi.org/10.5539/jel.v7n3p148
  • Kristjansson, E. (2001). Detecting DIF in polytomous items: An empirical comparison of the ordinal logistic regression, logistic discriminant function analysis, Mantel, and generalized Mantel-Haenszel procedures [Unpublished doctoral dissertation]. University of Ottawa.
  • Kristjansson, E., Aylesworth, R., McDowell, I., & Zumbo, B.D. (2005). Comparison of four methods for detecting differential item functioning in ordered response items. Educational and Psychological Measurement, 65(6), 935-953. https://doi.org/10.1177/0013164405275668
  • Lim, R.G., & Drasgow, F. (1990). Evaluation of two methods for estimating item response theory parameters when assessing differential item functioning. Journal of Applied Psychology, 75(2), 164-174. https://doi.org/10.1037/0021-9010.75.2.164
  • Lord, F.M. (2012). Applications of item response theory to practical testing problems. Lawrence Erlbaum Associates.
  • Magis, D., Beland, S., & Raiche, G. (2018). difR: Collection of methods to detect dichotomous differential item functioning (DIF). https://cran.r-project.org/web/packages/difR/difR.pdf
  • Magis, D., & De Boeck, P. (2012). A robust outlier approach to prevent type I error inflation in differential item functioning. Educational and Psychological Measurement, 72(2), 291-311.
  • Mellenbergh, G.J. (1983). Conditional item bias methods. In S.H. Irvine & J.W. Berry (Eds.), Human assessment and cultural factors (pp. 293-302). Springer.
  • Narayanan, P., & Swaminathan, H. (1994). Performance of the Mantel-Haenszel and simultaneous item bias procedures for detecting differential item functioning. Applied Psychological Measurement, 18(4), 315-328. https://doi.org/10.1177/014662169401800403
  • Narayanan, P., & Swaminathan, H. (1996). Identification of items that show nonuniform DIF. Applied Psychological Measurement, 20(3), 257-274. https://doi.org/10.1177/014662169602000306
  • Osterlind, S.J. (1983). Test item bias. Sage Publications.
  • Osterlind, S.J., & Everson, H.T. (2009). Differential item functioning. Sage Publications.
  • Patton, M.Q. (1990). Qualitative evaluation and research methods. Sage Publications, Inc.
  • Price, E.A. (2014). Item discrimination, model-data fit, and type I error rates in DIF detection using Lord’s χ2, the likelihood ratio test, and the Mantel-Haenszel procedure [Doctoral thesis, Ohio University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1395842816
  • Raju, N.S. (1988). The area between two item characteristic curves. Psychometrika, 53(4), 495-502. https://doi.org/10.1007/BF02294403
  • Rizopoulos, D. (2018). ltm: Latent trait models under IRT. https://cran.r-project.org/web/packages/ltm/ltm.pdf
  • Rogers, H.J., & Swaminathan, H. (1993). A comparison of logistic regression and Mantel-Haenszel procedures for detecting differential item functioning. Applied Psychological Measurement, 17(2), 105-116. https://doi.org/10.1177/014662169301700201
  • Roussos, L.A., & Stout, W.F. (1996). Simulation studies of the effects of small sample size and studied item parameters on SIBTEST and Mantel-Haenszel type I error performance. Journal of Educational Measurement, 33(2), 215-230. https://doi.org/10.1111/j.1745-3984.1996.tb00490.x
  • Samuelsen, K.M. (2005). Examining differential item functioning from a latent class perspective (3175148) [Doctoral thesis, University of Maryland]. ProQuest.
  • Shepard, L., Camilli, G., & Averill, M. (1981). Comparison of procedures for detecting test-item bias with both internal and external ability criteria. Journal of Educational Statistics, 6(4), 317-375. https://doi.org/10.3102/10769986006004317
  • Simon, J.L. (1978). Basic research methods in social science. Random House.
  • Sünbül, Ö., & Ömür Sünbül, S. (2016). Değişen madde fonksiyonunun belirlenmesinde kullanılan yöntemlerde I. tip hata ve güç çalışması [Type I error and power study in methods used to determine differential item functioning]. Elementary Education Online, 15(3), 882-897.
  • Swaminathan, H., & Rogers, H.J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27(4), 361-370. https://www.jstor.org/stable/1434855
  • Vaughn, B.K., & Wang, Q. (2010). DIF trees: Using classification trees to detect differential item functioning. Educational and Psychological Measurement, 70(6), 941-952. https://doi.org/10.1177/0013164410379326
  • Wang, W.C., & Su, Y.H. (2004). Factors influencing the Mantel and generalized Mantel-Haenszel methods for the assessment of differential item functioning in polytomous items. Applied Psychological Measurement, 28, 450–481.
  • Wang, W., Tay, L., & Drasgow, F. (2013). Detecting differential item functioning of polytomous items for an ideal point response process. Applied Psychological Measurement, 37(4), 316-335. https://doi.org/10.1177/0146621613476156
  • Zumbo, B.D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Directorate of Human Resources Research and Evaluation, Department of National Defense.
  • Zwick, R., Donoghue, J.R., & Grima, A. (1993). Assessment of differential item functioning for performance tasks. Journal of Educational Measurement, 30, 233–251.
  • Zwick, R., Thayer, D.T., & Mazzeo, J. (1997). Describing and categorizing DIF in polytomous items. Educational Testing Service.

There are 59 references in total.

Details

Primary Language: English
Subjects: Measurement Theories and Applications in Education and Psychology; Simulation Studies
Section: Articles
Authors

Ayşe Bilicioğlu Güneş (ORCID: 0000-0002-1603-8631)

Bayram Bıçak (ORCID: 0000-0003-0860-9374)

Publication Date: 23 December 2023
Submission Date: 29 September 2023
Published in Issue: Year 2023, Volume 10, Issue 4

How to Cite

APA Bilicioğlu Güneş, A., & Bıçak, B. (2023). Type I error and power rates: A comparative analysis of techniques in differential item functioning. International Journal of Assessment Tools in Education, 10(4), 781-795. https://doi.org/10.21449/ijate.1368341
