The main purpose of this study is to examine the Type I error and statistical power ratios of Differential Item Functioning (DIF) techniques based on different theories under different conditions. For this purpose, a simulation study was conducted by using Mantel-Haenszel (MH), Logistic Regression (LR), Lord’s χ2, and Raju’s Areas Measures techniques. In the simulation-based research model, the two-parameter item response model, group’s ability distribution, and DIF type were the fixed conditions while sample size (1800, 3000), rates of sample size (0.50, 1), test length (20, 80) and DIF- containing item rate (0, 0.05, 0.10) were manipulated conditions. The total number of conditions is 24 (2x2x2x3), and statistical analysis was performed in the R software. The current study found that the Type I error rates in all conditions were higher than the nominal error level. It was also demonstrated that MH had the highest error rate while Raju’s Areas Measures had the lowest error rate. Also, MH produced the highest statistical power rates. The analysis of the findings of Type 1 error and statistical power rates illustrated that techniques based on both of the theories performed better in the 1800 sample size. Furthermore, the increase in the sample size affected techniques based on CTT rather than IRT. Also, the findings demonstrated that the techniques’ Type 1 error rates were lower while their statistical power rates were higher under conditions where the test length was 80, and the sample sizes were not equal.
Classical test theory Item response theory Differential item functioning Type I error Statistical power
The main purpose of this study is to examine the Type I error and statistical power ratios of Differential Item Functioning (DIF) techniques based on different theories under different conditions. For this purpose, a simulation study was conducted by using Mantel-Haenszel (MH), Logistic Regression (LR), Lord’s χ2, and Raju’s Areas Measures techniques. In the simulation-based research model, the two-parameter item response model, group’s ability distribution, and DIF type were the fixed conditions while sample size (1800, 3000), rates of sample size (0.50, 1), test length (20, 80) and DIF- containing item rate (0, 0.05, 0.10) were manipulated conditions. The total number of conditions is 24 (2x2x2x3), and statistical analysis was performed in the R software. The current study found that the Type I error rates in all conditions were higher than the nominal error level. It was also demonstrated that MH had the highest error rate while Raju’s Areas Measures had the lowest error rate. Also, MH produced the highest statistical power rates. The analysis of the findings of Type 1 error and statistical power rates illustrated that techniques based on both of the theories performed better in the 1800 sample size. Furthermore, the increase in the sample size affected techniques based on CTT rather than IRT. Also, the findings demonstrated that the techniques’ Type 1 error rates were lower while their statistical power rates were higher under conditions where the test length was 80, and the sample sizes were not equal.
Classical test theory Item response theory Differential item functioning Type I error Statistical power
Birincil Dil | İngilizce |
---|---|
Konular | Eğitimde ve Psikolojide Ölçme Teorileri ve Uygulamaları, Similasyon çalışmaları |
Bölüm | Makaleler |
Yazarlar | |
Yayımlanma Tarihi | 23 Aralık 2023 |
Gönderilme Tarihi | 29 Eylül 2023 |
Yayımlandığı Sayı | Yıl 2023 |