Equality of admission tests using kernel equating under the non-equivalent groups with covariates design

Özge Altıntaş; Gabriel Wallin

doi:10.21449/ijate.976660

Research Article

Year 2021, Volume 8, Issue 4, 729-743, December 4, 2021

Cited By: 4

Abstract

Educational assessment tests are designed to measure the same psychological constructs over extended periods of time. This feature is important because test results are often used in selecting applicants for admission to university programs. To ensure fair assessments, especially for assessments whose results weigh heavily in selection decisions, it is necessary to collect evidence that the assessments are not biased and to confirm that the scores obtained from different test forms are statistically equivalent. Test equating serves this purpose: it prevents bias arising from differences in the difficulty of test forms, allows scores from different forms to be reported on the same scale, and ensures that the reported scores carry the same meaning. In this study, these functions were evaluated using real college admission test data from different test administrations. The kernel equating method under the non-equivalent groups with covariates design was applied to determine whether scores obtained in different time periods, but measuring the same psychological constructs, were statistically equivalent. The non-equivalent groups with covariates design was used because the groups taking the admission test are non-equivalent and there are no anchor items. The analyses showed that the test forms had different score distributions and that the relationship between them was non-linear. The equating procedure was therefore adjusted to eliminate these differences, allowing the tests to be used interchangeably.

Keywords

Kernel equating, Non-equivalent groups design, NEC design, Background variables, Admission tests

Details

Primary Language	English
Subjects	Studies on Education
Section	Articles
Authors	Özge Altıntaş 0000-0001-5779-855X; Gabriel Wallin 0000-0002-7930-6701
Publication Date	December 4, 2021
Submission Date	March 3, 2021
Published in Issue	Year 2021

Cite

APA	Altıntaş, Ö., & Wallin, G. (2021). Equality of admission tests using kernel equating under the non-equivalent groups with covariates design. International Journal of Assessment Tools in Education, 8(4), 729-743. https://doi.org/10.21449/ijate.976660

Cited By

Test score equating of multiple-choice mathematics items: techniques from characteristic curve of modern psychometric theory

Discover Education

https://doi.org/10.1007/s44217-023-00052-z

A Comparison of Covariates, Equating Designs, and Methods in Equating TIMSS 2019 Science Tests

Participatory Educational Research

https://doi.org/10.17275/per.23.74.10.5

Model Misspecification and Robustness of Observed-Score Test Equating Using Propensity Scores

Journal of Educational and Behavioral Statistics

https://doi.org/10.3102/10769986231161575

Historical Perspectives on Score Comparability Issues Raised by Innovations in Testing

Journal of Educational Measurement

https://doi.org/10.1111/jedm.12318
