Investigating Invariant Item Ordering in Intelligence Tests: Mokken Scale Analysis of KBIT-2

Eren Halil Özberk; Elif Bengi Ünsal Özberk; Sait Uluç; Ferhunde Öktem

doi:10.21449/ijate.858183

Research Article

Year 2021, Volume: 8 Issue: 3, 714 - 728, 05.09.2021

Abstract

The Kaufman Brief Intelligence Test – Second Edition (KBIT-2) is designed to measure verbal and nonverbal abilities in individuals from 4 years 0 months to 90 years 11 months of age. This study examines both the advantages of using Mokken Scale Analysis (MSA) with intelligence tests and the hierarchical ordering of the items in the Turkish form of the KBIT-2. To do so, it estimates the parameters of each of the three subtests, tests the dimensionality of the subtests, and checks the Invariant Item Ordering (IIO) assumption. A total of 2850 children, adolescents, and adults participated in the study; their ages ranged from 48 months (4 years 0 months) to 539 months (44 years 11 months). The Automated Item Selection Procedure (AISP) was applied to assess unidimensionality under three lower bounds: 0.30, 0.40, and 0.55. The items of all three subtests formed a unidimensional scale. The Backward Item Selection (BIS) procedure identified seven items in the Matrices subtest, 17 items in the Verbal Knowledge subtest, and six items in the Riddles subtest that violated the IIO criteria. Reliability values obtained using MSA show that all three KBIT-2 subtests have a high degree of internal consistency. However, care should be taken when the IIO assumption does not hold for intelligence scales in their original form.
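
As a rough illustration of what the lower bounds refer to: in MSA, scalability is usually summarized with Loevinger's H coefficient, and the AISP admits items only when their scalability reaches a chosen lower bound. The Python sketch below computes the overall scale H for dichotomous (0/1) scores and compares it with the three lower bounds named in the abstract. The function name `scale_h` and the toy data are hypothetical; this is not the authors' code and not the full AISP, which selects items step by step using item-level coefficients.

```python
from itertools import combinations


def scale_h(data):
    """Loevinger's scale H for rows of 0/1 item scores (one row per respondent)."""
    n = len(data)
    k = len(data[0])
    # Item popularities: proportion of respondents scoring 1 on each item.
    p = [sum(row[i] for row in data) / n for i in range(k)]
    cov_sum = 0.0
    cov_max_sum = 0.0
    for i, j in combinations(range(k), 2):
        # Observed proportion scoring 1 on both items.
        p_ij = sum(row[i] * row[j] for row in data) / n
        # Observed covariance of the item pair.
        cov_sum += p_ij - p[i] * p[j]
        # Largest covariance possible given the two popularities.
        cov_max_sum += min(p[i], p[j]) - p[i] * p[j]
    if cov_max_sum == 0:
        raise ValueError("H is undefined when no item pair can covary")
    return cov_sum / cov_max_sum


# Hypothetical toy data: a perfect Guttman pattern (no ordering errors).
guttman = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
# The same data plus one respondent who passes only the hardest item.
noisy = guttman + [[0, 0, 1]]

LOWER_BOUNDS = (0.30, 0.40, 0.55)  # the three bounds named in the abstract

for name, data in (("guttman", guttman), ("noisy", noisy)):
    h = scale_h(data)
    met = [c for c in LOWER_BOUNDS if h >= c]
    print(name, round(h, 3), "lower bounds met:", met)
```

A single inconsistent response pattern in a sample this small is enough to pull H below all three bounds, which gives a sense of why the choice of lower bound matters when items are selected.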

Keywords

Mokken scale analysis, Intelligence tests, Invariant item ordering


Details

Primary Language	English
Subjects	Studies on Education
Journal Section	Articles
Authors	Eren Halil Özberk (ORCID 0000-0003-2136-3081), Elif Bengi Ünsal Özberk (ORCID 0000-0003-3605-3983), Sait Uluç (ORCID 0000-0002-7048-8545), Ferhunde Öktem (ORCID 0000-0001-6971-6822)
Publication Date	September 5, 2021
Submission Date	January 11, 2021
Published in Issue	Year 2021 Volume: 8 Issue: 3

Cite

APA	Özberk, E. H., Ünsal Özberk, E. B., Uluç, S., & Öktem, F. (2021). Investigating Invariant Item Ordering in Intelligence Tests: Mokken Scale Analysis of KBIT-2. International Journal of Assessment Tools in Education, 8(3), 714–728. https://doi.org/10.21449/ijate.858183
