Examination of Wording Effect of the TIMSS 2015 Mathematical Self-Esteem Scale Through the Bifactor Models
Year 2021, Volume: 8, Issue: 2, 326-341, 10.06.2021
Esra Oyar, Hakan Yavuz Atar
Abstract
The aim of this study is to examine whether the positively and negatively worded items in the Mathematical Self-Confidence Scale used in TIMSS 2015 produce a wording effect. The presence of a wording effect was examined both in the full sample and separately for female and male students. To this end, data from 5724 students from Turkey who participated in TIMSS 2015 were used. Six different measurement models were specified and tested with Confirmatory Factor Analysis (CFA). The study revealed that positive items have a higher mean than negative ones. In addition, the bifactor models fit the data better than the traditional CFA model, and the model in which negative items were treated as a separate factor fit the data best. This result held both in the full sample and in the female and male subgroups. In conclusion, it is recommended that scale items be written carefully and that researchers examine whether positive and negative items form separate factors.
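As an illustration only, a bifactor measurement model in which the negatively worded items load on an additional method factor can be written as follows. The notation is an assumption made for this sketch and is not taken from the article: $x_{ij}$ is the response of student $i$ to item $j$, $\eta^{G}_{i}$ is the general self-confidence factor, $\eta^{N}_{i}$ is the factor specific to negatively worded items, and $\delta_j$ is an indicator that equals 1 for negatively worded items and 0 otherwise.

```latex
\begin{aligned}
x_{ij} &= \tau_j + \lambda^{G}_{j}\,\eta^{G}_{i}
          + \delta_j\,\lambda^{N}_{j}\,\eta^{N}_{i} + \varepsilon_{ij},\\
\delta_j &=
  \begin{cases}
    1 & \text{if item } j \text{ is negatively worded},\\
    0 & \text{otherwise},
  \end{cases}
\qquad
\operatorname{Cov}\!\left(\eta^{G}_{i}, \eta^{N}_{i}\right) = 0 .
\end{aligned}
```

Under this sketch, fixing every $\lambda^{N}_{j}$ to zero gives a single-factor CFA model, so a clear gain in fit when the $\lambda^{N}_{j}$ are estimated freely is what would point to a wording effect.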