How many grades of response categories does the commitment to the profession of medicine scale provide the most information?

Murat Tekin; Çetin Toraman; Ayşen Melek Aytuğ Koşan

doi:10.21449/ijate.1400157

Research Article

Year 2024, Volume: 11 Issue: 3, 524 - 536, 09.09.2024

Abstract

In the present study, we used Item Response Theory (IRT) to examine the psychometric properties of data obtained from the Commitment to Profession of Medicine Scale (CPMS) administered with 4-point, 5-point, 6-point, and 7-point response sets. A total of 2150 medical students from 16 different universities participated in the study. The participants were divided into four groups of 560, 544, 502, and 544 medical students. The first group (n=560) was assigned the four-point, the second group (n=544) the five-point, the third group (n=502) the six-point, and the fourth group (n=544) the seven-point Likert form. We used R statistical software to analyze the data and examined the results of item calibrations conducted with the Graded Response Model (GRM). The results show that the eigenvalue increased from the 4-point to the 7-point form. Similarly, the percentage of explained variance and the scale's reliability increased gradually from the 4-point to the 7-point form. The explained variance, reliability level, and eigenvalue were very close in the 5-point and 6-point forms.
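
How the number of response categories enters a Graded Response Model can be illustrated with a minimal sketch. The Python code below is not the authors' analysis (the study calibrated its items in R); it only computes GRM category probabilities and item information for a single item. The discrimination value, the two threshold sets, and the function names are hypothetical and chosen purely for illustration.

```python
import math


def _cumulative(theta, a, thresholds):
    """Cumulative curves P*(X >= k), padded with 1.0 and 0.0 at the ends."""
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    return [1.0] + inner + [0.0]


def grm_category_probs(theta, a, thresholds):
    """Probability of each response category under the graded response model.

    thresholds must be in ascending order; K - 1 thresholds give K categories.
    """
    cum = _cumulative(theta, a, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def grm_item_information(theta, a, thresholds):
    """Item information at theta: sum over categories of (dP_k/dtheta)^2 / P_k."""
    cum = _cumulative(theta, a, thresholds)
    # Derivative of each cumulative curve; it is zero for the padded ends.
    dcum = [a * p * (1.0 - p) for p in cum]
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]
        dp = dcum[k] - dcum[k + 1]
        if p > 0.0:
            info += dp * dp / p
    return info


# Hypothetical item parameters (assumptions, not estimates from the study).
a = 1.5
four_point = [-1.5, 0.0, 1.5]                      # 3 thresholds -> 4 categories
seven_point = [-2.0, -1.2, -0.4, 0.4, 1.2, 2.0]    # 6 thresholds -> 7 categories

for label, thresholds in [("4-point", four_point), ("7-point", seven_point)]:
    curve = [grm_item_information(t / 2.0, a, thresholds) for t in range(-6, 7)]
    print(label, [round(v, 3) for v in curve])
```

Printing the information values over a grid of trait levels for each threshold set gives a rough picture of how the information curve changes with the number of categories; the actual curves for the CPMS depend on the parameters estimated in the study.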

Keywords

Likert scale, Response set, Item response theory, Medical student

References

Adelson, J.L., & McCoach, D.B. (2010). Measuring the mathematical attitudes of elementary students: The effects of a 4-point or 5-point Likert-Type scale. Educational and Psychological Measurement, 70(5), 796-807. https://doi.org/10.1177/0013164410366694
Aiken, L.R. (1983). Number of response categories and statistics on a teacher rating scale. Educational and Psychological Measurement, 43, 397-401.
Anastasi, A., & Urbina, S. (1997). Psychological testing. Prentice-Hall International, Inc.
Aybek, E.C., & Toraman, C. (2022). How many response categories are sufficient for Likert type scales? An empirical study based on the Item Response Theory. International Journal of Assessment Tools in Education, 9(2), 534-547. https://doi.org/10.21449/ijate.1132931
Aytug Kosan, A.M., & Toraman, C. (2020). Development and application of the commitment to profession of medicine scale using classical test theory and item response theory. Croatian Medical Journal, 61(5), 391-400. https://doi.org/10.3325/cmj.2020.61.391
Bora, B. (2013). A study on the applicability of the likert type scales in marketing. Doctoral Thesis. Sakarya University. Sakarya.
Browne, M.W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K.A. Bollen & J.S. Long (Eds.), Testing structural equation models (pp. 136-162). SAGE.
Chalmers, R.P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1-29. https://doi.org/10.18637/jss.v048.i06
Champney, H., & Marshall, H. (1939). Optimal refinement of the rating scale. Journal of Applied Psychology, 23, 323-331.
Chang, L. (1994). A psychometric evaluation of 4-point and 6-point Likert-type scales in relation to reliability and validity. Applied Psychological Measurement, 18(3), 205-215. https://doi.org/10.1177/014662169401800302
Dawes, J. (2008). Do data characteristics change according to the number of scale points used? An experiment using 5-point, 7-point and 10-point scales. International Journal of Market Research, 50(1), 61-104. https://doi.org/10.1177/147078530805000106
DeVellis, R.F. (2003). Scale development, theory and applications. SAGE Publications.
Dunn-Rankin, P., Knezek, G.A., Wallace, S., & Zhang, S. (2004). Scaling methods. Lawrence Erlbaum Associates, Inc.
Flannery, W.P., Reise, S.P., & Widaman, K.F. (1995). An item response theory analysis of the general and academic scales of the self-description questionnaire II. Journal of Research in Personality, 29(2), 168-188. https://doi.org/10.1006/jrpe.1995.1010
Fornell, C., & Larcker, D.F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39-50. https://doi.org/10.2307/3151312
Hair, J.F., Black, W.C., Babin, B.J., & Anderson, R.E. (2014). Multivariate data analysis. Pearson Education Limited.
Hambleton, R.K. (1994). Guidelines for adapting educational and psychological tests: A progress report. European Journal of Psychological Assessment, 10(3), 229-244.
Hu, L., & Bentler, P.M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Jamieson, S. (2004). Likert scales: How to (ab)use them. Medical Education, 38, 1217-1218.
Joshi, A., Kale, S., Chandel, S., & Pal, D.K. (2015). Likert scale: Explored and explained. British Journal of Applied Science & Technology (BJAST), 7(4), 396-403. https://doi.org/10.9734/BJAST/2015/14975
Korkmaz, S., Goksuluk, D., & Zararsiz, G. (2014). MVN: An R package for assessing multivariate normality. The R Journal, 6(2), 151-162. https://doi.org/10.32614/RJ-2014-031
Leung, S.O. (2011). A comparison of psychometric properties and normality in 4-, 5-, 6-, and 11-point Likert Scales. Journal of Social Service Research, 37, 412-421. https://doi.org/10.1080/01488376.2011.580697
Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 22(140), 55.
Lord, F.M. (1954). Chapter II: Scaling. Review of Educational Research, 24(5), 375-392. https://doi.org/10.3102/00346543024005375
Mariano, L.T., Phillips, A., Estes, K., & Kilburn, R. (2024). Should survey Likert Scales include neutral response categories? Evidence from a randomized school climate survey. Working Paper. Rand Corporation. https://www.rand.org/content/dam/rand/pubs/working_papers/WRA3100/WRA3135-2/RAND_WRA3135-2.pdf
Norman, G. (2010). Likert scales, levels of measurement and the “laws” of statistics. Advances in Health Sciences Education, 15, 625-632. https://doi.org/10.1007/s10459-010-9222-y
Nunnally, J.C., & Bernstein, I.H. (1994). Psychometric theory. McGraw-Hill, Inc.
Preston, C.C., & Colman, A.M. (2000). Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences. Acta Psychologica, 104, 1-15. https://doi.org/10.1016/s0001-6918(99)00050-5
Price, L.R. (2017). Psychometric methods, theory into practice. The Guilford Press.
R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/
Revelle, W. (2021). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston, Illinois. R package version 2.1.6. https://CRAN.R-project.org/package=psych
Samejima, F. (2005). Graded response model. In K. Kempf-Leonard (Ed.), Encyclopedia of social measurement (pp. 145-153). Elsevier. https://doi.org/10.1016/B0-12-369398-5/00451-5
Smits, N., Öğreden, O., Garnier-Villarreal, M., Terwee, C.B., & Chalmers, R.P. (2020). A study of alternative approaches to non-normal latent trait distributions in item response theory models used for health outcome measurement. Statistical Methods in Medical Research, 29(4), 1030-1048. https://doi.org/10.1177/0962280220907625
Sodano, S.M., Tracey, T.J.G., & Hafkenscheid, A. (2014). A brief Dutch language Impact Message Inventory-Circumplex (IMI-C Short) using non-parametric item response theory. Psychotherapy Research, 24(5), 616-628. https://doi.org/10.1080/10503307.2013.847984
Stevens, S.S. (1946). On the theory of scales of measurement. Science, 103, 677-680.
Thomas, H. (1982). IQ interval scales, and normal distributions. Psychological Bulletin, 91, 198-202.
Torgerson, W.S. (1958). Theory and methods of scaling. John Wiley & Sons, Inc.
Warner, R.M. (2013). Applied statistics, from bivariate through multivariate techniques. SAGE Publications, Inc.
Wakita, T., Ueshima, N., & Noguchi, H. (2012). Psychological distance between categories in the Likert scale: Comparing different numbers of options. Educational and Psychological Measurement, 72(4), 533-546. https://doi.org/10.1177/0013164411431162
Wong, C.-S., Chuen, K.-C., & Fung, M.-Y. (1993). Differences between odd and even number of response scales: Some empirical evidence. Chinese Journal of Psychology, 35, 75-86.
Wu, H., & Leung, S.O. (2017). Can Likert Scales be treated as interval scales? A simulation study. Journal of Social Service Research, 43(4), 527-532. https://doi.org/10.1080/01488376.2017.1329775
Yen, W.M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30(3), 187-213. https://doi.org/10.1111/j.1745-3984.1993.tb00423.x
Yen, W.M. (1984). Effects of local item dependence on the fit and equating performance of the three parameter logistic model. Applied Psychological Measurement, 8, 125-145.

There are 43 citations in total.

Details

Primary Language	English
Subjects	Measurement Theories and Applications in Education and Psychology
Journal Section	Articles
Authors	Murat Tekin (ORCID 0000-0001-6841-3045); Çetin Toraman (ORCID 0000-0001-5319-0731); Ayşen Melek Aytuğ Koşan (ORCID 0000-0001-5298-2032)
Early Pub Date	August 27, 2024
Publication Date	September 9, 2024
Submission Date	December 4, 2023
Acceptance Date	July 24, 2024
Published in Issue	Year 2024 Volume: 11 Issue: 3

Cite

APA	Tekin, M., Toraman, Ç., & Aytuğ Koşan, A. M. (2024). How many grades of response categories does the commitment to the profession of medicine scale provide the most information? International Journal of Assessment Tools in Education, 11(3), 524-536. https://doi.org/10.21449/ijate.1400157
