Research Article

Normal Distribution Dilemma

Year 2022, Volume: 12 Issue: 1, 220 - 248, 07.01.2022
https://doi.org/10.18039/ajesi.962653

Abstract

Researchers examine assumptions before performing most hypothesis tests. A common assumption is that the data are normally distributed. However, normality tests and descriptive statistics often give conflicting signals, making it difficult for researchers to decide whether the data are normally distributed. The aim of the study was to compare univariate normality tests (Anderson-Darling, Cramer-von Mises, Jarque-Bera, Kolmogorov-Smirnov, Lilliefors, Pearson chi-square, Shapiro-Francia, and Shapiro-Wilk) and descriptive statistics used to assess normality (standard values of the skewness and kurtosis coefficients, and skewness coefficient/standard error) according to the skewness value, the sample size, and whether the data were continuous or categorical. The research was a Monte Carlo simulation study. The simulation conditions were defined by the skewness coefficient (-2.5, -1.0, 0.0, 1.0, and 2.5), the sample size (20, 30, 50, 100, 500, 1000, and 5000), and whether the data were continuous or ordinal (with 2, 3, 4, 5, or 7 categories). Crossing these factors fully (5 skewness values x 7 sample sizes x 6 data types) gave 210 simulation conditions. The evaluation criteria were type-1 error and power. The results showed that Jarque-Bera, the standard value of the skewness coefficient, skewness coefficient/standard error, and the standard value of the kurtosis coefficient performed better in terms of type-1 error. In terms of power, all methods lost power when the sample size was small, the data were continuous, and the skewness coefficient was -1 or +1.
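
The design described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: it enumerates the fully crossed conditions and estimates an empirical type-1 error rate for the skewness coefficient/standard error rule on normally distributed data. The function names, the 1.96 cutoff, the replication count, and the SPSS-style skewness and standard-error formulas are all assumptions, since the abstract does not specify them.

```python
import itertools
import math
import random

# Simulation factors as listed in the abstract.
SKEWNESS = [-2.5, -1.0, 0.0, 1.0, 2.5]
SAMPLE_SIZES = [20, 30, 50, 100, 500, 1000, 5000]
DATA_TYPES = ["continuous", 2, 3, 4, 5, 7]  # continuous, or number of ordinal categories


def design_conditions():
    """Return every combination of the three factors (fully crossed design)."""
    return list(itertools.product(SKEWNESS, SAMPLE_SIZES, DATA_TYPES))


def sample_skewness(x):
    """Adjusted Fisher-Pearson skewness G1 (assumed; the form SPSS reports)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


def skewness_se(n):
    """Standard error of G1 under normality (assumed; SPSS-style formula)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def rejects_normality(x, critical=1.96):
    """Hypothetical decision rule: reject when |skewness / SE| > critical."""
    return abs(sample_skewness(x) / skewness_se(len(x))) > critical


def type1_error(n, replications=2000, seed=1):
    """Share of truly normal samples that the rule wrongly flags as non-normal."""
    rng = random.Random(seed)
    rejections = sum(
        rejects_normality([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(replications)
    )
    return rejections / replications


if __name__ == "__main__":
    print(len(design_conditions()))  # 5 * 7 * 6 = 210 conditions
    print(type1_error(30))           # empirical rejection rate on normal data
```

Power would be estimated the same way, except that the samples are drawn from a distribution with a nonzero target skewness instead of from the normal distribution; generating such data is outside this sketch.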

Normal Dağılım İkilemi

Abstract

Researchers examine a test's assumptions before running most hypothesis tests. A frequently encountered assumption is that the data are normally distributed. However, normality tests and descriptive statistics often leave researchers in a dilemma, making it hard to decide whether the data are normally distributed. Accordingly, the aim of this study is to compare univariate normality tests (Anderson-Darling, Cramer-von Mises, Jarque-Bera, Kolmogorov-Smirnov, Lilliefors, Pearson chi-square, Shapiro-Francia, and Shapiro-Wilk) and the descriptive statistics used for normality (standard value of the skewness coefficient, standard value of the kurtosis coefficient, skewness coefficient/standard error) according to the skewness coefficient, the sample size, and whether the data are continuous or ordinal. The study is a Monte Carlo simulation study. The simulation conditions were defined by the skewness coefficient (-2.5, -1.0, 0.0, 1.0, 2.5), the sample size (20, 30, 50, 100, 500, 1000, and 5000), and whether the data are continuous or ordinal (with 2, 3, 4, 5, or 7 categories). A fully crossed design yielded 210 simulation conditions. The evaluation criteria were type-1 error and power. The results showed that, in most conditions, Jarque-Bera, the standard value of the skewness coefficient, skewness coefficient/standard error, and the standard value of the kurtosis coefficient had lower type-1 error and higher power than the other methods. The power of all methods dropped under conditions where the sample was small, the data were continuous, and the skewness coefficient was -1 or +1.
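
For orientation, two of the criteria named in the abstract can be written out explicitly. The formulas below are the commonly used textbook forms and are an assumption here, since the abstract does not state which variants the study used: the skewness coefficient divided by its standard error (with the standard error as reported by SPSS), and the Jarque-Bera statistic.

```latex
% Skewness coefficient / standard error (G_1: sample skewness, n: sample size)
z_{G_1} = \frac{G_1}{SE_{G_1}},
\qquad
SE_{G_1} = \sqrt{\frac{6\,n\,(n-1)}{(n-2)(n+1)(n+3)}}

% Jarque-Bera statistic (S: moment-based skewness, K: moment-based kurtosis);
% asymptotically chi-square distributed with 2 degrees of freedom under normality
JB = \frac{n}{6}\left(S^{2} + \frac{(K-3)^{2}}{4}\right)
```

As a worked example of the first formula, for n = 20 the standard error is about 0.512, so a sample skewness of 1.0 gives a ratio of about 1.95, which sits right at the conventional 1.96 cutoff.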

Details

Primary Language: Turkish
Journal Section: Research Article

Authors:
İbrahim Uysal (ORCID: 0000-0002-6767-0362)
Abdullah Kılıç (ORCID: 0000-0003-3129-1763)

Publication Date: January 7, 2022
Submission Date: July 5, 2021
Published in Issue: Year 2022, Volume 12, Issue 1

Cite

APA: Uysal, İ., & Kılıç, A. (2022). Normal Dağılım İkilemi. Anadolu Journal of Educational Sciences International, 12(1), 220-248. https://doi.org/10.18039/ajesi.962653
AMA: Uysal İ, Kılıç A. Normal Dağılım İkilemi. AJESI. January 2022;12(1):220-248. doi:10.18039/ajesi.962653
Chicago: Uysal, İbrahim, and Abdullah Kılıç. “Normal Dağılım İkilemi”. Anadolu Journal of Educational Sciences International 12, no. 1 (January 2022): 220-48. https://doi.org/10.18039/ajesi.962653.
EndNote: Uysal İ, Kılıç A (January 1, 2022) Normal Dağılım İkilemi. Anadolu Journal of Educational Sciences International 12 1 220–248.
IEEE: İ. Uysal and A. Kılıç, “Normal Dağılım İkilemi”, AJESI, vol. 12, no. 1, pp. 220–248, 2022, doi: 10.18039/ajesi.962653.
ISNAD: Uysal, İbrahim - Kılıç, Abdullah. “Normal Dağılım İkilemi”. Anadolu Journal of Educational Sciences International 12/1 (January 2022), 220-248. https://doi.org/10.18039/ajesi.962653.
JAMA: Uysal İ, Kılıç A. Normal Dağılım İkilemi. AJESI. 2022;12:220–248.
MLA: Uysal, İbrahim and Abdullah Kılıç. “Normal Dağılım İkilemi”. Anadolu Journal of Educational Sciences International, vol. 12, no. 1, 2022, pp. 220-48, doi:10.18039/ajesi.962653.
Vancouver: Uysal İ, Kılıç A. Normal Dağılım İkilemi. AJESI. 2022;12(1):220-48.
