Year 2022, Volume: 71 Issue: 2, 601 - 615, 30.06.2022
Gizem Yıldırım, Selahattin Kaçıranlar, Hasan Yıldırım
Poisson and negative binomial regression models for zero-inflated data: an experimental study
Abstract
Count data regression has been widely used in various disciplines, particularly in the health field. Classical models such as Poisson and negative binomial regression may not perform adequately in the presence of excess zeros and overdispersion. Zero-inflated and hurdle variants of these models can remedy these problems. In addition, alternatives based on biased estimators such as ridge and Liu may improve performance under multicollinearity, a problem distinct from excess zeros and overdispersion. In this study, ten regression models, comprising classical Poisson and negative binomial regression and their zero-inflated, hurdle, ridge and Liu variants, are compared on a health data set. The Akaike information criterion, log-likelihood value, mean squared error and mean absolute error are used to assess model performance. The results show that the zero-inflated negative binomial regression model provides the best fit for the data. The final model estimates are obtained via this model and interpreted in detail. Finally, the experimental results suggest that models other than the classical ones should be considered as powerful alternatives for modelling count data, and that they can give applied researchers better insight when working with similar data structures.
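The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the study's analysis: it uses an intercept-only Poisson model and an intercept-only zero-inflated Poisson model (no covariates, no negative binomial, hurdle, ridge or Liu variants), a small made-up sample, and hypothetical function names. It only shows how the log-likelihood and the Akaike information criterion can separate the two models when zeros are in excess.

```python
import math

def poisson_loglik(y, lam):
    """Log-likelihood of an i.i.d. Poisson(lam) sample."""
    return sum(-lam + k * math.log(lam) - math.lgamma(k + 1) for k in y)

def zip_loglik(y, pi, lam):
    """Log-likelihood of a zero-inflated Poisson sample.

    pi is the probability of a structural zero; otherwise the count
    comes from Poisson(lam).
    """
    total = 0.0
    for k in y:
        pois = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        if k == 0:
            total += math.log(pi + (1 - pi) * pois)
        else:
            total += math.log((1 - pi) * pois)
    return total

def fit_zip(y, iters=200):
    """Fit the intercept-only zero-inflated Poisson model by EM."""
    n = len(y)
    n_zero = sum(1 for k in y if k == 0)
    pi, lam = 0.5, max(sum(y) / n, 1e-8)  # starting values
    for _ in range(iters):
        # E-step: posterior probability that an observed zero is structural
        p0 = pi / (pi + (1 - pi) * math.exp(-lam))
        z = n_zero * p0  # expected number of structural zeros
        # M-step: update the mixing weight and the Poisson mean
        pi = z / n
        lam = sum(y) / (n - z)
    return pi, lam

def aic(loglik, n_params):
    """Akaike information criterion: smaller is better."""
    return 2 * n_params - 2 * loglik

# Hypothetical toy sample with many zeros (not the data used in the study).
y = [0] * 12 + [1, 2, 2, 3, 3, 4, 5, 6]

lam_hat = sum(y) / len(y)  # Poisson maximum likelihood estimate
ll_pois = poisson_loglik(y, lam_hat)

pi_hat, lam_zip = fit_zip(y)
ll_zip = zip_loglik(y, pi_hat, lam_zip)

aic_pois = aic(ll_pois, 1)  # one parameter: lam
aic_zip = aic(ll_zip, 2)    # two parameters: pi and lam

print(f"Poisson: logLik={ll_pois:.2f}, AIC={aic_pois:.2f}")
print(f"ZIP:     logLik={ll_zip:.2f}, AIC={aic_zip:.2f}")
```

On this toy sample the zero-inflated model attains the higher log-likelihood and the lower AIC, which is the pattern such criteria are meant to detect. A regression version would replace the constants `pi` and `lam` with functions of covariates.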
References
- Dobson, A. J., Barnett, A. G., An Introduction to Generalized Linear Models, Chapman and Hall/CRC, 2008. https://doi.org/10.1201/9781315182780
- Chatterjee, S., Hadi, A. S., Regression Analysis by Example, John Wiley & Sons, 2015. https://doi.org/10.1002/0470055464
- Cameron, A. C., Trivedi, P. K., Regression Analysis of Count Data, 6th edn., Cambridge University Press, New York, 2007. http://dx.doi.org/10.1017/CBO9780511814365
- Hoerl, A. E., Kennard, R. W., Ridge regression: biased estimation for nonorthogonal problems, Technometrics, 12 (1970), 55–67. https://doi.org/10.1080/00401706.1970.10488634
- Liu, K., A new class of biased estimate in linear regression, Commun. Stat. Theory Methods, 22 (1993), 393-402. https://doi.org/10.1080/03610929308831027
- Mansson, K., Shukur, G., A Poisson ridge regression estimator. Econ. Model., 28 (2011), 1475-1481. https://doi.org/10.1016/j.econmod.2011.02.030
- Mansson, K., On ridge estimators for the negative binomial regression model, Econ. Model., 29 (2012), 178-184. https://doi.org/10.1016/j.econmod.2011.09.009
- Mansson, K., Kibria, B. M. G., Sjolander, P., Shukur, G., Improved Liu estimators for the poisson regression model. Int. J. Stat. Probab., 1 (2012), 2-6. https://doi.org/10.5539/ijsp.v1n1p2
- Mansson, K., Developing a Liu estimator for the negative binomial regression model: method and application. J. Stat. Comput. Simul., 83 (2013), 1773-1780. https://doi.org/10.1080/00949655.2012.673127
- Wang, W., Famoye, F., Modeling household fertility decisions with generalized Poisson regression, Journal of Population Economics, 10(3) (1997), 273-283. https://doi.org/10.1007/s001480050043
- Famoye, F., Singh, K. P., Zero-inflated generalized Poisson regression model with an application to domestic violence data, Journal of Data Science, 4(1) (2006), 117-130. https://doi.org/10.6339/JDS.2006.04(1).257
- Bandyopadhyay, D., DeSantis, S. M., Korte, J. E., Brady, K. T., Some considerations for excess zeroes in substance abuse research, The American Journal of Drug and Alcohol Abuse, 37(5) (2011), 376-382. https://doi.org/10.3109/00952990.2011.568080
- Buu, A., Johnson, N. J., Li, R., Tan, X., New variable selection methods for zero-inflated count data with applications to the substance abuse field, Statistics in Medicine, 30(18) (2011), 2326-2340. https://doi.org/10.1002/sim.4268
- Mouatassim, Y., Ezzahid, E. H., Poisson regression and zero-inflated Poisson regression: application to private health insurance data, European Actuarial Journal, 2(2) (2012), 187-204. https://doi.org/10.1007/s13385-012-0056-2
- Xie, H., Tao, J., McHugo, G. J., Drake, R. E., Comparing statistical methods for analyzing skewed longitudinal count data with many zeros: An example of smoking cessation, Journal of Substance Abuse Treatment, 45(1) (2013), 99-108. https://doi.org/10.1016/j.jsat.2013.01.005
- Liyanage, T., Ninomiya, T., Jha, V., Neal, B., Patrice, H. M., Okpechi, I., Zhao, M-H., Lv, J., Garg, A. X., Knight, J., Rodgers, A., Gallagher, M., Kotwal, S., Cass, A., Perkovic, V., Worldwide access to treatment for end-stage kidney disease: a systematic review, The Lancet, 385(9981) (2015), 1975-1982. https://doi.org/10.1016/s0140-6736(14)61601-9
- Martinez, F. J., Calverley, P. M., Goehring, U. M., Brose, M., Fabbri, L. M., Rabe, K. F., Effect of roflumilast on exacerbations in patients with severe chronic obstructive pulmonary disease uncontrolled by combination therapy (REACT): a multicentre randomised controlled trial, The Lancet, 385(9971) (2015), 857-866. https://doi.org/10.1183/13993003.00158-2017
- Oliveira, M., Einbeck, J., Higueras, M., Ainsbury, E., Puig, P., Rothkamm, K., Zero-inflated regression models for radiation-induced chromosome aberration data: A comparative study, Biometrical Journal, 58(2) (2016), 259-279. https://doi.org/10.1002/bimj.201400233
- Tang, Y., Liu, W., Xu, A., Statistical inference for zero-and-one-inflated poisson models, Statistical Theory and Related Fields, 1(2) (2017), 216-226. https://doi.org/10.1002/bimj.201400233
- Chai, T., Xiong, D., Weng, J., A zero-inflated negative binomial regression model to evaluate ship sinking accident mortalities, Transportation Research Record, 2672(11) (2018), 65–72. https://doi.org/10.1177
- Deb, P., Trivedi, P. K., Demand for medical care by the elderly: a finite mixture approach, Journal of Applied Econometrics, 12(3) (1997), 313-336. http://www.jstor.org/stable/2285252?origin=JSTOR-pdf
- Garthwaite, P. H., Jolliffe, I. T., Jones, B., Statistical Inference, Oxford University Press, Oxford, 2002. https://doi.org/10.1017/S0025557200173425
- Hilbe, J. M., Negative Binomial Regression, Cambridge University Press, Cambridge, 2011. https://doi.org/10.1017/CBO9780511973420
- Lambert, D., Zero-inflated Poisson regression with an application to defects in manufacturing, Technometrics, 34 (1992), 1-14. https://doi.org/10.2307/1269547
- https://www.jstatsoft.org/article/view/v016i09
- Core Team, R., R: A language and environment for statistical computing, Vienna: Austria: R foundation for Statistical Computing, (2016). http://www.R-project.org/
- Vuong, Q. H., Likelihood ratio tests for model selection and nonnested hypotheses, Econometrica, 57(2) (1989), 307-333. https://doi.org/10.2307/1912557
- Venables, W. N., Ripley, B. D., Modern Applied Statistics with S, Fourth Edition, Springer, New York, 2002. https://www.stats.ox.ac.uk/pub/MASS4/
- Zeileis, A., Hothorn, T., Diagnostic checking in regression relationships, R News, 2(3) (2002), 7-10. https://CRAN.R-project.org/doc/Rnews/
- Zeileis, A., Kleiber, C., Jackman, S., Regression models for count data in R, Journal of Statistical Software, 27(8) (2008), 1-25. http://www.jstatsoft.org/v27/i08/