Bir Simülasyon Çalışması ile Cezalı Regresyon Yöntemlerinin Karşılaştırılması

Murat Genç

doi:10.35193/bseufbd.994181

Research Article

Comparison of Penalized Regression Methods through a Simulation Study

Year 2022, Volume: 9 Issue: 1, 80-91, 30.06.2022

Cited By: 2

Abstract

Penalized regression methods are often used to obtain stable coefficient estimates when a dataset suffers from multicollinearity. Depending on the form of the penalty term, these methods can also perform automatic variable selection. This study uses simulations to compare in detail the performance of ridge, LASSO, elastic net, and adaptive LASSO, four penalized regression methods widely used in the literature, under different structures of the true coefficient vector. The comparison criteria are the mean squared error on the test set, the misclassification rate, the false positive rate, and the active set size. The simulations show that the structure of the true coefficient vector has a significant effect on the performance of the models produced by these methods.

Keywords

Linear Regression, Ridge, Lasso, Elastic Net, Multicollinearity

The article lists 22 references in total.

Details

Primary Language	Turkish
Subjects	Engineering
Section	Articles
Authors	Murat Genç 0000-0002-6335-3044
Publication Date	30 June 2022
Submission Date	13 September 2021
Acceptance Date	7 March 2022
Published in Issue	Year 2022 Volume: 9 Issue: 1

Cite

APA	Genç, M. (2022). Bir Simülasyon Çalışması ile Cezalı Regresyon Yöntemlerinin Karşılaştırılması. Bilecik Şeyh Edebali Üniversitesi Fen Bilimleri Dergisi, 9(1), 80-91. https://doi.org/10.35193/bseufbd.994181

Bilecik Şeyh Edebali Üniversitesi Fen Bilimleri Dergisi

Cited By

Konveks ve konveks olmayan cezalı regresyon yöntemlerinin karşılaştırılması üzerine bir çalışma

Balıkesir Üniversitesi Fen Bilimleri Enstitüsü Dergisi

https://doi.org/10.25092/baunfbed.1299583

The Effect of the Second Stage Estimator on Model Performance in Post-LASSO Method

Turkish Journal of Science and Technology

https://doi.org/10.55525/tjst.1244925