Quantifying the Impact of Risk Factors on Direct Compensation Property Damage in Canadian Automobile Insurance

Pervin Baylan; Neslihan Demirel

doi:10.51541/nicel.1397941

Research Article

Year 2024, Volume: 6 Issue: 1, 103 - 127, 30.06.2024

Abstract

This study presents a statistical analysis of how various risk factors affect direct compensation property damage (DCPD) claims in private passenger vehicle accidents. Using automobile insurance data from Ontario, Canada, for the ten-year period from 2003 to 2012, property damage was modeled with a generalized linear mixed model with a binary logit link, taking into account the imbalance between the classes of insureds. The results indicate that several risk factors, including vehicle use, driver training, outstanding loss, and incurred loss, have a significant impact on the likelihood of DCPD claims. The effects of these risk factors were observed under the weights (the number of trials used to generate each success proportion) in the different classes of insureds. Generalized linear mixed models (GLMMs) are a powerful tool for quantifying the impact of risk factors on binary outcomes, here DCPD claims and property damage (PD) claims covered by third-party liability (TPL) insurance. By identifying the most significant risk factors, these models can also inform insurance underwriting and policy design. Performance metrics chosen to account for the class imbalance in the binary outcome confirm the resulting model's ability to predict classes accurately. The F1 score, a metric of classification performance, was 0.934, and PR AUC, the area under the Precision-Recall (PR) curve, was 0.953. These high scores indicate that the resulting model classifies well, and the other metrics also support its classification accuracy. The findings can help insurers better understand the underlying drivers of property damage and develop more accurate and effective risk mitigation strategies. Furthermore, this study highlights the importance of developing class-specific risk assessment models to account for the imbalance across different classes.
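For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how an F1 score and an area under the Precision-Recall curve can be computed for a binary classifier. It is an illustration only: the labels, scores, and the 0.5 threshold are hypothetical toy values, not the Ontario data, and the PR AUC is estimated here by average precision, which is one common estimator and not necessarily the one the authors used.

```python
from typing import List


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_precision(y_true: List[int], scores: List[float]) -> float:
    """Step-wise estimate of the area under the PR curve.

    Observations are ranked by decreasing score; precision is averaged over
    the ranks at which a positive is found. Tied scores are not treated
    specially in this sketch.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 1, 0, 1, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]  # predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 cut-off

f1 = f1_score(y_true, y_pred)
ap = average_precision(y_true, scores)
print(f"F1 = {f1:.3f}, PR AUC (average precision) = {ap:.3f}")
```

Both metrics ignore true negatives, which is why they are often preferred over accuracy or ROC AUC when one class dominates the data.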

Keywords

Binary Logit Model, Direct Compensation Property Damage, Generalized Linear Mixed Model, Third-Party Liability Insurance, Unbalanced Panel Data

There are 36 citations in total.

Details

Primary Language	English
Subjects	Statistical Analysis
Journal Section	Articles
Authors	Pervin Baylan (ORCID 0000-0003-2660-3814), Neslihan Demirel (ORCID 0000-0002-5394-4721)
Publication Date	June 30, 2024
Submission Date	November 29, 2023
Acceptance Date	January 17, 2024
Published in Issue	Year 2024 Volume: 6 Issue: 1

Cite

APA	Baylan, P., & Demirel, N. (2024). Quantifying the Impact of Risk Factors on Direct Compensation Property Damage in Canadian Automobile Insurance. Nicel Bilimler Dergisi, 6(1), 103-127. https://doi.org/10.51541/nicel.1397941
