Research Article

Factors Associated with Match Result and Number of Goals Scored and Conceded in the English Premier League

Year 2022, Volume: 11 Issue: 1, 227 - 236, 24.03.2022
https://doi.org/10.17798/bitlisfen.1015215

Abstract

The aim of this research is to identify the factors associated with the match result and the number of goals scored and conceded in the English Premier League. The data consist of 17 performance indicators and situational variables for the matches of the 2017-18 English Premier League season. A Poisson regression model was used to identify the significant factors in the number of goals scored and conceded, while multinomial logistic regression and support vector machine methods were used to determine the factors that influence the match result. Scoring first, shots on target and goals conceded were found to have a significant influence on the number of goals scored, whereas scoring first, match location, quality of opponent, goals conceded, shots and clearances influence the number of goals conceded. Scoring first, match location, shots, shots on target, clearances and quality of opponent significantly affect the probability of losing, while scoring first, match location, shots, shots on target and possession affect the probability of winning. Among all the variables studied, scoring first is the only one that appears important in every analysis, making it the most significant factor for success in football.
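As a rough illustration of the Poisson regression step named in the abstract, the sketch below fits the number of goals scored against a single binary predictor (scoring first) by Newton-Raphson. It is not the authors' implementation: the function name `fit_poisson` is hypothetical, the eight matches are made-up numbers rather than Premier League data, and only one of the 17 variables is modelled.

```python
import math

def fit_poisson(xs, ys, iters=25):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    # Start from the intercept-only fit so the first Newton step is well scaled.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iters):
        # Gradient (g) and negative Hessian (h) of the Poisson log-likelihood.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            mu = math.exp(b0 + b1 * x)  # expected goals for this match
            g0 += y - mu
            g1 += (y - mu) * x
            h00 += mu
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        # Newton update beta += H^{-1} g, with the 2x2 inverse written out.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up example: 1 = team scored first, 0 = it did not.
scored_first = [1, 1, 1, 1, 0, 0, 0, 0]
goals = [2, 3, 1, 2, 0, 1, 1, 0]

b0, b1 = fit_poisson(scored_first, goals)
rate_ratio = math.exp(b1)  # multiplicative effect of scoring first on expected goals
print(round(rate_ratio, 3))
```

With a single binary predictor the fitted rate ratio equals the ratio of the two group means (here 2.0 / 0.5), which gives a simple check on the fit. A model with all 17 variables would normally be fitted with a statistics library rather than by hand.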



Details

Primary Language English
Subjects Engineering
Journal Section Research Article
Authors

Günal Bilek 0000-0001-6417-7129

Betul Aygun 0000-0001-9610-9235

Publication Date March 24, 2022
Submission Date October 26, 2021
Acceptance Date February 10, 2022
Published in Issue Year 2022 Volume: 11 Issue: 1

Cite

IEEE G. Bilek and B. Aygun, “Factors Associated with Match Result and Number of Goals Scored and Conceded in the English Premier League”, Bitlis Eren Üniversitesi Fen Bilimleri Dergisi, vol. 11, no. 1, pp. 227–236, 2022, doi: 10.17798/bitlisfen.1015215.

Bitlis Eren University
Journal of Science Editor
Bitlis Eren University Graduate Institute
Bes Minare Mah. Ahmet Eren Bulvari, Merkez Kampus, 13000 BITLIS