Predicting bike-sharing demand with gradient boosting methods
Year 2023, Volume: 29 Issue: 8, 824-832, 31.12.2023
Zeliha Ergül Aydın, Banu İçmen Erdem, Zeynep Idil Erzurum Cıcek
Abstract
The growing popularity of bike-sharing programs has increased the need for accurate demand prediction. This study investigates gradient boosting methods for forecasting bike-sharing demand and proposes a prediction approach based on three gradient boosting algorithms: XGBoost, LightGBM, and CatBoost. The approach is applied to two real-world datasets, one from Konya and the other from Washington, D.C. Both datasets include weather conditions and characteristics of each day. Training a gradient boosting model on historical data yields highly accurate predictions of future bike-sharing demand. When all gradient boosting models are trained with their best hyperparameter sets, CatBoost outperforms XGBoost and LightGBM.
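To make the core idea concrete, the following is a minimal from-scratch sketch of gradient boosting for regression, written in plain Python. It is not the authors' pipeline and does not use XGBoost, LightGBM, or CatBoost, which add features such as regularized objectives, histogram-based splits, and ordered boosting. The data are synthetic, and the feature names (`hour`, `temp`), the demand formula, and all helper functions are assumptions made purely for illustration.

```python
import random

def make_data(n, seed):
    """Synthetic hourly demand (an assumption, not the Konya or Washington data)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        hour = rng.randrange(24)          # hour of day, 0..23
        temp = rng.randrange(0, 31)       # temperature in whole degrees
        rush = 40 if hour in (7, 8, 9, 17, 18, 19) else 0
        xs.append([hour, temp])
        ys.append(20 + 3 * temp + rush + rng.gauss(0, 5))
    return xs, ys

def fit_stump(xs, residuals):
    """Depth-1 regression tree: pick the split with the lowest squared error."""
    best = None
    for j in range(len(xs[0])):
        values = sorted(set(row[j] for row in xs))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(xs, residuals) if row[j] <= t]
            right = [r for row, r in zip(xs, residuals) if row[j] > t]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lmean, rmean)
    return best[1:]

def fit_gbm(xs, ys, n_rounds=100, lr=0.1):
    """Each round fits a stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        j, t, lval, rval = fit_stump(xs, residuals)
        stumps.append((j, t, lval, rval))
        preds = [p + lr * (lval if row[j] <= t else rval)
                 for row, p in zip(xs, preds)]
    return base, lr, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + sum(lr * (lval if row[j] <= t else rval)
                      for j, t, lval, rval in stumps)

def rmse(model, xs, ys):
    errors = [(predict(model, r) - y) ** 2 for r, y in zip(xs, ys)]
    return (sum(errors) / len(errors)) ** 0.5

train_x, train_y = make_data(300, seed=0)
test_x, test_y = make_data(100, seed=1)

model = fit_gbm(train_x, train_y)
# Baseline: always predict the training mean (an ensemble with no stumps).
baseline = (sum(train_y) / len(train_y), 0.1, [])

print("baseline RMSE:", round(rmse(baseline, test_x, test_y), 2))
print("boosted RMSE: ", round(rmse(model, test_x, test_y), 2))
```

The number of rounds and the learning rate stand in for the hyperparameters a study like this one would tune. In practice a library implementation with a proper validation scheme would replace this sketch.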