A C4.5 – CART DECISION TREE MODEL FOR REAL ESTATE PRICE PREDICTION AND THE ANALYSIS OF THE UNDERLYING FEATURES
Year 2022,
Volume: 10 Issue: 1, 147 - 161, 01.03.2022
Sait Yücebaş, Melike Doğan, Levent Genç
Abstract
Machine learning approaches are used for price prediction in many domains, and real estate price prediction has come to the fore in recent years. However, most studies focus on prediction performance and often ignore the factors that affect the price. In this study, a C4.5 – CART model is developed to predict residential real estate prices. The model can predict both numeric and categorical prices for real estate properties. In addition, the factors affecting the price are revealed and analyzed in detail. The performance of the developed model is compared with that of the Direct Capitalization model, which is used as a gold standard in the domain. Both models are tested on a dataset of up-to-date, real-time data gathered by a web scraper. For numeric prediction, the RMSE of the developed model is 13.169, against 358.69 for the Direct Capitalization model. KAPPA and accuracy are used to evaluate the categorical prediction. The model achieves 81% KAPPA and 88% accuracy.
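The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the sample values are hypothetical, and the Direct Capitalization helper assumes the textbook form (value = net operating income / capitalization rate), which may differ from the exact formulation used in the study.

```python
import math
from collections import Counter


def rmse(actual, predicted):
    """Root mean square error between two equal-length numeric sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(actual, predicted):
    """Fraction of items whose predicted class equals the actual class."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(actual)
    p_observed = accuracy(actual, predicted)
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: sum over classes of the product of marginal rates.
    p_expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if p_expected == 1.0:
        # Kappa is undefined when both sides use a single identical class;
        # returning 1.0 is a choice made only for this sketch.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


def direct_capitalization_value(net_operating_income, cap_rate):
    """Textbook direct capitalization (assumed form): value = NOI / cap rate."""
    if cap_rate <= 0:
        raise ValueError("cap_rate must be positive")
    return net_operating_income / cap_rate


if __name__ == "__main__":
    # Hypothetical price categories, for illustration only.
    y_true = ["low", "mid", "high", "high"]
    y_pred = ["low", "mid", "high", "mid"]
    print("accuracy:", accuracy(y_true, y_pred))
    print("kappa:", cohen_kappa(y_true, y_pred))
    print("rmse:", rmse([100.0, 200.0], [110.0, 190.0]))
    print("value:", direct_capitalization_value(12000.0, 0.05))
```

Kappa is reported alongside accuracy because accuracy alone can look high when one price category dominates the data; kappa discounts the agreement that would be expected by chance.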
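Turkish title: Gayrimenkul Fiyat Tahmini ve Alttaki Özelliklerin Analizi İçin C4.5 – CART Karar Ağacı Modeli (A C4.5 – CART Decision Tree Model for Real Estate Price Prediction and the Analysis of the Underlying Features)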
Abstract (translated from the Turkish)
Machine learning applications are used for price prediction in different domains. In real estate, price prediction has come to the fore in recent years. However, most studies focus on prediction performance, and the examination of the factors affecting the price has been neglected. In this study, a C4.5 – CART tree model is developed for real estate price prediction. The model can make both numeric and categorical price predictions. In addition, the factors affecting the price are identified through detailed analysis. The performance of the model is compared with the Direct Capitalization model, a gold standard in the field. Both models are tested on up-to-date, real-time datasets obtained by a web scraper. For numeric prediction, the root mean square error of the developed model is 13.169, while it is 359.69 for Direct Capitalization. Accuracy and KAPPA metrics are used for categorical prediction. The model has a KAPPA value of 81% and an accuracy of 88%.