Comparative Analysis of Deep Reinforcement Learning Based Autonomous Path Planning Approaches in Dynamic Environments

Ziya Tan; Mehmet Karaköse

doi:10.54365/adyumbd.1025545

Research Article

Year 2022, Volume 9, Issue 16, pp. 248-262, 14.04.2022

Abstract

Reinforcement learning is a method by which a system that perceives its environment and makes its own decisions learns to make the right decisions for solving the problem at hand. This article proposes a deep reinforcement learning based algorithm with which a robot learns to move autonomously within a specified area, in an environment containing moving obstacles (pedestrians), without colliding with them. In the simulator environment built for this purpose, the deep learning algorithms Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Recurrent Neural Network (RNN) were each used separately, and their performance was tested and reported. The article makes three main contributions to the literature: first, the development of an effective autonomous robot algorithm; second, the identification of the deep learning algorithm that can be adapted to the problem; and third, a generalized deep reinforcement learning approach that lets an autonomous robot move in crowded environments with moving obstacles. To validate the developed approaches, each deep reinforcement learning algorithm was simulated and trained separately. According to the training results, the LSTM algorithm was found to be more successful than the others.
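As an illustration of the learning loop the abstract describes, the following is a minimal sketch of a single tabular Q-learning update with a collision-aware reward. It is not the authors' method: the article trains deep networks (CNN, LSTM, RNN), whereas this sketch swaps in a plain lookup table to keep the idea visible. The reward values, the action set, and the names `reward` and `q_update` are assumptions made for this example and do not come from the article.

```python
from collections import defaultdict

# Hypothetical reward values; the abstract does not specify them.
GOAL_REWARD = 1.0
COLLISION_PENALTY = -1.0
STEP_COST = -0.01

# Hypothetical discrete action set for a grid-like simulator.
ACTIONS = ("up", "down", "left", "right", "stay")


def reward(robot, goal, pedestrians):
    """Score one step: penalize hitting a pedestrian, reward reaching the goal."""
    if robot in pedestrians:
        return COLLISION_PENALTY
    if robot == goal:
        return GOAL_REWARD
    return STEP_COST


def q_update(q, state, action, r, next_state, done, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # A terminal step (goal reached or collision) has no future value.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    target = r + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Unseen (state, action) pairs start at 0.0.
q = defaultdict(float)
```

In a deep variant, the table `q` would be replaced by a neural network that maps an observation of the robot and nearby pedestrians to one value per action; the target computed inside `q_update` would then serve as the regression target for that network.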

Keywords

Deep reinforcement learning, Deep learning, Autonomous path planning, LSTM, RNN

There are 45 citations in total.

Details

Primary Language	Turkish
Subjects	Engineering
Journal Section	Research Article
Authors	Ziya Tan 0000-0003-2813-5882; Mehmet Karaköse 0000-0002-3276-3788
Publication Date	April 14, 2022
Submission Date	November 18, 2021
Published in Issue	Year 2022 Volume: 9 Issue: 16

Cite

APA	Tan, Z., & Karaköse, M. (2022). Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, 9(16), 248-262. https://doi.org/10.54365/adyumbd.1025545
AMA	Tan Z, Karaköse M. Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. April 2022;9(16):248-262. doi:10.54365/adyumbd.1025545
Chicago	Tan, Ziya, and Mehmet Karaköse. “Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9, no. 16 (April 2022): 248-62. https://doi.org/10.54365/adyumbd.1025545.
EndNote	Tan Z, Karaköse M (April 1, 2022) Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9 16 248–262.
IEEE	Z. Tan and M. Karaköse, “Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz”, Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, vol. 9, no. 16, pp. 248–262, 2022, doi: 10.54365/adyumbd.1025545.
ISNAD	Tan, Ziya - Karaköse, Mehmet. “Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9/16 (April 2022), 248-262. https://doi.org/10.54365/adyumbd.1025545.
JAMA	Tan Z, Karaköse M. Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. 2022;9:248–262.
MLA	Tan, Ziya and Mehmet Karaköse. “Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, vol. 9, no. 16, 2022, pp. 248-62, doi:10.54365/adyumbd.1025545.
Vancouver	Tan Z, Karaköse M. Dinamik Ortamlarda Derin Takviyeli Öğrenme Tabanlı Otonom Yol Planlama Yaklaşımları için Karşılaştırmalı Analiz. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. 2022;9(16):248-62.
