Research Article

Q-Learning Based Obstacle Avoidance Data Harvesting Model Using UAV and UGV

Year 2023, Volume: 13 Issue: 1, 54 - 60, 30.06.2023
https://doi.org/10.36222/ejt.1237590

Abstract

The Internet of Things (IoT) has transformed daily life by bringing convenience to many of its aspects. For an IoT environment to function well, however, data must be collected from IoT devices regularly, since timely collection enables more accurate evaluations and insights. Energy conservation is another crucial consideration during data collection, as it strongly affects the sustainability of the IoT ecosystem. To this end, Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) are increasingly used as data collectors. In this study, we investigate how UAVs and UGVs can collect data from IoT devices effectively and efficiently in an environment containing obstacles. To address this challenge, we propose a Q-learning-based Obstacle Avoidance Data Harvesting (QOA-DH) method that uses reinforcement learning to decide how data should be collected. We also compare the performance of UAVs and UGVs under the restrictions and assumptions specific to each type of vehicle. This research aims to improve the overall efficiency and effectiveness of data collection in IoT environments and to pave the way for sustainable IoT solutions.
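
For readers unfamiliar with the underlying technique, the sketch below illustrates how tabular Q-learning can drive obstacle-avoiding data collection on a small grid. It is an illustrative toy example only, not the QOA-DH implementation from the paper: the grid layout, reward values, and hyperparameters are assumptions chosen purely for demonstration.

import random
from collections import defaultdict

# Illustrative toy example: a tabular Q-learning agent that learns to visit
# IoT "device" cells on a small grid while avoiding obstacle cells.
# Grid layout, rewards, and hyperparameters are assumptions for demonstration.

GRID = 5
OBSTACLES = {(1, 1), (2, 3), (3, 1)}          # cells the vehicle must avoid
DEVICES = frozenset({(4, 4), (0, 4)})         # cells holding IoT data
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1         # learning rate, discount, exploration

Q = defaultdict(float)                        # Q[(state, remaining, action)] -> value

def step(state, action, remaining):
    """Apply one move; return next state, reward, remaining devices, done flag."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < GRID and 0 <= nxt[1] < GRID) or nxt in OBSTACLES:
        return state, -10.0, remaining, False          # blocked: penalty, stay in place
    if nxt in remaining:
        remaining = remaining - {nxt}                  # data harvested at this cell
        return nxt, 20.0, remaining, len(remaining) == 0
    return nxt, -1.0, remaining, False                 # ordinary move: small energy cost

for episode in range(2000):
    state, remaining, done = (0, 0), DEVICES, False
    while not done:
        if random.random() < EPSILON:                  # epsilon-greedy exploration
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[(state, remaining, i)])
        nxt, reward, nxt_rem, done = step(state, ACTIONS[a], remaining)
        best_next = max(Q[(nxt, nxt_rem, i)] for i in range(len(ACTIONS)))
        # Q-learning update: Q <- Q + alpha * (r + gamma * max_a' Q' - Q)
        Q[(state, remaining, a)] += ALPHA * (reward + GAMMA * best_next
                                             - Q[(state, remaining, a)])
        state, remaining = nxt, nxt_rem

print("Q-values at start state:",
      [round(Q[((0, 0), DEVICES, i)], 2) for i in range(len(ACTIONS))])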

References

  • [1] Z. Qin, X. Zhang, X. Zhang, B. Lu, Z. Liu, and L. Guo, “The UAV trajectory optimization for data collection from time-constrained IoT devices: A hierarchical deep Q-network approach,” Applied Sciences, vol. 12, no. 5, pp. 2546, 2022.
  • [2] P. S. Bithas, E. T. Michailidis, N. Nomikos, D. Vouyioukas, and A. G. Kanatas, “A survey on machine-learning techniques for UAV-based communications,” Sensors, vol. 19, no. 23, pp. 5170, 2019.
  • [3] H. Bayerlein, M. Theile, M. Caccamo, and D. Gesbert, “Multi-UAV path planning for wireless data harvesting with deep reinforcement learning,” IEEE Open Journal of the Communications Society, vol. 2, pp. 1171–1187, 2021.
  • [4] Y. Yao, Z. Zhu, S. Huang, X. Yue, C. Pan, and X. Li, “Energy efficiency characterization in heterogeneous IoT system with UAV swarms based on wireless power transfer,” IEEE Access, vol. 8, pp. 967–979, 2020.
  • [5] Z. Wang and J. Cai, “Probabilistic roadmap method for path-planning in radioactive environment of nuclear facilities,” Progress in Nuclear Energy, vol. 109, pp. 113–120, 2018.
  • [6] A. Upadhyay, K. R. Shrimali, and A. Shukla, “UAV-robot relationship for coordination of robots on a collision free path,” Procedia Computer Science, vol. 133, pp. 424–431, 2018.
  • [7] F. Yan, Y.-S. Liu, and J.-Z. Xiao, “Path planning in complex 3D environments using a probabilistic roadmap method,” International Journal of Automation and Computing, vol. 10, no. 6, pp. 525–533, 2016.
  • [8] S. Jain, R. C. Shah, W. Brunette, G. Borriello, and S. Roy, “Exploiting mobility for energy efficient data collection in wireless sensor networks,” Mobile Networks and Applications, vol. 11, no. 3, pp. 327–339, 2006.
  • [9] C.S. Choi and F. Baccelli, “Spatial and temporal analysis of direct communications from static devices to mobile vehicles,” IEEE Transactions on Wireless Communications, vol. 18, no. 11, pp. 5128–5140, 2019.
  • [10] R. Marini, L. Spampinato, S. Mignardi, R. Verdone, and C. Buratti, “Reinforcement learning-based trajectory planning for UAV-aided vehicular communications,” 2022 30th European Signal Processing Conference (EUSIPCO), IEEE, pp. 967–971, 2022.
  • [11] A. Kaplan, N. Kingry, P. Uhing, and R. Dai, “Time-optimal path planning with power schedules for a solar-powered ground robot,” IEEE Transactions on Automation Science and Engineering, vol. 14, no. 2, pp. 1235–1244, 2017.
  • [12] M. R. Jabbarpour, H. Zarrabi, J. J. Jung, and P. Kim, “A green ant-based method for path planning of unmanned ground vehicles,” IEEE Access, vol. 5, pp. 1820–1832, 2017.
  • [13] E. Almoaili and H. Kurdi, “Path planning algorithm for unmanned ground vehicles (UGVs) in known static environments,” Procedia Computer Science, vol. 177, pp. 57–63, 2020.
  • [14] Y.-C. Wang and K.-C. Chen, “Efficient path planning for a mobile sink to reliably gather data from sensors with diverse sensing rates and limited buffers,” IEEE Transactions on Mobile Computing, vol. 18, no. 7, pp. 1527–1540, 2018.
  • [15] D. Liu, S. Wang, Z. Wen, L. Cheng, M. Wen, and Y.-C. Wu, “Edge learning with unmanned ground vehicle: Joint path, energy, and sample size planning,” IEEE Internet of Things Journal, vol. 8, no. 4, pp. 2959–2975, 2020.
  • [16] A. T. Azar, A. Koubaa, N. Ali Mohamed, H. A. Ibrahim, Z. F. Ibrahim, M. Kazim, A. Ammar, B. Benjdira, A. M. Khamis, I. A. Hameed, et al., “Drone deep reinforcement learning: A review,” Electronics, vol. 10, no. 9, pp. 999, 2021.
  • [17] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 26–38, 2017.
  • [18] C. J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. thesis, King’s College, Cambridge, May 1989.
  • [19] M. Kusy and R. Zajdel, “Stateless Q-Learning algorithm for training of radial basis function based neural networks in medical data classification,” Advances in Intelligent Systems and Computing, vol. 230, pp. 267–278, 2014.
  • [20] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8, pp. 279–292, 1992.
  • [21] S. Bhatnagar, R. S. Sutton, M. Ghavamzadeh, and M. Lee, “Incremental Natural Actor-Critic Algorithms,” NeurIPS, pp. 729–736, 2007.
  • [22] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
  • [23] Y. Hu, D. Li, Y. He, and J. Han, "Incremental Learning Framework for Autonomous Robots Based on Q-Learning and the Adaptive Kernel Linear Model," in IEEE Transactions on Cognitive and Developmental Systems, vol. 14, no. 1, pp. 64-74, March 2022.
  • [24] V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
  • [25] J. Zhu, Y. Song, D. Jiang and H. Song, "A New Deep-Q-Learning-Based Transmission Scheduling Mechanism for the Cognitive Internet of Things," in IEEE Internet of Things Journal, vol. 5, no. 4, pp. 2375-2385, Aug. 2018.
  • [26] X. Zhou, W. Liang, K. I. -K. Wang, H. Wang, L. T. Yang and Q. Jin, "Deep-Learning-Enhanced Human Activity Recognition for Internet of Healthcare Things," in IEEE Internet of Things Journal, vol. 7, no. 7, pp. 6429-6438, July 2020.
  • [27] A. Weissensteiner, "A Q-Learning Approach to Derive Optimal Consumption and Investment Strategies," in IEEE Transactions on Neural Networks, vol. 20, no. 8, pp. 1234–1243, Aug. 2009.
  • [28] M. Ye, C. Tianqing, and F. Wenhui, "A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning," in Journal of Systems Engineering and Electronics, vol. 32, no. 3, pp. 642–657, June 2021.
  • [29] Z. Xiaochuan, W. Wanwan, L. Qin, W. Tianao, and S. Hao, "The Design and Realization of Dynamic Evaluation Strategy of Pieces in Military Chess Game System," 2019 Chinese Control and Decision Conference (CCDC), pp. 6287–6292, Nanchang, China, 2019.
  • [30] T. N. Larsen, H. Ø. Teigen, T. Laache, D. Varagnolo, and A. Rasheed, “Comparing deep reinforcement learning algorithms’ ability to safely navigate challenging waters,” Frontiers in Robotics and AI, vol. 8, 2021.

Details

Primary Language English
Subjects Computer Software
Journal Section Research Article
Authors

Erdal Akin 0000-0002-2223-3927

Yakup Şahin 0000-0001-6792-2550

Early Pub Date July 6, 2023
Publication Date June 30, 2023
Published in Issue Year 2023

Cite

APA Akin, E., & Şahin, Y. (2023). Q-Learning Based Obstacle Avoidance Data Harvesting Model Using UAV and UGV. European Journal of Technique (EJT), 13(1), 54-60. https://doi.org/10.36222/ejt.1237590

All articles published by EJT are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit and adapt the work, provided the original work and source are appropriately cited.