Trajectory Design for Multi-UAV Aided Communication with Actor-critic-based Reinforcement Learning

CHEN Ze-Chao; GUO Yi-Jun


Sponsored by the Center for Science and Technology Development of the Ministry of Education
Supervised by Ministry of Education of the People's Republic of China

CHEN Ze-Chao, GUO Yi-Jun*

School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876

*Corresponding author

Funding: none

Opened online: 8 March 2021

Accepted by: none

Citation: CHEN Ze-Chao, GUO Yi-Jun. Trajectory Design for Multi-UAV Aided Communication with Actor-critic-based Reinforcement Learning [OL]. [8 March 2021]. http://en.paper.edu.cn/en_releasepaper/content/4753773

This paper investigates trajectory design for wireless communications aided by multiple unmanned aerial vehicles (UAVs). It proposes a multi-UAV trajectory design method, multi-agent twin delayed deep deterministic policy gradient (MA-TD3), which designs continuous trajectories without prior knowledge of global information such as user locations and channel conditions. The method integrates the multi-agent deep deterministic policy gradient (MADDPG) algorithm with the twin delayed deep deterministic policy gradient (TD3) algorithm within the actor-critic reinforcement learning (RL) framework. Specifically, the multi-UAV trajectory design problem is first formulated as a stochastic game (SG) whose objective is to maximize the completion rate of the transmission tasks. The MA-TD3 method, built on the actor-critic RL framework, is then used to obtain the learned trajectories. Numerical results show that, compared with traditional single-agent RL methods, the proposed MA-TD3 method achieves a higher completion rate of the transmission tasks by enabling cooperation among multiple UAVs through centralized training and distributed execution.

Keywords: Communication and Information System; trajectory design; multi-UAV aided communication; multi-agent reinforcement learning
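
The abstract does not give the update rule, so the following is only a minimal sketch of how a TD3-style critic target might look when the critics are centralized, as in MADDPG. It combines the two well-known TD3 ingredients (target policy smoothing and taking the minimum of twin critics) with per-UAV target actors that see only their own observations. All function names, the toy actors and critics, and the hyperparameter values are hypothetical and are not taken from the paper.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def smoothed_target_actions(target_actors, next_obs, noise_std=0.2,
                            noise_clip=0.5, act_limit=1.0, rng=random):
    """Target policy smoothing, applied per agent.

    Each target actor receives only its own next observation, which mirrors
    distributed execution. Clipped Gaussian noise is added to every action
    component, and the result is clipped to the action bounds.
    """
    actions = []
    for actor, obs in zip(target_actors, next_obs):
        noisy = []
        for a in actor(obs):
            eps = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
            noisy.append(clip(a + eps, -act_limit, act_limit))
        actions.append(noisy)
    return actions


def td3_target(reward, done, next_obs, target_actors, target_q1, target_q2,
               gamma=0.99, **smoothing):
    """Critic target y = r + gamma * (1 - done) * min(Q1', Q2').

    Both target critics are centralized: they take the observations and
    actions of all agents (assumed available during training only).
    """
    joint_actions = smoothed_target_actions(target_actors, next_obs, **smoothing)
    q_min = min(target_q1(next_obs, joint_actions),
                target_q2(next_obs, joint_actions))
    return reward + gamma * (1.0 - done) * q_min


# Toy usage with two UAVs, 2-D actions, and hand-written stand-ins for the
# target networks (hypothetical; real actors and critics would be neural nets).
actors = [lambda obs: [0.5 * obs[0], 0.0], lambda obs: [0.0, -obs[1]]]
q1 = lambda obs, acts: 2.0 + sum(sum(a) for a in acts)
q2 = lambda obs, acts: 3.0 + sum(sum(a) for a in acts)
next_obs = [[1.0, 2.0], [0.5, 0.5]]

# noise_std=0.0 disables smoothing so the toy result is deterministic.
y = td3_target(1.0, 0.0, next_obs, actors, q1, q2, gamma=0.9, noise_std=0.0)
print(y)
```

In a full TD3-style training loop, each agent's actor and the target networks would additionally be updated less often than the critics (the "delayed" part of TD3); that schedule is omitted here for brevity.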
