Design of Control Law for Quadrotor UAV Based on Reinforcement Learning

2021, Vol. 29, No. 2, pp. 71-75
Affiliation: School of Automation, Northwestern Polytechnical University
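Funding: Supported by the Aeronautical Science Foundation of China (201905053003) and the Shaanxi Province Key Laboratory of Flight Control and Simulation Technology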

Abstract:

Most quadrotor UAVs currently rely on classical control methods for control-law design. These methods have two persistent difficulties: the control parameters must be selected, and the design depends on a mathematical model of the controlled plant. To address this, a UAV control-law design method based on the deep reinforcement learning algorithm Deep Q Network (DQN) is adopted. The quadrotor attitude angles and attitude angular rates are the inputs to a three-layer neural network, which outputs the action-value function; actions are then selected according to a greedy policy. Through continual interaction with the environment, the agent uses reward and penalty signals to update the network weights, so that it selects actions that maximize the cumulative return. Simulation results show that, after reinforcement learning training, the quadrotor attitude angles track changes in the reference command quickly and accurately. This demonstrates the feasibility of a reinforcement-learning-based quadrotor control law, which avoids the dependence of traditional control methods on parameter selection and on a plant model.
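The pipeline described in the abstract (attitude state in, action values out, greedy selection, reward-driven update) can be illustrated with a minimal sketch. This is not the authors' implementation. The layer sizes, the number of discrete actions, the discount factor, the sample state, and the reward value are all hypothetical, and "three-layer" is read here as input, hidden, and output layers. The sketch uses epsilon-greedy selection, which reduces to the greedy policy when `epsilon` is 0, and it computes only the standard one-step DQN target; the gradient step that fits the network to that target is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights (n_out x n_in) and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(weights, biases, x, relu):
    """Affine map followed by an optional ReLU."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def q_values(params, state):
    """Forward pass: input -> hidden (ReLU) -> one Q-value per action."""
    (w1, b1), (w2, b2) = params
    return dense(w2, b2, dense(w1, b1, state, relu=True), relu=False)


def select_action(q, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else take argmax Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])


def td_target(reward, next_q, gamma, done):
    """One-step DQN target: r + gamma * max Q(s', a'), or r at episode end."""
    return reward if done else reward + gamma * max(next_q)


rng = random.Random(0)
# Hypothetical sizes: 3 attitude angles + 3 angular rates, 16 hidden units,
# 5 discrete control actions. None of these come from the paper.
N_STATE, N_HIDDEN, N_ACTION = 6, 16, 5
params = (init_layer(N_STATE, N_HIDDEN, rng),
          init_layer(N_HIDDEN, N_ACTION, rng))

# Assumed state layout: roll, pitch, yaw [rad], then their rates [rad/s].
state = [0.10, -0.05, 0.00, 0.0, 0.0, 0.0]
q = q_values(params, state)
action = select_action(q, epsilon=0.0, rng=rng)  # epsilon=0 is purely greedy
# Hypothetical reward; the same Q-values stand in for the next state's values.
target = td_target(reward=-0.0125, next_q=q, gamma=0.9, done=False)
```

In a full training loop, the weights would be adjusted so that the Q-value of the chosen action moves toward `target`, which is what drives the agent toward actions with higher cumulative return.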


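Cite this article

Liang Chen, Liu Xiaoxiong, Zhang Xingwang, Huang Jianxiong. Design of Control Law for Quadrotor UAV Based on Reinforcement Learning[J]. Computer Measurement & Control (计算机测量与控制), 2021, 29(2): 71-75.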


History

Received: 2020-06-13
Last revised: 2020-07-08
Accepted: 2020-07-08
Published online: 2021-02-08
