Research on Intelligent Control of Biped Robot Based on D-DQN Reinforcement Learning Algorithm

Funding: 2022 Higher Education Teaching Reform Project of Guangzhou Huashang College (HS2022ZLGC71)


Abstract:

To address the large trajectory deviation and low efficiency of existing intelligent control algorithms for biped robots, a control algorithm based on D-DQN reinforcement learning is proposed. First, the coordinate transformations involved in biped robot motion and the compensation process for joints and links are analyzed. A Q-value network is then used to reduce the dimensionality of the complex nonlinear motion process. A dual-network weight design, combining Q-value network weights with auxiliary weights, further strengthens the DQN, and the Tanh function serves as the neural network's activation function to improve the DQN's numerical training capability. The experience replay pool plays a key supporting role in data training and interaction, and feeding the reward value into the objective function further improves the control accuracy of the biped robot. Finally, virtual constraint control is used to improve the robot's stability during motion. Experimental results show that, under the D-DQN control algorithm, the robot completes the first-stage test in only 115 s with a combined trajectory deviation of 0.02 m, and it shows good stability in the gait-switching limit cycle test.
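The abstract names three building blocks: a dual set of network weights, a Tanh activation, and an experience replay pool. The following is a minimal sketch of how these pieces commonly fit together, assuming that "D-DQN" corresponds to the standard Double DQN formulation and that the "auxiliary weights" play the role of a target network. The abstract does not confirm either assumption. All names (`ReplayPool`, `q_values`, `double_dqn_target`), the network shape, and the discount factor are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import deque


class ReplayPool:
    """Fixed-size experience replay pool (hypothetical helper, not from the paper)."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_values(weights, state):
    """Tiny one-hidden-layer Q-network with a Tanh activation.

    weights is a pair (w1, w2): w1 has shape hidden x state_dim,
    w2 has shape n_actions x hidden. Biases are omitted for brevity.
    """
    w1, w2 = weights
    hidden = [math.tanh(sum(w * s for w, s in zip(row, state))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def double_dqn_target(online_w, aux_w, reward, next_state, done, gamma=0.99):
    """Training target under the assumed Double DQN rule.

    The online weights select the next action; the auxiliary weights
    evaluate it. Decoupling the two reduces Q-value overestimation.
    """
    if done:
        return reward
    q_online = q_values(online_w, next_state)
    best_action = max(range(len(q_online)), key=q_online.__getitem__)
    q_aux = q_values(aux_w, next_state)
    return reward + gamma * q_aux[best_action]
```

In a training loop, transitions sampled from the pool would each be passed to `double_dqn_target`, and the online weights would be fitted toward the resulting targets while the auxiliary weights are refreshed from the online ones at intervals. That update schedule is likewise an assumption, since the abstract does not describe it.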


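Cite this article

Li Lixia, Chen Yan. Research on Intelligent Control of Biped Robot Based on D-DQN Reinforcement Learning Algorithm[J]. Computer Measurement & Control, 2024, 32(3): 181-187.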

History

Received: 2023-08-22
Last revised: 2023-09-08
Accepted: 2023-09-11
Published online: 2024-04-01
