A Dynamic Planning Method for Mobile Robot Based on Adaptive State Aggregating Q-Learning

2014, Vol. 22, No. 10, pp. 3419-3422

Authors: Wang Hui, Song Changtong
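Affiliations: (1. School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China; 2. Department of Electronics and Information, Zhenjiang College, Zhenjiang, Jiangsu 212000, China)
About the author: Wang Hui (born 1980), female, from Danyang, Jiangsu; lecturer and master's graduate student; her research focuses on virtual reality and artificial intelligence.
CLC number: TP393
Funding: Natural Science Research Program of Jiangsu Higher Education Institutions (03kjd520075).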

Abstract:

Existing path planning methods for mobile robots converge slowly and are hard to apply online. To address this, a dynamic path planning method named SQ(λ) is studied, which combines a state-aggregating SOM network with Q-learning with eligibility traces. First, an overall closed-loop planning model is designed that divides the system into a front end (state aggregation) and a back end (path planning). Next, an output layer is added to the traditional SOM to build a three-layer SOM network that aggregates the states of the mobile robot, and a training algorithm for this network is given. Finally, on the aggregated states, an improved Q-learning algorithm with eligibility traces and an adaptively varying exploration factor is proposed to obtain the optimal policy; the number of neurons in the front-end SOM output layer is increased or decreased adaptively according to the convergence speed of the improved Q-learning algorithm, which improves the convergence of the overall algorithm. Simulation experiments show that SQ(λ) effectively accomplishes path planning for the mobile robot and, compared with other algorithms, converges faster and finds better solutions.
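The back-end idea in the abstract, Q-learning with eligibility traces on aggregated states, can be illustrated with a small sketch. This is not the authors' SQ(λ): it uses Watkins's Q(λ) with replacing traces on a toy six-cell corridor, a fixed nearest-prototype lookup stands in for the trained three-layer SOM, a simple exponential decay stands in for the adaptive exploration factor, and the adaptive growth or pruning of SOM output neurons is not shown. All function names, parameters, and the environment are assumptions made for illustration.

```python
import random

def winner(obs, prototypes):
    """Index of the prototype closest to obs (stand-in for the winning SOM unit)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((o - w) ** 2 for o, w in zip(obs, prototypes[i])))

def greedy(q_row):
    """Greedy action index; ties resolve to the lowest index."""
    return max(range(len(q_row)), key=lambda a: q_row[a])

def observe(cell, n_cells):
    """Raw, off-centre position reading that the front end must aggregate."""
    return ((cell + 0.3) / n_cells,)

def step(cell, action, n_cells):
    """Toy corridor: action 0 moves left, action 1 moves right; walls clamp."""
    return min(max(cell + (-1, 1)[action], 0), n_cells - 1)

def train(n_cells=6, episodes=300, alpha=0.5, gamma=0.9, lam=0.8,
          eps_start=0.5, eps_min=0.05, eps_decay=0.98, seed=0):
    rng = random.Random(seed)
    # Assumed fixed prototypes at cell centres (a trained SOM would learn these).
    prototypes = [((i + 0.5) / n_cells,) for i in range(n_cells)]
    Q = [[0.0, 0.0] for _ in range(n_cells)]
    eps = eps_start

    def pick(s):
        # Epsilon-greedy action selection on the aggregated state.
        return rng.randrange(2) if rng.random() < eps else greedy(Q[s])

    for _ in range(episodes):
        E = [[0.0, 0.0] for _ in range(n_cells)]  # eligibility traces
        cell = 0
        s = winner(observe(cell, n_cells), prototypes)
        a = pick(s)
        for _ in range(200):
            cell = step(cell, a, n_cells)
            done = cell == n_cells - 1
            r = 1.0 if done else -0.01
            s2 = winner(observe(cell, n_cells), prototypes)
            a2 = pick(s2)
            a_star = greedy(Q[s2])
            if Q[s2][a2] == Q[s2][a_star]:
                a_star = a2  # treat a tied choice as greedy
            target = r if done else r + gamma * Q[s2][a_star]
            delta = target - Q[s][a]
            E[s][a] = 1.0  # replacing trace
            for i in range(n_cells):
                for j in range(2):
                    Q[i][j] += alpha * delta * E[i][j]
                    # Watkins's rule: cut traces after an exploratory action.
                    E[i][j] = gamma * lam * E[i][j] if a2 == a_star else 0.0
            if done:
                break
            s, a = s2, a2
        eps = max(eps_min, eps * eps_decay)  # assumed decay schedule
    return Q, prototypes

def greedy_path(Q, prototypes, max_steps=20):
    """Follow the learned greedy policy from cell 0 and return the visited cells."""
    n_cells = len(prototypes)
    cell, path = 0, [0]
    while cell != n_cells - 1 and len(path) <= max_steps:
        s = winner(observe(cell, n_cells), prototypes)
        cell = step(cell, greedy(Q[s]), n_cells)
        path.append(cell)
    return path
```

Calling `Q, prototypes = train()` and then `greedy_path(Q, prototypes)` returns the cells visited under the learned greedy policy. Because the traces pass each temporal-difference error back along the recently visited state-action pairs, the goal reward reaches early cells in fewer episodes than one-step Q-learning would need.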

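Cite this article

Wang Hui, Song Changtong. A Dynamic Planning Method for Mobile Robot Based on Adaptive State Aggregating Q-Learning[J]. Computer Measurement & Control, 2014, 22(10): 3419-3422.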

Published online: 2015-01-15