[1] TANG Zhong-yong, FU Qiang, ZHUO Jia, CHEN Huan-wen. A Class of Reinforcement Learning Algorithm Based on Heuristic Search [J]. Computer Technology and Development, 2006, (08): 41-43.

A Class of Reinforcement Learning Algorithm Based on Heuristic Search

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2006, No. 08
Pages:
41-43
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
1900-01-01

Article Information

Title:
A Class of Reinforcement Learning Algorithm Based on Heuristic Search
Article ID:
1673-629X(2006)08-0041-03
Author(s):
TANG Zhong-yong, FU Qiang, ZHUO Jia, CHEN Huan-wen
School of Computer and Communication Engineering, Changsha University of Science and Technology
Keywords:
heuristic search; reinforcement learning; heuristic SARSA (H-SARSA)
Classification number:
TP301.6
Document code:
A
Abstract:
Reinforcement learning has proved to be a viable new method in the control field. Compared with other methods, it handles unknown environments well, but it is still not an efficient method. Fortunately, in the real world an agent usually has some prior knowledge of its environment, which can serve as heuristic information. Heuristic search is a widely used search method that is very fast, but it needs precise heuristic information, which can be hard to obtain in complex environments. This paper analyzes and compares the characteristics of heuristic search and reinforcement learning and proposes a class of reinforcement learning algorithms based on heuristic search. Preliminary experimental results show comparatively good performance.
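
The abstract does not spell out how H-SARSA combines the two techniques, so the following is only an illustrative sketch of one common way to feed heuristic information into SARSA: seeding the Q-table from a heuristic instead of zeros. It is not claimed to be the authors' algorithm. The grid world, the wall layout, the Manhattan-distance heuristic, and every function name and parameter value below are assumptions made for this example.

```python
import random

# Hypothetical 5x5 grid world (not from the paper): reward -1 per step, episode ends at GOAL.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
SIZE = 5
START, GOAL = (0, 0), (4, 4)
WALLS = {(1, 1), (2, 1), (3, 3)}  # assumed obstacles that the heuristic ignores


def heuristic(state):
    """Manhattan distance to the goal: cheap prior knowledge, imprecise near walls."""
    return abs(GOAL[0] - state[0]) + abs(GOAL[1] - state[1])


def step(state, action):
    """Deterministic move; hitting a wall or the border leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) in WALLS:
        r, c = state
    return (r, c), -1.0, (r, c) == GOAL


def init_q():
    """Seed Q(s, a) with -(1 + h(s')) so the heuristic guides the early episodes."""
    q = {}
    for r in range(SIZE):
        for c in range(SIZE):
            for a in range(len(ACTIONS)):
                nxt, _, _ = step((r, c), ACTIONS[a])
                q[(r, c), a] = -(1.0 + heuristic(nxt))
    return q


def choose(q, state, epsilon, rng):
    """Epsilon-greedy action selection over the current Q-table."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q[state, a])


def train(episodes=300, alpha=0.5, gamma=1.0, epsilon=0.1, max_steps=200, seed=0):
    """Tabular SARSA: the update uses the action actually chosen in the next state."""
    rng = random.Random(seed)
    q = init_q()
    for _ in range(episodes):
        state = START
        action = choose(q, state, epsilon, rng)
        for _ in range(max_steps):
            nxt, reward, done = step(state, ACTIONS[action])
            nxt_action = choose(q, nxt, epsilon, rng)
            # No bootstrapping from the terminal state.
            target = reward if done else reward + gamma * q[nxt, nxt_action]
            q[state, action] += alpha * (target - q[state, action])
            if done:
                break
            state, action = nxt, nxt_action
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START for at most max_steps moves."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state, a])
        state, _, done = step(state, ACTIONS[action])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the states visited by the greedy policy after training. In this sketch the heuristic only sets the starting values, so learning can still correct it where the walls make the distance estimate wrong, which matches the abstract's point that precise heuristic information is not always available.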

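Memo

Memo:
TANG Zhong-yong (born 1977), male, from Hengyang, Hunan; master's student; research interest: reinforcement learning. CHEN Huan-wen, Ph.D., professor; research interests: reinforcement learning, artificial intelligence, etc.
Last Update: 1900-01-01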