相似文献/References:
[1]冯林 李琛 孙焘.Robocup半场防守中的一种强化学习算法[J].计算机技术与发展,2008,(01):59.
FENG Lin,LI Chen,SUN Tao.A Reinforcement Learning Method for Robocup Soccer Half Field Defense[J].,2008,(05):59.
[2]汤萍萍 王红兵.基于强化学习的Web服务组合[J].计算机技术与发展,2008,(03):142.
TANG Ping-ping,WANG Hong-bing.Web Service Composition Based on Reinforcement -Learning[J].,2008,(05):142.
[3]王朝晖 孙惠萍.图像检索中IRRL模型研究[J].计算机技术与发展,2008,(12):35.
WANG Zhao-hui,SUN Hui-ping.Research of IRRL Model in Image Retrieval[J].,2008,(05):35.
[4]林联明 王浩 王一雄.基于神经网络的Sarsa强化学习算法[J].计算机技术与发展,2006,(01):30.
LIN Lian-ming,WANG Hao,WANG Yi-xiong.Sarsa Reinforcement Learning Algorithm Based on Neural Networks[J].,2006,(05):30.
[5]农汉琦,孙蕴琪,黄 洁,等.基于机器学习的认知无线网络优化策略[J].计算机技术与发展,2020,30(05):125.[doi:10. 3969 / j. issn. 1673-629X. 2020. 05. 024]
NONG Han-qi,SUN Yun-qi,HUANG Jie,et al.Optimization Strategy of Cognitive Radio Network Based on Machine Learning[J].,2020,30(05):125.[doi:10. 3969 / j. issn. 1673-629X. 2020. 05. 024]
[6]雷 莹,许道云.一种合作 Markov 决策系统[J].计算机技术与发展,2020,30(12):8.[doi:10. 3969 / j. issn. 1673-629X. 2020. 12. 002]
LEI Ying,XU Dao-yun.A Cooperation Markov Decision Process System[J].,2020,30(05):8.[doi:10. 3969 / j. issn. 1673-629X. 2020. 12. 002]
[7]彭云建,梁 进.基于探索-利用权衡优化的 Q 学习路径规划[J].计算机技术与发展,2022,32(04):1.[doi:10. 3969 / j. issn. 1673-629X. 2022. 04. 001]
PENG Yun-jian,LIANG Jin.Q-learning Path Planning Based on Exploration / Exploitation Tradeoff Optimization[J].,2022,32(05):1.[doi:10. 3969 / j. issn. 1673-629X. 2022. 04. 001]
[8]顾文斌,杨生胜,王贤良,等.基于模糊 RBF 神经网络的无刷直流电机 PID 控制[J].计算机技术与发展,2022,32(08):15.[doi:10. 3969 / j. issn. 1673-629X. 2022. 08. 003]
GU Wen-bin,YANG Sheng-sheng,WANG Xian-liang,et al.PID Control of Brushless DC Motor Based on Fuzzy RBF Neural Network[J].,2022,32(05):15.[doi:10. 3969 / j. issn. 1673-629X. 2022. 08. 003]
[9]魏竞毅,赖 俊,陈希亮.基于互信息的智能博弈对抗分层强化学习研究[J].计算机技术与发展,2022,32(09):142.[doi:10. 3969 / j. issn. 1673-629X. 2022. 09. 022]
WEI Jing-yi,LAI Jun,CHEN Xi-liang.Research on Hierarchical Reinforcement Learning of Intelligent Game Confrontation Based on Mutual Information[J].,2022,32(05):142.[doi:10. 3969 / j. issn. 1673-629X. 2022. 09. 022]
[10]吴 鹏,魏上清,董嘉鹏,等.基于 SARSA 强化学习的审判人力资源调度方法[J].计算机技术与发展,2022,32(09):82.[doi:10. 3969 / j. issn. 1673-629X. 2022. 09. 013]
WU Peng,WEI Shang-qing,DONG Jia-peng,et al.Trial Human Resources Scheduling Method Based on SARSA Reinforcement Learning[J].,2022,32(05):82.[doi:10. 3969 / j. issn. 1673-629X. 2022. 09. 013]