QIAO Tong, ZHOU Zhou, CHENG Xin, et al. Adaptive PID Control Model of Chassis Dynamometer Based on Q-Learning [J]. Computer Technology and Development, 2022, 32(05): 117-122. [doi:10.3969/j.issn.1673-629X.2022.05.020]

Adaptive PID Control Model of Chassis Dynamometer Based on Q-Learning

Computer Technology and Development [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
32
Issue:
2022, No. 05
Pages:
117-122
Section:
Application Frontiers and Comprehensive Studies
Publication Date:
2022-05-10

Article Information

Title:
Adaptive PID Control Model of Chassis Dynamometer Based on Q-Learning
Article No.:
1673-629X(2022)05-0116-06
Author(s):
QIAO Tong (乔 通) 1,2, ZHOU Zhou (周 洲) 1,2, CHENG Xin (程 鑫) 1,2, GUO Lan-ying (郭兰英) 1, WANG Run-min (王润民) 1,2
1. School of Information Engineering, Chang’an University, Xi’an 710064, China;
2. Shaanxi Engineering Research Center of Internet of Vehicles and Intelligent Vehicle Testing Technology, Xi’an 710064, China
Keywords:
reinforcement learning; PID control; Q-learning; control strategy; chassis dynamometer
CLC Number:
TP391
DOI:
10.3969/j.issn.1673-629X.2022.05.020
Abstract:
To address the high latency and large errors that arise in the dynamic control of an automotive chassis dynamometer control system, a chassis dynamometer control strategy based on reinforcement learning is proposed. Built on the PID control algorithm, the torque deviation is taken as the controller input and the regulating voltage as the control output; the change in torque deviation is chosen as the agent's reward signal, and the PID parameters are tuned online and adaptively by the Q-learning algorithm. The regulation performance of the controller is verified in chassis dynamometer simulation experiments and compared against conventional PID control and neural-network PID control. The results show that the control cycle of the Q-learning-based adaptive PID control model is reduced to 40.7% of that of the conventional PID algorithm and to 27.9% of that of the neural-network PID algorithm. Compared with the conventional PID control model and the neural-network PID model, the output force of the Q-learning-based adaptive PID model rises quickly and stably. The proposed Q-learning-based adaptive PID control model effectively improves the control accuracy of the chassis dynamometer and meets the industrial requirements of its use.
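The abstract only outlines the control scheme, so the following is a minimal illustrative sketch of Q-learning-based online PID tuning, assuming a tabular agent whose state is the discretized torque deviation, whose actions are small multiplicative adjustments of the PID gains, and whose reward is the reduction in the absolute deviation. The class name, state bins, action set, and hyper-parameters are assumptions made for illustration and are not taken from the paper.

import random


class QLearningPID:
    """Illustrative sketch (not the paper's implementation) of Q-learning-based
    online PID tuning: state = discretized torque deviation, actions = small
    gain adjustments, reward = reduction of the absolute torque deviation."""

    # Actions: scale one gain up or down by 5%, or leave all gains unchanged (assumed step sizes).
    ACTIONS = [("kp", 1.05), ("kp", 0.95),
               ("ki", 1.05), ("ki", 0.95),
               ("kd", 1.05), ("kd", 0.95),
               (None, 1.0)]

    def __init__(self, kp=1.0, ki=0.1, kd=0.05,
                 alpha=0.1, gamma=0.9, epsilon=0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}                  # Q-table: (state, action index) -> value
        self.integral = 0.0
        self.prev_error = 0.0
        self.pending = None          # (state, action) awaiting its reward

    def _state(self, error):
        # Coarse discretization of the torque deviation into integer bins (assumed scheme).
        return max(-5, min(5, round(error)))

    def _greedy(self, state):
        # Index of the action with the highest Q-value in this state.
        return max(range(len(self.ACTIONS)),
                   key=lambda a: self.q.get((state, a), 0.0))

    def control(self, error, dt):
        """One control step: update the Q-table, adapt the gains, return the regulating voltage."""
        state = self._state(error)

        # Learn from the previous gain adjustment: reward = how much |error| shrank.
        if self.pending is not None:
            reward = abs(self.prev_error) - abs(error)
            best_next = self.q.get((state, self._greedy(state)), 0.0)
            old = self.q.get(self.pending, 0.0)
            self.q[self.pending] = old + self.alpha * (reward + self.gamma * best_next - old)

        # Epsilon-greedy choice of the next gain adjustment.
        if random.random() < self.epsilon:
            action = random.randrange(len(self.ACTIONS))
        else:
            action = self._greedy(state)
        gain, factor = self.ACTIONS[action]
        if gain is not None:
            setattr(self, gain, getattr(self, gain) * factor)
        self.pending = (state, action)

        # Standard PID law on the torque deviation.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

A plant-simulation loop would call control(torque_setpoint - torque_measured, dt) once per sampling period and feed the returned value to the dynamometer's voltage input.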

Last Update: 2022-05-10