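Journal of Transportation Engineering and Information (交通运输工程与信息学报)

2022, Issue 02, Vol. 20 (No. 76), pp. 14-24

A Cooperative Vehicle Control Algorithm for Signal-Free Intersections Based on Deep Reinforcement Learning

Foundation: Industrial Technology Foundation Public Service Platform Project (2019-00892-2-1)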


DOI: 10.19961/j.cnki.1672-4747.2021.11.021


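Abstract (translated from the Chinese original):

To improve the efficiency with which intelligent connected vehicles pass through signal-free intersections in future smart cities, this paper proposes a multi-agent cooperative control algorithm with progressive value-expectation estimation (PVE-MCC), based on deep reinforcement learning. First, a value-expectation estimation strategy based on progressive learning is designed: by dynamically changing the value-expectation learning target, it keeps the value function network learning progressively and continuously and prevents the policy network from falling into a local optimum. This strategy is combined with the generalized advantage estimation algorithm to improve convergence accuracy and stability. Second, with traffic efficiency, safety, and comfort as the optimization objectives, a multi-objective reward function is designed to improve the overall performance of multi-agent cooperative control. In addition, the “deadlock” phenomenon that readily occurs at signal-free intersections poses a major challenge to multi-vehicle cooperative control. To address it, a heuristic “deadlock” detection-and-breaking intervention strategy is designed on the basis of the linked-list cycle detection algorithm, so that “deadlock” rings are detected and broken in advance, further safeguarding traffic safety. Finally, a simulation platform for a two-way six-lane signal-free intersection scenario is built for functional and performance verification. Experimental results show that, compared with existing schemes, PVE-MCC increases traffic flow by 30.47%, single-vehicle efficiency by 95.56%, and comfort by 53.82%.

Keywords: intelligent transportation; cooperative control; reinforcement learning; signal-free intersection; intelligent connected vehicles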
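The abstract combines its value-expectation strategy with generalized advantage estimation (GAE). As a minimal sketch of the standard GAE recursion only, the Python below computes advantages for one finite episode. The function name `gae_advantages` and the default values of `gamma` and `lam` are illustrative assumptions, and how PVE-MCC schedules its value-expectation targets is not specified in the abstract, so that part is not modeled here.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,  # discount factor (assumed default)
    lam: float = 0.95,    # GAE smoothing parameter (assumed default)
) -> List[float]:
    """Standard GAE over one episode.

    `values` holds V(s_0) .. V(s_T): one more entry than `rewards`,
    the last entry being the bootstrap value of the final state.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step reuses the advantage of the next step.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

With `lam = 0` the estimate reduces to the one-step TD error (short horizon, low variance), and with `lam = 1` it becomes the discounted return minus the value baseline (long horizon, higher variance), which is the trade-off that `lam` controls.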

Abstract:

To improve the traffic efficiency of intelligent connected vehicles passing through signal-free intersections in future smart cities, this paper proposes a progressive value-expectation estimation multi-agent cooperative control (PVE-MCC) algorithm based on deep reinforcement learning. First, PVE-MCC introduces a progressive value-expectation estimation (PVE) strategy based on progressive learning, which dynamically shifts the value-expectation learning target from short-term to long-term. This keeps the value function network learning gradually and continuously and prevents the policy network from falling into a local optimum. Second, PVE-MCC combines the PVE strategy with the generalized advantage estimation algorithm to improve convergence accuracy and stability. Third, PVE-MCC takes traffic efficiency, safety, and comfort jointly as the optimization objectives and uses a multi-objective reward function to improve the performance of multi-agent cooperative control. In addition, the “deadlock” phenomenon that readily occurs at signal-free intersections poses a significant challenge for multi-vehicle cooperative control. To address it, PVE-MCC applies a heuristic “deadlock” detection-and-breaking intervention strategy, built on the linked-list cycle detection algorithm, to ensure the safety of the intersection. Finally, a simulation platform for a two-way six-lane signal-free intersection is built for verification. Experimental results show that, compared with existing schemes, PVE-MCC improves the traffic flow rate by 30.47%, single-vehicle efficiency by 95.56%, and comfort by 53.82%.

Keywords: intelligent transportation; cooperative control; reinforcement learning; signal-free intersection; intelligent connected vehicles
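The deadlock strategy in the abstract builds on linked-list cycle detection. The sketch below illustrates that idea with Floyd's two-pointer method, under the assumption that each vehicle waits for at most one other vehicle, so the wait-for relation forms a linked list. The function name `find_deadlock_ring`, the `waits_for` mapping, and the vehicle labels are hypothetical, and the paper's own detection and breaking rules are not given in the abstract.

```python
from typing import Dict, List, Optional


def find_deadlock_ring(waits_for: Dict[str, Optional[str]], start: str) -> List[str]:
    """Return the vehicles on a wait-for ring reachable from `start`, or [].

    `waits_for[v]` is the vehicle that v is yielding to, or None if v is free
    to move (hypothetical representation, at most one successor per vehicle).
    """

    def step(vehicle: Optional[str]) -> Optional[str]:
        # Follow one wait-for link; a missing or free vehicle ends the chain.
        return waits_for.get(vehicle) if vehicle is not None else None

    # Floyd's method: `slow` advances one link per round, `fast` two.
    slow = step(start)
    fast = step(step(start))
    while fast is not None and slow != fast:
        slow = step(slow)
        fast = step(step(fast))

    if fast is None:
        return []  # the chain ends at a free vehicle, so there is no ring

    # The pointers met inside the ring; walk once around to list its members.
    ring = [slow]
    node = step(slow)
    while node != slow:
        ring.append(node)
        node = step(node)
    return ring


# Usage: A waits for B, and B, C, D wait for each other in a ring.
blocked = {"A": "B", "B": "C", "C": "D", "D": "B"}
ring_members = set(find_deadlock_ring(blocked, "A"))
```

Once a ring is found, an intervention would have to let one of its members proceed first; which vehicle PVE-MCC selects, and how early it intervenes, is not stated in the abstract.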


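Basic information:

CLC number: TP18; U491.54

Citation:

JIANG Mingzhi (蒋明智), WU Tianhao (吴天昊), ZHANG Lin (张琳). A Cooperative Vehicle Control Algorithm for Signal-Free Intersections Based on Deep Reinforcement Learning [J]. Journal of Transportation Engineering and Information, 2022, 20(02): 14-24. DOI: 10.19961/j.cnki.1672-4747.2021.11.021.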
