Quantifying the impact of non-stationarity in reinforcement learning-based traffic signal control

PeerJ Computer Science
A traffic phase assigns green, yellow or red light to each traffic movement. A green traffic phase is a phase which assigns green to at least one traffic movement.

Introduction

Background

Reinforcement learning

Non-stationarity in RL

Partial observability

Methods

State formulation

Actions

Reward function

Multiagent independent Q-learning

Contexts

Experiments and results

Scenario

  • Context 1 (NS = WE): insertion rate of 1 vehicle every 3 s in all eight OD pairs.

  • Context 2 (NS < WE): insertion rate of 1 vehicle every 6 s in the N-S direction OD pairs and 1 vehicle every 2 s in the W-E direction OD pairs.
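
To make the two contexts easier to compare, the insertion rates above can be converted into hourly flows. The following is a minimal sketch, not the authors' implementation: the names `CONTEXTS`, `OD_PAIRS_PER_DIRECTION`, `hourly_flow` and `network_demand` are hypothetical, and the even split of the eight OD pairs into four N-S and four W-E pairs is an assumption that the text here does not state.

```python
SECONDS_PER_HOUR = 3600

# Seconds between consecutive vehicle insertions per OD pair,
# taken from the two context descriptions above.
CONTEXTS = {
    "Context 1 (NS = WE)": {"NS": 3, "WE": 3},
    "Context 2 (NS < WE)": {"NS": 6, "WE": 2},
}

# Assumption (not stated in the text): the eight OD pairs are split
# evenly, four per direction.
OD_PAIRS_PER_DIRECTION = {"NS": 4, "WE": 4}


def hourly_flow(seconds_per_vehicle):
    """Vehicles inserted per hour for a single OD pair."""
    return SECONDS_PER_HOUR / seconds_per_vehicle


def network_demand(context):
    """Total vehicles inserted per hour over all OD pairs."""
    return sum(
        hourly_flow(seconds) * OD_PAIRS_PER_DIRECTION[direction]
        for direction, seconds in context.items()
    )


if __name__ == "__main__":
    for name, context in CONTEXTS.items():
        per_pair = {d: hourly_flow(s) for d, s in context.items()}
        print(name, per_pair, "total:", network_demand(context))
```

Per OD pair, Context 1 corresponds to 1,200 vehicles per hour in both directions, while Context 2 corresponds to 600 vehicles per hour in the N-S direction and 1,800 in the W-E direction. Under the assumed even split, the total network demand would be the same in both contexts, with only its distribution across directions changing.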

Metrics

Traffic signal control under fixed policies

Effects of disabling learning and exploration

Effects of reduced state observability

Effects of different levels of state discretization

Discussion

Conclusion

Additional Information and Declarations

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Lucas N. Alegre conceived and designed the experiments, performed the experiments, analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

Ana L.C. Bazzan conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.

Bruno C. da Silva conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The implementation of the algorithms and SUMO traffic scenario is available at GitHub: https://github.com/LucasAlegre/sumo-rl.

Funding

Lucas N. Alegre was supported by CNPq under grant no. 140500/2021-9. Ana Bazzan was supported by CNPq under grant no. 307215/2017-2. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
