Similar Items: Delayed reward information is underweighted in reinforcement learning with dispersed feedback
- Multi-Discounting Reinforcement Learning Based on Reward Decomposition
- Learning decentralized policies with incremental reinforcement learning, reward shaping and self-play learning
- A physics-informed reinforcement learning framework for impulsive orbital pursuit–evasion under stochastic maneuvers
- Delay‐Scheduled Adaptive Observer Control Strategy for Nonlinear Systems With Time‐Varying Delays
- Application and development of reinforcement learning in traffic signal control
- Reinforcement learning-based adaptive particle swarm optimization