Similar Items: Temporal Dependency‐Aware Trajectory‐Level Behavioural Metric for Exploration in Reinforcement Learning
- TNCOA: Efficient Exploration via Observation‐Action Constraint on Trajectory‐Based Intrinsic Reward
- Feature Reinforcement Learning: Part II. Structured MDPs
- Extending Environments to Measure Self-reflection in Reinforcement Learning
- Robotic Cell Micromanipulation Skill Learning via Imitation‐Enhanced Reinforcement Learning
- Optimal trajectory generation method for robots for rapid handling of diversified products
- AGT: Efficient Offline Reinforcement Learning With Advantage‐Guided Transformer