Channels - Revisiting Policy Gradients for Restricted Policy Classes: Escaping Myopic Local Optima with $k$-step Policy Gradients :: FRELIP Discovery

Similar Items: Revisiting Policy Gradients for Restricted Policy Classes: Escaping Myopic Local Optima with $k$-step Policy Gradients

Randomized Subspace Nesterov Accelerated Gradient
Decentralized Proximal Stochastic Gradient Langevin Dynamics
On the Wasserstein Gradient Flow Interpretation of Drifting Models
Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why
Optimal Posterior Sampling for Policy Identification in Tabular Markov Decision Processes
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
Complex Equation Learner: Rational Symbolic Regression with Gradient Descent in Complex Domain
Transformers Efficiently Perform In-Context Logistic Regression via Normalized Gradient Descent
Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning
A decoupled diffusion planner that adapts to changing cost limits by using cost-conditioned generation for safety and reward gradients for performance
Physiologically Grounded Driver Behavior Classification: SHAP-Driven Elite Feature Selection and Hybrid Gradient Boosting for Multimodal Physiological Signals
STEPS: A Temporal Smooth Error Propagation Solver on the Manifolds for Test-Time Adaptation in Time Series Forecasting
Score-Aware Policy-Gradient and Performance Guarantees using Local Lyapunov Stability
GRAPHLCP: Structure-Aware Localized Conformal Prediction on Graphs
An adaptive wavelet-based PINN for problems with localized high-magnitude source
Beyond Negative Rollouts: Positive-Only Policy Optimization with Implicit Negative Gradients
Convergence and Sample Complexity of Natural Policy Gradient Primal-Dual Methods for Constrained MDPs