Rollout Pass-Rate Control: Steering Binary-Reward RL Toward Its Most Informative Regime

Similar Items: Rollout Pass-Rate Control: Steering Binary-Reward RL Toward Its Most Informative Regime

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
Steer Like the LLM: Activation Steering that Mimics Prompting
Exploration Hacking: Can LLMs Learn to Resist RL Training?
Enhancing RL Generalizability in Robotics through SHAP Analysis of Algorithms and Hyperparameters
Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior
SLIM: Sparse Latent Steering for Interpretable and Property-Directed LLM-Based Molecular Editing
Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring
Quantifying Concentration Phenomena of Mean-Field Transformers in the Low-Temperature Regime
A decoupled diffusion planner that adapts to changing cost limits by using cost-conditioned generation for safety and reward gradients for performance
DataMaster: Towards Autonomous Data Engineering for Machine Learning
UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
ROSE: Rollout On Serving GPUs via Cooperative Elasticity for Agentic RL
MARBLE: Multi-Aspect Reward Balance for Diffusion RL
Early Detection of Water Stress by Plant Electrophysiology: Machine Learning for Irrigation Management
Exponential families from a single KL identity
TopBench: A Benchmark for Implicit Prediction and Reasoning over Tabular Question Answering
A Unified Framework of Hyperbolic Graph Representation Learning Methods
Assessing the Role of Intersection Proximity in Pedestrian Crashes: Insights from Data Mining Approach
PROMISE-AD: Progression-aware Multi-horizon Survival Estimation for Alzheimer's Disease Progression and Dynamic Tracking
Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression
Neural Aided Kalman Filtering for UAV State Estimation in Degraded Sensing Environments
FiLMMeD: Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing
Efficient Multivector Retrieval with Token-Aware Clustering and Hierarchical Indexing
Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces