Discrete Flow Matching for Offline-to-Online Reinforcement Learning :: FRELIP Discovery

Similar Items: Discrete Flow Matching for Offline-to-Online Reinforcement Learning

Reward Hacking in Rubric-Based Reinforcement Learning
Towards Metric-Faithful Neural Graph Matching
Possibilistic Predictive Uncertainty for Deep Learning
Learning CLI Agents with Structured Action Credit under Selective Observation
LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis
Learning Multimodal Energy-Based Model with Multimodal Variational Auto-Encoder via MCMC Revision
Collaborative Agent Reasoning Engineering (CARE): A Three-Party Design Methodology for Systematically Engineering AI Agents with Subject Matter Experts, Developers, and Helper Agents
RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses
To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems
Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems
What Makes a Good Terminal-Agent Benchmark Task: A Guideline for Adversarial, Difficult, and Legible Evaluation Design
Characterizing the Consistency of the Emergent Misalignment Persona
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes
Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists
AI-Generated Smells: An Analysis of Code and Architecture in LLM and Agent-Driven Development
AI and Open-data Driven Scalable Solar Power Profiling
SCPRM: A Schema-aware Cumulative Process Reward Model for Knowledge Graph Question Answering
AIs and Humans with Agency
Compress Then Adapt? No, Do It Together via Task-aware Union of Subspaces
First-Order Efficiency for Probabilistic Value Estimation via A Statistical Viewpoint
Fairness of Classifiers in the Presence of Constraints between Features
Jailbreaking Vision-Language Models Through the Visual Modality
Born-Qualified: An Autonomous Framework for Deploying Advanced Energy and Electronic Materials