Channels - Rubric-Grounded RL: Structured Judge Rewards for Generalizable Reasoning :: FRELIP Discovery

Similar Items: Rubric-Grounded RL: Structured Judge Rewards for Generalizable Reasoning

RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses
SCPRM: A Schema-aware Cumulative Process Reward Model for Knowledge Graph Question Answering
Abductive Reasoning with Probabilistic Commonsense
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
The First Drop of Ink: Nonlinear Impact of Misleading Information in Long-Context Reasoning
Reason to Play: Behavioral and Brain Alignment Between Frontier LRMs and Human Game Learners
GeoContra: From Fluent GIS Code to Verifiable Spatial Analysis with Geography-Grounded Repair
VecCISC: Improving Confidence-Informed Self-Consistency with Reasoning Trace Clustering and Candidate Answer Selection
Collaborative Agent Reasoning Engineering (CARE): A Three-Party Design Methodology for Systematically Engineering AI Agents with Subject Matter Experts, Developers, and Helper Agents
Learning CLI Agents with Structured Action Credit under Selective Observation
LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis
To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems
Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems
What Makes a Good Terminal-Agent Benchmark Task: A Guideline for Adversarial, Difficult, and Legible Evaluation Design
Characterizing the Consistency of the Emergent Misalignment Persona
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes
Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists
AI-Generated Smells: An Analysis of Code and Architecture in LLM and Agent-Driven Development
AI and Open-data Driven Scalable Solar Power Profiling
AIs and Humans with Agency
Compress Then Adapt? No, Do It Together via Task-aware Union of Subspaces
First-Order Efficiency for Probabilistic Value Estimation via A Statistical Viewpoint
Possibilistic Predictive Uncertainty for Deep Learning