Channels - OGLS-SD: On-Policy Self-Distillation with Outcome-Guided Logit Steering for LLM Reasoning :: FRELIP Discovery

Similar Items: OGLS-SD: On-Policy Self-Distillation with Outcome-Guided Logit Steering for LLM Reasoning

VecCISC: Improving Confidence-Informed Self-Consistency with Reasoning Trace Clustering and Candidate Answer Selection
Abductive Reasoning with Probabilistic Commonsense
Rubric-Grounded RL: Structured Judge Rewards for Generalizable Reasoning
The First Drop of Ink: Nonlinear Impact of Misleading Information in Long-Context Reasoning
NeuroAgent: LLM Agents for Multimodal Neuroimaging Analysis and Research
Reason to Play: Behavioral and Brain Alignment Between Frontier LRMs and Human Game Learners
Trust the Batch, On- or Off-Policy: Adaptive Policy Optimization for RL Post-Training
To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling
Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers
AI-Generated Smells: An Analysis of Code and Architecture in LLM and Agent-Driven Development
LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis
RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses
AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments
Collaborative Agent Reasoning Engineering (CARE): A Three-Party Design Methodology for Systematically Engineering AI Agents with Subject Matter Experts, Developers, and Helper Agents
To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems
Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems
What Makes a Good Terminal-Agent Benchmark Task: A Guideline for Adversarial, Difficult, and Legible Evaluation Design
Characterizing the Consistency of the Emergent Misalignment Persona
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes
Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists
AI and Open-data Driven Scalable Solar Power Profiling
SCPRM: A Schema-aware Cumulative Process Reward Model for Knowledge Graph Question Answering
AIs and Humans with Agency