Similar Items:
- Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why
- Compute Where it Counts: Self Optimizing Language Models
- Where's the Plan? Locating Latent Planning in Language Models with Lightweight Mechanistic Interventions
- Standing on the Shoulders of Giants: Stabilized Knowledge Distillation for Cross-Language Code Clone Detection
- UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
- Why Global LLM Leaderboards Are Misleading: Small Portfolios for Heterogeneous Supervised ML
- Revisiting Policy Gradients for Restricted Policy Classes: Escaping Myopic Local Optima with $k$-step Policy Gradients