Similar Items:
- Compute Where it Counts: Self Optimizing Language Models
- Where's the Plan? Locating Latent Planning in Language Models with Lightweight Mechanistic Interventions
- Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why
- UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
- Beyond Pairs: Your Language Model is Secretly Optimizing a Preference Graph
- Bolek: A Multimodal Language Model for Molecular Reasoning
- Crafting Reversible SFT Behaviors in Large Language Models