Similar Items: Can Coding Agents Reproduce Findings in Computational Materials Science?
- Reproducing Complex Set-Compositional Information Retrieval
- STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?
- Automatically Finding and Validating Unexpected Side-Effects of Interventions on Language Models
- Agentic-imodels: Evolving agentic interpretability tools via autoresearch
- Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key
- SkillOS: Learning Skill Curation for Self-Evolving Agents