Similar Items:
- Trajectory Supervision for Continual Tool-Use Learning in LLMs
- QKVShare: Quantized KV-Cache Handoff for Multi-Agent On-Device LLMs
- Sustaining Cooperation in Populations Guided by AI: A Folk Theorem for LLMs
- CalBench: Evaluating Coordination-Privacy Trade-offs in Multi-Agent LLMs
- Continuous-time q-learning for mean-field control with common noise, part-II: q-learning algorithms
- Beyond the Black Box: Interpretability of Agentic AI Tool Use
- Continuous-time q-learning for mean-field control with common noise, part-I: Theoretical foundations