Similar Items:
- EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents
- Transcoda: End-to-End Zero-Shot Optical Music Recognition via Data-Centric Synthetic Training
- AssayBench: An Assay-Level Virtual Cell Benchmark for LLMs and Agents
- DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices
- TopBench: A Benchmark for Implicit Prediction and Reasoning over Tabular Question Answering
- V4FinBench: Benchmarking Tabular Foundation Models, LLMs, and Standard Methods on Corporate Bankruptcy Prediction
- Harnessing Agentic Evolution