Similar Items: EpiCastBench: Datasets and Benchmarks for Multivariate Epidemic Forecasting
- OxyEcomBench: Benchmarking Multimodal Foundation Models across E-Commerce Ecosystems
- FineState-Bench: Benchmarking State-Conditioned Grounding for Fine-grained GUI State Setting
- Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies
- HOME-KGQA: A Benchmark Dataset for Multimodal Knowledge Graph Question Answering on Household Daily Activities
- PrepBench: How Far Are We from Natural-Language-Driven Data Preparation?
- A Hierarchical Agent System with Reinforcement Learning for Multivariate Time Series Data Cleaning