Similar Items: LychSim: A Controllable and Interactive Simulation Framework for Vision Research
- Seeing Realism from Simulation: Efficient Video Transfer for Vision-Language-Action Data Augmentation
- A Benchmark for Interactive World Models with a Unified Action Generation Framework
- Linearizing Vision Transformer with Test-Time Training
- Quantifying the human visual exposome with vision language models
- Object Hallucination-Free Reinforcement Unlearning for Vision-Language Models
- Prompt-Anchored Vision-Text Distillation for Lifelong Person Re-identification