Similar Items: Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
- Map2World: Segment Map Conditioned Text to 3D World Generation
- Representation Fréchet Loss for Visual Generation
- Large Language Models are Universal Reasoners for Visual Generation
- Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs
- One Token Per Frame: Reconsidering Visual Bandwidth in World Models for VLA Policy
- A Benchmark for Interactive World Models with a Unified Action Generation Framework