Similar Items: Beyond Localization: A Comprehensive Diagnosis of Perspective-Conditioned Spatial Reasoning in MLLMs from Omnidirectional Images
- Pixel Perfect: Relational Image Quality Assessment with Spatially-Aware Distortions
- OphMAE: Bridging Volumetric and Planar Imaging with a Foundation Model for Adaptive Ophthalmological Diagnosis
- PhysEdit: Physically-Consistent Region-Aware Image Editing via Adaptive Spatio-Temporal Reasoning
- SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation
- Perceptual Flow Network for Visually Grounded Reasoning
- Large Language Models are Universal Reasoners for Visual Generation