Similar Items: HEART: Hyperspherical Embedding Alignment via Kent-Representation Traversal in Diffusion Models
- SphereVAD: Training-Free Video Anomaly Detection via Geodesic Inference on the Unit Hypersphere
- Proxy3D: Efficient 3D Representations for Vision-Language Models via Semantic Clustering and Alignment
- PRISM: Pre-alignment via Black-box On-policy Distillation for Multimodal Reinforcement Learning
- BabelDOC: Better Layout-Preserving PDF Translation via Intermediate Representation
- Active Sampling for Ultra-Low-Bit-Rate Video Compression via Conditional Controlled Diffusion
- UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors