Similar Items: Fine-Grained Graph Generation through Latent Mixture Scheduling
- Visual Latents Know More Than They Say: Unsilencing Latent Reasoning in MLLMs
- Fast Byte Latent Transformer
- Transformed Latent Variable Multi-Output Gaussian Processes
- On Computing Total Variation Distance Between Mixtures of Product Distributions
- UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
- Where's the Plan? Locating Latent Planning in Language Models with Lightweight Mechanistic Interventions