Similar Items: UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
- UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
- Fine-Grained Graph Generation through Latent Mixture Scheduling
- On Computing Total Variation Distance Between Mixtures of Product Distributions
- Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior
- Ecologically-Constrained Task Arithmetic for Multi-Taxa Bioacoustic Classifiers Without Shared Data
- YOTOnet: Zero-Shot Cross-Domain Fault Diagnosis via Domain-Conditioned Mixture of Experts