Similar Items: Elastic Attention Cores for Scalable Vision Transformers
- Understanding In-Context Learning for Nonlinear Regression with Transformers: Attention as Featurizer
- Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces
- Attention Once Is All You Need: Efficient Streaming Inference with Stateful Transformers
- Parallel Scan Recurrent Neural Quantum States for Scalable Variational Monte Carlo
- Force-Aware Neural Tangent Kernels for Scalable and Robust Active Learning of MLIPs
- Concept-Based Abductive and Contrastive Explanations for Behaviors of Vision Models