Similar Items: DEFault++: Automated Fault Detection, Categorization, and Diagnosis for Transformer Architectures
- Spiking Sequence Machines and Transformers
- Fast Byte Latent Transformer
- Transformers with Selective Access to Early Representations
- Taming Outlier Tokens in Diffusion Transformers
- Evaluating the Architectural Reasoning Capabilities of LLM Provers via the Obfuscated Natural Number Game
- Transformed Latent Variable Multi-Output Gaussian Processes