Similar Items: Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation
- ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging
- Transformers Efficiently Perform In-Context Logistic Regression via Normalized Gradient Descent
- Global Optimality for Constrained Exploration via Penalty Regularization
- Globally Optimal Training of Spiking Neural Networks via Parameter Reconstruction
- Weight-Decay Turns Transformer Loss Landscapes Villani: Functional-Analytic Foundations for Optimization and Generalization
- Spiking Sequence Machines and Transformers