Similar Items: Transformers Efficiently Perform In-Context Logistic Regression via Normalized Gradient Descent
- Complex Equation Learner: Rational Symbolic Regression with Gradient Descent in Complex Domain
- Understanding In-Context Learning for Nonlinear Regression with Transformers: Attention as Featurizer
- Normalizing Trajectory Models
- Randomized Subspace Nesterov Accelerated Gradient
- Decentralized Proximal Stochastic Gradient Langevin Dynamics
- On the Wasserstein Gradient Flow Interpretation of Drifting Models