Similar Items: Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning
- Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback
- Label-Efficient School Detection from Aerial Imagery via Weakly Supervised Pretraining and Fine-Tuning
- DARTS: Targeting Prognostic Covariates in Budget-Constrained Sequential Experiments
- Federated Reinforcement Learning for Efficient Mobile Crowdsensing under Incomplete Information
- How Many Iterations to Jailbreak? Dynamic Budget Allocation for Multi-Turn LLM Evaluation
- Online Bayesian Calibration under Gradual and Abrupt System Changes