Similar Items: Semantic Reward Collapse and the Preservation of Epistemic Integrity in Adaptive AI Systems
- Reward Hacking in Rubric-Based Reinforcement Learning
- Rubric-Grounded RL: Structured Judge Rewards for Generalizable Reasoning
- RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses
- SCPRM: A Schema-aware Cumulative Process Reward Model for Knowledge Graph Question Answering
- To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems
- Contextual Multi-Objective Optimization: Rethinking Objectives in Frontier AI Systems