Similar Items: VecCISC: Improving Confidence-Informed Self-Consistency with Reasoning Trace Clustering and Candidate Answer Selection
- Characterizing the Consistency of the Emergent Misalignment Persona
- SCPRM: A Schema-aware Cumulative Process Reward Model for Knowledge Graph Question Answering
- Abductive Reasoning with Probabilistic Commonsense
- Shepherd: A Runtime Substrate Empowering Meta-Agents with a Formalized Execution Trace
- Rubric-Grounded RL: Structured Judge Rewards for Generalizable Reasoning
- The First Drop of Ink: Nonlinear Impact of Misleading Information in Long-Context Reasoning