Similar Items: The Last Word Often Wins: A Format Confound in Chain-of-Thought Corruption Studies
- The Frequency Confound in Language-Model Surprisal and Metaphor Novelty
- Accurate and Efficient Statistical Testing for Word Semantic Breadth
- Reasoning over Object Descriptions Improves Coreference Resolution in Task-Based Dialogue Systems
- Multi-Level Narrative Evaluation Outperforms Lexical Features for Mental Health
- Geometry-Calibrated Conformal Abstention for Language Models
- From Unstructured Recall to Schema-Grounded Memory: Reliable AI Memory via Iterative, Schema-Aware Extraction