Similar Items: Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling
- Detecting Hallucinations in Large Language Models via Internal Attention Divergence Signals
- SemEval-2026 Task 7: Everyday Knowledge Across Diverse Languages and Cultures
- Dependency Parsing Across the Resource Spectrum: Evaluating Architectures on High and Low-Resource Languages
- Benchmarking Parameter-Efficient Fine-Tuning of Large Language Models for Low-Resource Tajik Text Generation with the Tajik Web Corpus
- Rebellious Student: Reversing Teacher Signals for Reasoning Exploration with Self-Distilled RLVR
- Learning More from Less: Exploiting Counterfactuals for Data-Efficient Chart Understanding