Similar Items: A Comprehensive Analysis of Tokenization and Self-Supervised Learning in End-to-End Automatic Speech Recognition applied on French Language
- PairAlign: A Framework for Sequence Tokenization via Self-Alignment with Applications to Audio Tokenization
- When Audio-Language Models Fail to Leverage Multimodal Context for Dysarthric Speech Recognition
- Efficient Pre-Training with Token Superposition
- Automatically Finding and Validating Unexpected Side-Effects of Interventions on Language Models
- Cubit: Token Mixer with Kernel Ridge Regression
- The First Token Knows: Single-Decode Confidence for Hallucination Detection