Efficient Pre-Training with Token Superposition