Text this: CMTA: Leveraging Cross-Modal Temporal Artifacts for Generalizable AI-Generated Video Detection