An adaptive multi-teacher-student knowledge distillation framework for scalable cloud–edge multimodal learning