Text this: A hierarchical multimodal intelligent learning framework for cognitive diagnosis and learning performance prediction