A model for music emotion recognition and aesthetic education effect evaluation based on deep learning and multimodal data