Similar Items: A CLIP-Based Cross-Modal Matching Model for Image-Text Retrieval