Similar Items: Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study