Similar Items: Multi-Source Visual Language Model Fusion for Indoor Geolocation Reliability