VIP: Visual-guided Prompt Evolution for Efficient Dense Vision-Language Inference