Cross-Modal Navigation with Multi-Agent Reinforcement Learning