ViSQA: A benchmark dataset and baseline models for Vietnamese spoken question answering