Optimal Posterior Sampling for Policy Identification in Tabular Markov Decision Processes