Text this: Question Difficulty Estimation for Large Language Models via Answer Plausibility Scoring