Measuring, refining and calibrating speaker and language information extracted from speech