Large-scale clustering of acoustic segments for sub-word acoustic modelling