Determining the number of clusters using penalised k-means clustering