Optimizing Data Collection for Machine Learning