Text this: How the visual brain can learn to parse images using a multiscale, incremental grouping process