Memorizing Features Efficiently for Self-supervised Video Object Segmentation