Efficient, VRAM-Constrained xLM Inference on Clients