Text this: One Pool, Two Caches: Adaptive HBM Partitioning for Accelerating Generative Recommender Serving