Similar Items: Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems