Similar Items:
- On the Incomparability of Cache Algorithms in Terms of Timing Leakage
- QubitCache: Quantum-Inspired Probabilistic Attention Preservation for KV-Cache Compression
- Filtering of Q(t) Measurement Data for Estimating Leakage Current
- DKC-LLM: Dynamic Knowledge Caching for Large Language Models in Business Applications
- Quantitative information flow under generic leakage functions and adaptive adversaries
- A Low-Cost Multi-Objective Cache Prefetcher for Complex and Irregular Memory Access Patterns
- Context-Aware Autoscaling for Cost-Efficient Large Language Model Inference With Prefix Cache Integration