Similar Items:
- DKC-LLM: Dynamic Knowledge Caching for Large Language Models in Business Applications
- Context-Aware Autoscaling for Cost-Efficient Large Language Model Inference With Prefix Cache Integration
- GDPKG-LLM: Integrating Gene, Disease, and Pharmacogenomics Knowledge Graphs for Cognitive Neuroscience Using Large Language Models
- Irminsul: MLA-Native Position-Independent Caching for Agentic LLM Serving
- Adaptive Robust Watermarking for Large Language Models via Dynamic Token Embedding Perturbation
- On the Incomparability of Cache Algorithms in Terms of Timing Leakage
- CacheRAG: A Semantic Caching System for Retrieval-Augmented Generation in Knowledge Graph Question Answering