Similar Items: ChunkFlow: Communication-Aware Chunked Prefetching for Layerwise Offloading in Distributed Diffusion Transformer Inference
- Privacy-preserving Chunk Scheduling in a BitTorrent Implementation of Federated Learning
- Communication Offloading on SmartNIC DPUs: A Quantitative Approach
- Structure-Aware Chunking for Tabular Data in Retrieval-Augmented Generation
- LLM-Enhanced Deep Reinforcement Learning for Task Offloading in Collaborative Edge Computing
- GriNNder: Breaking the Memory Capacity Wall in Full-Graph GNN Training with Storage Offloading
- Taming Request Imbalance: SLO-Aware Scheduling for Disaggregated LLM Inference