Similar Items: ShardTensor: Domain Parallelism for Scientific Machine Learning
- Nitsum: Serving Tiered LLM Requests with Adaptive Tensor Parallelism
- Lifting to tensors when compiling scientific computing workloads for AI Engines
- Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems
- Regulating Branch Parallelism in LLM Serving
- Parallel-in-Time Training of Recurrent Neural Networks for Dynamical Systems Reconstruction
- Surviving Partial Rank Failures in Wide Expert-Parallel MoE Inference