Similar Items: A Study on the Performance of Distributed Training of Data-driven CFD Simulations
- Adaptation of AI-accelerated CFD Simulations to the IPU platform
- AGoQ: Activation and Gradient Quantization for Memory-Efficient Distributed Training of LLMs
- Microbenchmark-Driven Analytical Performance Modeling Across Modern GPU Architectures
- ZipCCL: Efficient Lossless Data Compression of Communication Collectives for Accelerating LLM Training
- Assessing Performance and Porting Strategies for Gravitational $N$-Body Simulations on the RISC-V-Based Tenstorrent Wormhole™
- A Treasure Trove of Performance: Analyzing the IO500 Submission Data