Similar Items:
- On Similarity of Computational Kernels in our Codes and Proxies
- Exploring Sparse Matrix Multiplication Kernels on the Cerebras CS-3
- KEET: Explaining Performance of GPU Kernels Using LLM Agents
- Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems
- Akita: A High Usability Simulation Framework for Computer Architecture
- Accelerating MoE with Dynamic In-Switch Computing on Multi-GPUs
- Lifting to tensors when compiling scientific computing workloads for AI Engines