Similar Items:
- Design Conductor 2.0: An agent builds a TurboQuant inference accelerator in 80 hours
- DPU or GPU for Accelerating Neural Networks Inference -- Why not both? Split CNN Inference
- VitaLLM: A Versatile and Tiny Accelerator for Mixed-Precision LLM Inference on Edge Devices
- AME-PIM: Can Memory be Your Next Tensor Accelerator?
- LLM-Driven Design Space Exploration of FPGA-based Accelerators
- AccelSync: Verifying Synchronization Coverage in Accelerator Pipeline Programs
- Efficient, VRAM-Constrained xLM Inference on Clients