Similar Items: XtraMAC: An Efficient MAC Architecture for Mixed-Precision LLM Inference on FPGA
- VitaLLM: A Versatile and Tiny Accelerator for Mixed-Precision LLM Inference on Edge Devices
- TokenStack: A Heterogeneous HBM-PIM Architecture and Runtime for Efficient LLM Inference
- LLM-Driven Design Space Exploration of FPGA-based Accelerators
- TransDot: An Area-efficient Reconfigurable Floating-Point Unit for Trans-Precision Dot-Product Accumulation for FPGA AI Engines
- NVLLM: A 3D NAND-Centric Architecture Enabling Edge on-Device LLM Inference
- Silicon Showdown: Performance, Efficiency, and Ecosystem Barriers in Consumer-Grade LLM Inference