Similar Items: Enhancing Instruction Prefetching via Cache and TLB Management
- Non-Monotonic Latency in Apple MPS Decoding: KV Cache Interactions and Execution Regimes
- Understanding Simulated Architecture via gem5 Call-Stack Profiling
- FLARE: One-Shot PE-Level Fault Localization in Systolic Arrays via Algebraic Test Vectors
- RAG-Enhanced Kernel-Based Heuristic Synthesis (RKHS): A Structured Methodology Using Large Language Models for Hardware Design
- AHASD: Asynchronous Heterogeneous Architecture for LLM Adaptive Drafting Speculative Decoding on Mobile Devices
- RecFlash: Fast Recommendation System on In-Storage Computing with Frequency-Based Data Mapping