31.1 A 14.08-to-135.69Token/s ReRAM-on-Logic Stacked Outlier-Free Large-Language-Model Accelerator with Block-Clustered Weight-Compression and Adaptive Parallel-Speculative-Decoding

Bibliographic Details
Published in: ArXiv cs.AR Recent Papers
Format: Online Article, RSS Article
Published: 2026
collection WordPress RSS
FRELIP Feed Integration
container_title ArXiv cs.AR Recent Papers
discipline_display Engineering & Technology
discipline_facet Engineering & Technology
format Online Article
RSS Article
genre Journal Article
id rss_article:51200
institution FRELIP
journal_source_facet ArXiv cs.AR Recent Papers
publishDate 2026
publishDateSort 2026
record_format rss_article
sub_discipline_display Chemical Engineering
sub_discipline_facet Chemical Engineering
subject_display ArXiv cs.AR Recent Papers
Chemical Engineering
Engineering & Technology
subject_facet ArXiv cs.AR Recent Papers
Chemical Engineering
Engineering & Technology
title 31.1 A 14.08-to-135.69Token/s ReRAM-on-Logic Stacked Outlier-Free Large-Language-Model Accelerator with Block-Clustered Weight-Compression and Adaptive Parallel-Speculative-Decoding
topic ArXiv cs.AR Recent Papers
Chemical Engineering
Engineering & Technology
url https://arxiv.org/abs/2605.09375v1