AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving

Bibliographic Details
Published in: ArXiv cs.AR Recent Papers
Format: Online Article, RSS Article
Published: 2026
_version_ 1864130790243172353
collection WordPress RSS
FRELIP Feed Integration
container_title ArXiv cs.AR Recent Papers
description
discipline_display Engineering & Technology
discipline_facet Engineering & Technology
format Online Article
RSS Article
genre Journal Article
id rss_article:48945
institution FRELIP
journal_source_facet ArXiv cs.AR Recent Papers
publishDate 2026
publishDateSort 2026
record_format rss_article
spellingShingle AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving
ArXiv cs.AR Recent Papers
Chemical Engineering
Engineering & Technology
sub_discipline_display Chemical Engineering
sub_discipline_facet Chemical Engineering
subject_display ArXiv cs.AR Recent Papers
Chemical Engineering
Engineering & Technology
subject_facet ArXiv cs.AR Recent Papers
Chemical Engineering
Engineering & Technology
title AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving
title_auth AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving
title_full AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving
title_fullStr AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving
title_full_unstemmed AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving
title_short AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving
title_sort amma: a multi-chiplet memory-centric architecture for low-latency 1m context attention serving
topic ArXiv cs.AR Recent Papers
Chemical Engineering
Engineering & Technology
url https://arxiv.org/abs/2604.26103v2