Benchmark Integrity and Reasoning-Trace Errors in Medical Question Answering With Large Language Models: Mixed Methods Study With Sparse Autoencoders

Bibliographic Details
Published in: Journal of Medical Internet Research
Format: Online Article, RSS Article
Published: 2026
Subjects: Communications, General
_version_ 1867936037497470976
collection WordPress RSS
FRELIP Feed Integration
container_title Journal of Medical Internet Research
discipline_display Communications
format Online Article
RSS Article
genre Journal Article
id rss_article:91270
institution FRELIP
last_indexed 2026-06-14T02:03:22.160Z
publishDate 2026
record_format rss_article
sub_discipline_display General
subject_display Communications
General
title Benchmark Integrity and Reasoning-Trace Errors in Medical Question Answering With Large Language Models: Mixed Methods Study With Sparse Autoencoders
topic Communications
General
url https://www.jmir.org/2026/1/e90061