MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

Similar Items: MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

Overview of the MedHopQA track at BioCreative IX: track description, participation and evaluation of systems for multi-hop medical question answering
ASTRA-QA: A Benchmark for Abstract Question Answering over Documents
Context Convergence Improves Answering Inferential Questions
Pt-HotpotQA: Evaluating Multi-Hop Question Answering on Original and Portuguese-translated Datasets Using LLMs
Neural at ArchEHR-QA 2026: One Method Fits All: Unified Prompt Optimization for Clinical QA over EHRs
Question Difficulty Estimation for Large Language Models via Answer Plausibility Scoring
Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding
DoGMaTiQ: Automated Generation of Question-and-Answer Nuggets for Report Evaluation
Benchmarking Retrieval Strategies for Biomedical Retrieval-Augmented Generation: A Controlled Empirical Study
Factorized Latent Reasoning for LLM-based Recommendation
When to Retrieve During Reasoning: Adaptive Retrieval for Large Reasoning Models
Retrieval-Augmented Reasoning for Chartered Accountancy
Reproducing Adaptive Reranking for Reasoning-Intensive IR
FollowTable: A Benchmark for Instruction-Following Table Retrieval
TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding
InterLV-Search: Benchmarking Interleaved Multimodal Agentic Search
Questions and Answers-Copyright Column
RAG over Thinking Traces Can Improve Reasoning Tasks
LASAR: Latent Adaptive Semantic Aligned Reasoning for Generative Recommendation
LLM-Enhanced Topical Trend Detection at Snapchat
Storage Is Not Memory: A Retrieval-Centered Architecture for Agent Recall
An Embarrassingly Simple Graph Heuristic Reveals Shortcut-Solvable Benchmarks for Sequential Recommendation
RecRM-Bench: Benchmarking Multidimensional Reward Modeling for Agentic Recommender Systems
AlbumFill: Album-Guided Reasoning and Retrieval for Personalized Image Completion