Channels - AssayBench: An Assay-Level Virtual Cell Benchmark for LLMs and Agents :: FRELIP Discovery

Similar Items: AssayBench: An Assay-Level Virtual Cell Benchmark for LLMs and Agents

V4FinBench: Benchmarking Tabular Foundation Models, LLMs, and Standard Methods on Corporate Bankruptcy Prediction
TopBench: A Benchmark for Implicit Prediction and Reasoning over Tabular Question Answering
On the Hardness of Junking LLMs
Exploration Hacking: Can LLMs Learn to Resist RL Training?
Raising the Ceiling: Better Empirical Fixation Densities for Saliency Benchmarking
A Domain Incremental Continual Learning Benchmark for ICU Time Series Model Transportability
When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels
Interpreting Reinforcement Learning Agents with Susceptibilities
Position: agentic AI orchestration should be Bayes-consistent
Superintelligent Retrieval Agent: The Next Frontier of Information Retrieval
Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning
CalBench: Evaluating Coordination-Privacy Trade-offs in Multi-Agent LLMs
NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search
Penalty-Based First-Order Methods for Bilevel Optimization with Minimax and Constrained Lower-Level Problems
Observable Performance Does Not Fully Reflect System Organization: A Multi-Level Analysis of Gait Dynamics Under Occlusal Constraint
SWE-WebDevBench: Evaluating Coding Agent Application Platforms as Virtual Software Agencies
Early Detection of Water Stress by Plant Electrophysiology: Machine Learning for Irrigation Management
Exponential families from a single KL identity
A Unified Framework of Hyperbolic Graph Representation Learning Methods
Assessing the Role of Intersection Proximity in Pedestrian Crashes: Insights from Data Mining Approach
PROMISE-AD: Progression-aware Multi-horizon Survival Estimation for Alzheimer's Disease Progression and Dynamic Tracking
Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression
Neural Aided Kalman Filtering for UAV State Estimation in Degraded Sensing Environments
FiLMMeD: Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing