Channels - When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels :: FRELIP Discovery

Similar Items: When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels

Generating Statistical Charts with Validation-Driven LLM Workflows
Raising the Ceiling: Better Empirical Fixation Densities for Saliency Benchmarking
Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring
Steer Like the LLM: Activation Steering that Mimics Prompting
TopBench: A Benchmark for Implicit Prediction and Reasoning over Tabular Question Answering
A Domain Incremental Continual Learning Benchmark for ICU Time Series Model Transportability
Label-Efficient School Detection from Aerial Imagery via Weakly Supervised Pretraining and Fine-Tuning
Ecologically-Constrained Task Arithmetic for Multi-Taxa Bioacoustic Classifiers Without Shared Data
Why Global LLM Leaderboards Are Misleading: Small Portfolios for Heterogeneous Supervised ML
U-Define: Designing User Workflows for Hard and Soft Constraints in LLM-Based Planning
Evaluating the Architectural Reasoning Capabilities of LLM Provers via the Obfuscated Natural Number Game
Continual Knowledge Updating in LLM Systems: Learning Through Multi-Timescale Memory Dynamics
Low-Cost Black-Box Detection of LLM Hallucinations via Dynamical System Prediction
How Many Iterations to Jailbreak? Dynamic Budget Allocation for Multi-Turn LLM Evaluation
Spectral Model eXplainer: a chemically-grounded explainability framework for spectral-based machine learning models
When and Why SignSGD Outperforms SGD: A Theoretical Study Based on $\ell_1$-norm Lower Bounds
Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning
Physiologically Grounded Driver Behavior Classification: SHAP-Driven Elite Feature Selection and Hybrid Gradient Boosting for Multimodal Physiological Signals
Safety and accuracy follow different scaling laws in clinical large language models
Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback
Joint Treatment Effect Estimation from Incomplete Healthcare Data: Temporal Causal Normalizing Flows with LLM-driven Evolutionary MNAR Imputation
A decoupled diffusion planner that adapts to changing cost limits by using cost-conditioned generation for safety and reward gradients for performance
Early Detection of Water Stress by Plant Electrophysiology: Machine Learning for Irrigation Management
Exponential families from a single KL identity