Similar Items: CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios
- LoopTrap: Termination Poisoning Attacks on LLM Agents
- When Alignment Isn't Enough: Response-Path Attacks on LLM Agents
- MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents
- ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models
- Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection
- Exposing LLM Safety Gaps Through Mathematical Encoding: New Attacks and Systematic Analysis