Similar Items: ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models
- CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios
- GLiGuard: Schema-Conditioned Classification for LLM Safeguard
- MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents
- SST-Guard: Detecting and Characterizing Server-Side Google Analytics in the Wild
- KingsGuard: Enclave Data Protection Under Real-World TEE Vulnerabilities
- ClawGuard: Out-of-Band Detection of LLM Agent Workflow Hijacking via EM Side Channel