Multi-hop cross-prompt injection benchmark for multi-agent AI systems. 250 attack cases, 7 taxonomy categories, 4 defenses evaluated. Watch: https://www.youtube.com/watch?v=fGOlMij4HPQ
Updated Jun 11, 2026 - Python
Hybrid AI Security Gateway: rule-based and LLM-assisted (Nemotron + Llama) prompt injection detection, plus MCP security (rules, policy, and AI-assisted checks).
Defense-in-depth input safety for LLMs — perplexity gate + FAISS + ModernBERT + LoRA + Llama Guard 3, behind a deterministic policy gate. 99.88% accuracy, 99.47% jailbreak recall, calibrated confidence, ONNX-optimized. Live demo on HF Spaces.
PromptShield is an automated red-blue adversarial testing framework designed to evaluate LLM application security against prompt injection and data leakage, generating actionable mitigation reports mapped to the MITRE ATT&CK framework.
Three-layer ensemble defence (anomaly + classifier + semantic) against prompt injection in RAG pipelines. F1=0.837, FPR=0.000.