nlp-security

Here are 3 public repositories matching this topic...

mohd-ibadullah / prompt-injection-detection

Prompt injection detection and remediation for LLM email agents

python machine-learning pytorch ensemble-learning email-security roberta fastapi huggingface-transformers prompt-injection llm-security nlp-security

Updated Jun 11, 2026
Python

sumit9000 / Prompt-injections

Prompt injection is a type of attack where malicious users craft prompts that trick or manipulate a language model into:
- Ignoring system-level or developer instructions
- Producing harmful, biased, or manipulated content
- Bypassing safety mechanisms or revealing hidden data

open-source ai-safety trustworthy-ai llm-prompting nlp-security

Updated Jul 24, 2025

nand9lohot / llm-adversarial-attacks-textattack

A reproducible adversarial ML lab that demonstrates TextFooler, BERTAttack, and DeepWordBug attacks against transformer-based sentiment models, with Docker automation and adversarial security reporting.

adversarial-machine-learning ai-security distilbert textattack mitre-atlas llm-security nlp-security ai-security-toolkit ai-ml-redteam

Updated Mar 17, 2026
Python
