adverserial-attack

Here are 4 public repositories matching this topic...

eth-sri / insec

Reproduction Package for "Black-Box Adversarial Attacks on LLM-Based Code Completion" [ICML 2025]

security code-completion llm adverserial-attack

Updated Jun 16, 2025
Python

ziakhan1516 / Adversarial-Attack-on-CNN-Models

FGSM (Fast Gradient Sign Method) is an adversarial attack technique that adds small, calculated perturbations to input data to fool CNNs. Proposed by Ian Goodfellow and colleagues in 2014, it shifts each input feature by a small step in the direction of the sign of the loss gradient, producing adversarial examples that mislead the model's predictions.

cnn fgsm-attack gradient-based-attack adverserial-attack

Updated Aug 19, 2024
Jupyter Notebook
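
As a rough illustration of the FGSM idea described in the entry above, the sketch below applies the update x_adv = x + epsilon * sign(dL/dx) to a tiny logistic-regression model rather than a CNN, so that the input gradient can be written by hand. It is not code from the listed repository; the weights, input, label, and epsilon are hypothetical values chosen only for the example, and the helper names are made up here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)


def loss_gradient_wrt_input(weights, bias, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input.

    For logistic regression this has the closed form dL/dx = (p - y) * w.
    """
    p = predict_proba(weights, bias, x)
    return [(p - y) * w for w in weights]


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y, epsilon):
    """One FGSM step: move every feature by epsilon along the gradient sign."""
    grad = loss_gradient_wrt_input(weights, bias, x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input (assumptions for illustration only).
weights = [2.0, -1.0, 0.5]
bias = 0.0
x = [0.2, 0.1, 0.4]
y = 1  # true label

x_adv = fgsm(weights, bias, x, y, epsilon=0.3)

print("clean probability of class 1:      ", predict_proba(weights, bias, x))
print("adversarial probability of class 1:", predict_proba(weights, bias, x_adv))
```

With these toy values the clean input is classified as class 1 and the perturbed input falls below the 0.5 decision threshold, even though no feature moved by more than epsilon. For an image classifier the gradient would normally come from automatic differentiation, and the perturbed pixels would usually be clipped back to the valid range; both steps are omitted here.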

sohaib0075 / Auditing-Content-Moderation-AI

Auditing a content moderation model using DistilBERT on the Jigsaw dataset. Covers bias analysis, adversarial attacks (character evasion and label poisoning), mitigation techniques, and a guardrail pipeline to improve fairness, robustness, and real-world reliability.

cybersecurity adverserial-search distilbert adverserial-attack

Updated Apr 21, 2026
Jupyter Notebook

Taleef7 / peft-robustness-thesis

An empirical investigation into the robustness-efficiency tradeoff of PEFT (parameter-efficient fine-tuning) methods against jailbreak attacks.

jailbreak robustness peft llms peft-fine-tuning-llm adverserial-attack

Updated May 4, 2026
Python
