Skip to content

camgrimsec/ai-codegen-security-linter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🛡️ ai-codegen-security-linter

Semgrep rules that catch insecure patterns AI code generators (Copilot, Cursor, ChatGPT) commonly produce.

AI assistants write functional code fast — but they also generate hardcoded secrets, prompt injection vulnerabilities, insecure deserialization, and unprotected LLM endpoints at an alarming rate. This ruleset catches those patterns before they hit production.

Part of the GRIMSEC DevSecOps Suite.


🎯 What This Catches

Category Rules Examples
Hardcoded Secrets 6 API keys (AWS, OpenAI, GitHub), JWT secrets, Flask SECRET_KEY, DB connection strings with embedded creds
Prompt Injection 5 User input in f-string prompts, string concat in LLM calls, LangChain unsanitized input, system prompt leakage
Insecure Deserialization 6 pickle.load(), yaml.load() without SafeLoader, torch.load(), node-serialize, Java ObjectInputStream
LLM App Security 5 exec() on LLM output, LLM output in SQL, no rate limiting on LLM endpoints, OpenAI key in source

22 rules across Python, JavaScript, TypeScript, Java, Go, and Ruby.

⚡ Quick Start

Scan a project

# Install Semgrep
pip install semgrep

# Clone this ruleset
git clone https://github.com/camgrimsec/ai-codegen-security-linter.git

# Scan your project
semgrep scan --config ai-codegen-security-linter/rules/ /path/to/your/project

Use as a remote config in CI

# .github/workflows/security.yaml
- name: AI Codegen Security Lint
  run: semgrep scan --config "https://github.com/camgrimsec/ai-codegen-security-linter/rules/" --error

Pre-commit hook

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/semgrep/semgrep
    rev: v1.96.0
    hooks:
      - id: semgrep
        args: ['--config', 'https://github.com/camgrimsec/ai-codegen-security-linter/rules/', '--error']

📊 CVSS Severity Scorer

Built-in scoring engine maps all 21 rules to CVSS 3.1 base scores and generates prioritized risk reports grouped by Critical, High, and Medium.

Usage

# Step 1: Run Semgrep with JSON output
semgrep scan --config rules/ --json -o results.json /path/to/your/project

# Step 2: Generate prioritized risk report
python -m scorer results.json                          # Console (default)
python -m scorer results.json --format json             # JSON for CI/CD
python -m scorer results.json --format markdown          # Markdown for PRs
python -m scorer results.json --min-severity high        # Filter by severity
python -m scorer results.json -f json -o report.json     # Write to file

Sample Console Output

========================================================================
  AI CODEGEN SECURITY LINTER — PRIORITIZED RISK REPORT
========================================================================
  Findings  : 11
  Max CVSS  : 9.8
  Avg CVSS  : 9.3
  Files Hit : 4
  Rules Hit : 7

  🔴 CRITICAL: 7    🟠 HIGH: 4    🟡 MEDIUM: 0    ⚪ LOW: 0
========================================================================

  🔴 CRITICAL (7 findings)
  ────────────────────────────────────────────────────────────────────

  [9.8] Pickle Deserialization of Untrusted Data
    Rule : ai-codegen-pickle-load-untrusted
    CWE  : CWE-502
    CVSS : CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
    File : app/ml/model_loader.py:42
    Fix  : Use json, msgpack, or safetensors for ML models.

CVSS Score Map

Severity CVSS Range Rule Count Examples
🔴 Critical 9.0 - 9.8 9 pickle.load, yaml.load, torch.load, exec(llm_output), LangChain code exec, node-serialize, Java deser
🟠 High 7.0 - 8.6 10 Hardcoded API keys, JWT secrets, prompt injection (f-string/concat/LangChain), Flask SECRET_KEY, LLM→SQL
🟡 Medium 4.0 - 5.3 2 System prompt leak, no rate limit on LLM endpoint

Output Formats

  • Console — Color-coded terminal output with severity grouping
  • JSON — Structured output for CI/CD pipelines, dashboards, and downstream tools
  • Markdown — Drop into GitHub Issues, PRs, or wiki pages

📁 Rule Structure

rules/
├── hardcoded-secrets/
│   ├── api-key-in-source.yaml        # AWS, OpenAI, GitHub, Slack, GitLab keys
│   └── jwt-secret-hardcoded.yaml     # JWT signing + Flask SECRET_KEY
├── prompt-injection/
│   ├── user-input-in-prompt.yaml     # f-string, concat, template injection
│   └── langchain-unsafe-patterns.yaml # LangChain-specific unsafe patterns
├── insecure-deserialization/
│   ├── pickle-yaml-unsafe.yaml       # Python pickle, yaml, shelve, torch
│   └── js-deserialization.yaml       # node-serialize, eval-as-parse, Java deser
└── llm-app-security/
    └── insecure-llm-patterns.yaml    # exec(llm_output), SQL injection via LLM, rate limiting

scorer/ ├── init.py ├── main.py # python -m scorer entrypoint ├── cli.py # CLI argument parsing ├── cvss_map.py # CVSS 3.1 scores + vectors for all 21 rules └── engine.py # Scoring engine + console/JSON/markdown formatters


## 🧪 Testing

```bash
# Run rule tests against fixtures
semgrep scan --config rules/ --test tests/

Test files use Semgrep's ruleid: and ok: annotations to validate true/false positives.

🤖 Why AI-Generated Code Needs Dedicated Rules

Standard SAST rules catch generic issues. These rules target patterns specific to how AI assistants generate code:

  • Copilot inlines API keys as "placeholder" values that look real enough to work
  • Cursor generates complete Flask/FastAPI apps with hardcoded SECRET_KEY
  • ChatGPT produces LangChain examples with PythonREPLTool (arbitrary code exec)
  • All AI tools use pickle.load() in ML examples without mentioning RCE risk
  • All AI tools build LLM prompts with f-string interpolation (prompt injection)

These aren't theoretical — they're the most common patterns seen in code reviews of AI-assisted PRs.

📊 Rule Metadata

Every rule includes:

  • CWE mapping (e.g., CWE-798, CWE-502, CWE-77)
  • OWASP mapping where applicable
  • Confidence rating (HIGH/MEDIUM/LOW)
  • Severity (ERROR/WARNING)
  • AI-codegen-specific context in the message explaining why AI tools produce this pattern

🗺️ Roadmap

  • Go-specific rules (hardcoded creds in http.NewRequest, unsafe template exec)
  • Rust rules (unsafe blocks AI generators over-use)
  • Terraform/IaC rules (AI-generated overly permissive IAM policies)
  • VS Code extension for real-time linting
  • Semgrep App integration for dashboard visibility
  • SARIF output integration with GitHub Advanced Security

🤝 Contributing

See CONTRIBUTING.md for guidelines. Key areas where help is needed:

  1. New rules — especially for Go, Rust, and C#
  2. False positive tuning — test against real AI-generated codebases
  3. Documentation — remediation guides for each rule category

📜 License

MIT License. See LICENSE.

🔗 Related Projects

About

Semgrep rules catching insecure patterns AI code generators (Copilot, Cursor, ChatGPT) commonly produce — hardcoded secrets, prompt injection, insecure deserialization, LLM app security.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages