killua156/cforge-cve-tracker

cforge-cve-tracker

A CVE tracker for conda-forge and PyPI packages. It periodically snapshots both package inventories, cross-references them against public vulnerability databases (OSV.dev, GHSA), stores the results in SQLite, computes diffs between snapshots, and exposes everything through a small web UI and JSON API.

Why this exists

There is no free, public, comprehensive view of "which conda-forge packages have known CVEs right now." Anaconda offers a curated commercial product; community tools like pip-audit and jake audit installed environments but don't give a channel-wide snapshot. This project fills that gap for anyone doing CVE remediation work against conda-forge.

What it does

  • Pulls the full conda-forge package inventory (~32k packages, all platforms) from repodata.json.
  • Pulls the full PyPI inventory (~580k names) from the /simple/ index. Per-package versions are hydrated lazily, only for packages with at least one OSV-linked CVE, which keeps per-refresh PyPI traffic bounded.
  • Queries OSV.dev (PyPI + cross-ecosystem) and GitHub Security Advisories for each package in both inventories.
  • Maps upstream CVEs to conda-forge package names via PyPI-name heuristics and a hand-maintained alias file. The PyPI inventory is queried directly, so no name mapping is needed there.
  • Resolves whether the currently available versions (conda-forge or PyPI) are affected by each CVE.
  • Persists everything in SQLite, snapshotted by date.
  • Diffs consecutive snapshots so you can see new CVEs, newly-patched packages, and newly-affected packages day over day.
  • Serves a FastAPI app with HTML pages (search packages, see CVE detail, browse diffs) and a JSON API.
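
To make the snapshot-and-diff steps above concrete, here is a minimal sketch using only the standard library. The `affected` table, its columns, and the helper names are assumptions made for illustration; the project's real DDL lives in storage/schema.sql and may look different. The sketch also treats "newly patched" as "no affected rows left in the newer snapshot", which is a simplification.

```python
import sqlite3

# Hypothetical schema (not the project's real DDL): one row per
# (snapshot date, package, version, CVE) that was affected on that date.
SCHEMA = """
CREATE TABLE affected (
    snapshot_date TEXT NOT NULL,
    package       TEXT NOT NULL,
    version       TEXT NOT NULL,
    cve_id        TEXT NOT NULL,
    PRIMARY KEY (snapshot_date, package, version, cve_id)
);
"""


def _only_in(conn, column, present_date, absent_date):
    """Values of `column` seen on present_date but not on absent_date.

    `column` is only ever one of the fixed names passed by diff_snapshots,
    never user input, so interpolating it into the SQL is safe here.
    """
    query = f"""
        SELECT {column} FROM affected WHERE snapshot_date = ?
        EXCEPT
        SELECT {column} FROM affected WHERE snapshot_date = ?
        ORDER BY 1
    """
    return [row[0] for row in conn.execute(query, (present_date, absent_date))]


def diff_snapshots(conn, old_date, new_date):
    """Compare two dated snapshots with plain set differences."""
    return {
        "new_cves": _only_in(conn, "cve_id", new_date, old_date),
        "newly_affected": _only_in(conn, "package", new_date, old_date),
        "newly_patched": _only_in(conn, "package", old_date, new_date),
    }


# Made-up rows, purely to exercise the diff.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO affected VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", "foo", "1.0", "CVE-2024-0001"),
        ("2024-01-01", "bar", "2.0", "CVE-2024-0002"),
        ("2024-01-02", "foo", "1.0", "CVE-2024-0001"),
        ("2024-01-02", "baz", "0.3", "CVE-2024-0003"),
    ],
)
result = diff_snapshots(conn, "2024-01-01", "2024-01-02")
print(result)
```

Keeping the date in every row means a diff never needs a separate changelog table: any two snapshots can be compared after the fact with the same query.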

What it does NOT do (yet)

  • Scan local conda environments. Use pip-audit or jake for that.
  • Track non-public vulnerabilities or do its own vulnerability research.
  • Replace Anaconda's curated CVE product. Statuses here are best-effort from public sources.

Quickstart

git clone <repo-url> cforge-cve-tracker
cd cforge-cve-tracker
python -m venv .venv && source .venv/bin/activate   # or .venv\Scripts\activate on Windows
pip install -e ".[dev]"

# Initialize the database
cforge-cve init-db

# One command does it all: pulls repodata, queries OSV, enriches with GHSA
# (if GITHUB_TOKEN is set), and takes a snapshot. ~30-90 min first time.
cforge-cve refresh

# Or run the steps individually:
cforge-cve ingest-packages           # ~2-5 min: pull conda-forge inventory
cforge-cve map-names                 # resolve conda-forge -> PyPI/native names
cforge-cve ingest-cves               # query OSV for conda-forge CVEs
cforge-cve ingest-pypi               # ~5 min: pull PyPI /simple/ index + lazy versions
cforge-cve ingest-pypi-cves          # query OSV PyPI for the pypi inventory
GITHUB_TOKEN=... cforge-cve ingest-ghsa   # enrich severity/CVSS from GHSA
cforge-cve snapshot                  # persist the current affected set (both ecosystems)

# After two refreshes, see what changed:
cforge-cve diff                      # pretty table; --json for machine-readable

# Run the web UI + JSON API
cforge-cve serve                     # http://localhost:8000
# - HTML pages: /, /packages, /packages/{name}, /cves, /cves/{id}, /diffs/latest
# - JSON API:    /api/packages, /api/cves, /api/diffs/{a}..{b}, /api/docs (OpenAPI)
# - Atom feed:   /feed.xml (new CVEs from the most recent diff)

For scheduled runs, see scripts/cron-refresh.sh or the GitHub Actions workflow in .github/workflows/refresh.yml (daily at 06:00 UTC).

Repo layout

cforge-cve-tracker/
├── README.md                    # This file
├── docs/
│   ├── ARCHITECTURE.md          # System design, data flow, schema
│   ├── BUILD_PLAN.md            # Phased build plan for Claude Code
│   ├── DATA_SOURCES.md          # API docs, rate limits, gotchas
│   └── MAPPING.md               # How CVEs are mapped to conda-forge names
├── src/cforge_cve/
│   ├── __init__.py
│   ├── cli.py                   # Typer CLI entry point
│   ├── config.py                # Settings (DB path, API keys, channels)
│   ├── ingest/
│   │   ├── condaforge.py        # Pulls repodata.json
│   │   ├── osv.py               # OSV.dev batch client
│   │   ├── ghsa.py              # GitHub Security Advisories (GraphQL)
│   │   └── mapping.py           # conda-forge name <-> PyPI/upstream name
│   ├── storage/
│   │   ├── schema.sql           # SQLite DDL
│   │   ├── db.py                # Connection + migrations
│   │   ├── models.py            # Pydantic models
│   │   └── repository.py        # CRUD + diff queries
│   ├── api/
│   │   └── routes.py            # FastAPI JSON endpoints
│   └── web/
│       ├── app.py               # FastAPI app + Jinja2 setup
│       ├── templates/           # HTML templates
│       └── static/              # CSS/JS
├── data/
│   ├── aliases.yml              # Manual conda-forge <-> upstream name overrides
│   └── cforge.db                # SQLite DB (gitignored)
├── tests/
└── scripts/
    └── cron-refresh.sh

Tech stack

  • Python 3.11+
  • httpx for async HTTP (OSV batch queries benefit from concurrency)
  • SQLite (stdlib sqlite3): file-based, which suits dated snapshots
  • FastAPI + Jinja2 for the web UI
  • Typer for CLI
  • Pydantic v2 for data models
  • packaging (PyPA library) for version-range matching
  • pytest + respx for testing
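
As an illustration of the version-range matching that `packaging` is used for, the sketch below evaluates OSV-style range events (`introduced`, `fixed`, `last_affected`) against a single version. The function name `is_affected` is hypothetical, the events are assumed to be sorted in ascending order already, and treating unparseable versions as not affected is a choice made for this sketch, not necessarily what the tracker does.

```python
from packaging.version import InvalidVersion, Version


def is_affected(version: str, events: list[dict[str, str]]) -> bool:
    """Evaluate OSV-style range events against one version string.

    Events are walked in order: "introduced" opens an affected interval,
    "fixed" closes it (exclusive), "last_affected" closes it (inclusive).
    """
    try:
        v = Version(version)
    except InvalidVersion:
        # Not every conda-forge version string is PEP 440; this sketch
        # simply reports such versions as not affected.
        return False

    affected = False
    for event in events:
        if "introduced" in event:
            start = event["introduced"]
            # "0" means "every version from the beginning".
            if start == "0" or v >= Version(start):
                affected = True
        elif "fixed" in event:
            if v >= Version(event["fixed"]):
                affected = False
        elif "last_affected" in event:
            if v > Version(event["last_affected"]):
                affected = False
    return affected


# Two affected intervals: [1.0, 1.5) and [2.0, 2.3].
events = [
    {"introduced": "1.0"},
    {"fixed": "1.5"},
    {"introduced": "2.0"},
    {"last_affected": "2.3"},
]
for candidate in ["0.9", "1.4", "1.7", "2.3", "2.4"]:
    print(candidate, is_affected(candidate, events))
```

A production version would likely surface unparseable versions for manual review rather than silently returning False.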

Status

v1.0, plus post-v1.0 reliability work and PyPI ingest. Build-plan phases 0–11 are complete; native-library coverage was added on top of the original conda-forge scope, and the PyPI inventory is now tracked directly via the /simple/ index.

Latest production snapshot of the conda-forge half:

  • 32,762 conda-forge packages indexed across 7 subdirs (noarch, linux-64, osx-64, osx-arm64, win-64, linux-aarch64, linux-ppc64le)
  • 399,387 historical (name, version) rows
  • 6,799 unique CVEs (PyPI ecosystem via OSV batch + native libs via OSV /v1/query against the Debian ecosystem)
  • 138,253 affected (package, version, CVE) tuples
  • Daily cron at 06:00 UTC; full diff history queryable via /diffs/latest
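
The two OSV access paths mentioned above (batch queries for PyPI names, single `/v1/query` calls for native libraries in the Debian ecosystem) can be sketched as request payloads. This is an illustration rather than the project's client: the helper names are hypothetical, `urllib` stands in for httpx to keep the block self-contained, and the batch size of 1000 is an assumption that should be checked against the current OSV API documentation.

```python
import json
import urllib.request
from itertools import islice
from typing import Iterable, Iterator

OSV_BASE = "https://api.osv.dev/v1"


def package_query(name: str, ecosystem: str) -> dict:
    """Body of a single OSV query: all known vulnerabilities for one package."""
    return {"package": {"name": name, "ecosystem": ecosystem}}


def batch_payloads(
    names: Iterable[str], ecosystem: str = "PyPI", batch_size: int = 1000
) -> Iterator[dict]:
    """Yield /v1/querybatch bodies, each holding at most batch_size queries."""
    it = iter(names)
    while chunk := list(islice(it, batch_size)):
        yield {"queries": [package_query(n, ecosystem) for n in chunk]}


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON body to the OSV API and decode the JSON response."""
    request = urllib.request.Request(
        f"{OSV_BASE}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Building payloads needs no network access.
payloads = list(batch_payloads(["requests", "jinja2", "numpy"], batch_size=2))
native = package_query("zlib", "Debian")

# Sending them would look like this (left commented out on purpose):
#   for body in payloads:
#       post_json("querybatch", body)
#   post_json("query", native)
```

Batching keeps the number of HTTP round trips small for the large PyPI-mapped inventory, while the much shorter list of native libraries can afford one request per package.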

PyPI half (first run pending in production):

  • Full /simple/ index (~580k names)
  • Per-package version data hydrated lazily for the ~5-15k packages with at least one OSV-linked CVE

See /about on the running site for explicit limitations and roadmap, and docs/BUILD_PLAN.md for the full build/operational history.

License

MIT. Vulnerability data from OSV.dev is CC-BY-4.0; GHSA data is under GitHub's terms. Attribute appropriately if you redistribute.
