Visual Snapshots

bin/snap.py lets agents (and humans) see what every theme actually looks like without uploading screenshots over chat. It boots each theme's WordPress Playground locally via @wp-playground/cli and drives Playwright Chromium across every combination of bin/snap_config.py::ROUTES × VIEWPORTS. For each cell it captures a screenshot plus diagnostic artifacts: rendered HTML, console messages, page errors, network failures, DOM-heuristic findings, axe-core a11y violations, computed dimensions for INSPECT_SELECTORS, and optional interactive states from INTERACTIONS. A tiered gate then classifies the result as pass | warn | fail. bin/check.py --visual is the recommended pre-commit gate.

Quick reference

# First time? Verify all deps are ready before booting Playground.
python3 bin/snap.py doctor

# Capture one theme at every route/viewport
python3 bin/snap.py shoot chonk

# Just the desktop checkout (fastest inner loop)
python3 bin/snap.py shoot chonk --routes checkout-filled --viewports desktop

# Quick subset (snap_config.QUICK_*) -- fastest "did anything explode" sweep
python3 bin/snap.py shoot chonk --quick

# Smart sweep -- only re-shoot themes whose files moved in git
python3 bin/snap.py shoot --changed
# (framework changes under bin/* fall back to all themes)

# Capture every theme in parallel (~400MB RAM/worker, ~2x faster)
python3 bin/snap.py shoot --all --concurrency 2

# Boot a single theme and leave it running for interactive poking via the
# cursor-ide-browser MCP (or any browser) at http://localhost:9400/
python3 bin/snap.py serve chonk          # admin auto-login is enabled
                                          # for /wp-admin/ access

# Aggregate findings into reviewable markdown + the tiered gate verdict
python3 bin/snap.py report --open
# -> tmp/snaps/<theme>/review.md   per-theme triage with **GATE: …** badge
# -> tmp/snaps/<theme>/review.json per-theme JSON (gate + counts + routes)
# -> tmp/snaps/review.md           cross-theme rollup + parity drift
# -> tmp/snaps/review.json         overall gate + per-theme gates
# Final line: STATUS: PASS | WARN | FAIL

# Visual regression: compare current snaps to committed baselines
python3 bin/snap.py diff --all
python3 bin/snap.py diff chonk --threshold 0.5

# Promote latest snaps -> committed baselines (after reviewing diffs)
python3 bin/snap.py baseline --all
python3 bin/snap.py baseline chonk --route home --viewport desktop

# The single pre-commit gate: shoot + diff + report --strict, scoped to
# the themes that actually changed by default (--visual-scope=changed).
python3 bin/check.py --visual
# Full sweep before a release:
python3 bin/check.py --visual --visual-scope=all
# Smoke test for one theme + the QUICK_* subset:
python3 bin/check.py chonk --visual --visual-scope=quick
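
The --changed selection noted above (re-shoot only themes whose files moved, fall back to all themes when bin/* changes) can be sketched roughly as follows. This is an illustration, not the code in bin/snap.py: the function name themes_to_shoot is hypothetical, and it assumes each theme lives in a top-level directory named after its slug.

```python
from pathlib import PurePosixPath


def themes_to_shoot(changed_files, all_themes):
    """Pick the themes a --changed style sweep would re-shoot.

    Assumption (not confirmed by the docs): every theme is a top-level
    directory whose name is the theme slug.
    """
    touched = set()
    for name in changed_files:
        parts = PurePosixPath(name).parts
        if not parts:
            continue
        if parts[0] == "bin":
            # Framework change: any theme may render differently.
            return sorted(all_themes)
        if parts[0] in all_themes:
            touched.add(parts[0])
    return sorted(touched)


themes = {"chonk", "obel"}
print(themes_to_shoot(["chonk/theme.json", "README.md"], themes))
print(themes_to_shoot(["bin/snap.py"], themes))
```

In practice the list of changed files would come from git; how bin/snap.py gathers it is not documented here.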

Per-cell artifacts

tmp/snaps/<theme>/<viewport>/<route>.png                 # screenshot (Read directly)
tmp/snaps/<theme>/<viewport>/<route>.html                # final rendered DOM
tmp/snaps/<theme>/<viewport>/<route>.findings.json       # heuristics + axe + console + 4xx/5xx + INSPECT
tmp/snaps/<theme>/<viewport>/<route>.a11y.json           # raw axe-core violations report
tmp/snaps/<theme>/<viewport>/<route>.<flow>.png          # interactive cells (e.g. cart-filled.line-remove.png)
tmp/snaps/<theme>/review.md                              # per-theme review with GATE badge
tmp/snaps/<theme>/review.json                            # per-theme machine-readable summary
tmp/snaps/review.md                                      # cross-theme rollup + parity drift
tmp/snaps/review.json                                    # overall gate + per-theme gates
tmp/diffs/<theme>/<viewport>/<route>.png                 # per-pixel diff overlay
tests/visual-baseline/<theme>/<viewport>/<route>.png     # committed reference

The tmp/ tree is .gitignored; the tests/visual-baseline/ PNGs are committed. bin/vendor/axe.min.js is also gitignored — the framework downloads it from a version-pinned CDN URL on first run.
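
For scripts that need to locate a cell's files, the layout above maps to paths as in this minimal sketch. The helper name cell_artifacts and its dictionary keys are hypothetical; only the path patterns come from the listing.

```python
from pathlib import Path


def cell_artifacts(theme, viewport, route, root="."):
    """Map one (theme, viewport, route) cell to its artifact paths."""
    root = Path(root)
    snaps = root / "tmp" / "snaps" / theme / viewport
    return {
        "screenshot": snaps / f"{route}.png",
        "html": snaps / f"{route}.html",
        "findings": snaps / f"{route}.findings.json",
        "a11y": snaps / f"{route}.a11y.json",
        "diff": root / "tmp" / "diffs" / theme / viewport / f"{route}.png",
        "baseline": root / "tests" / "visual-baseline" / theme / viewport / f"{route}.png",
    }


paths = cell_artifacts("chonk", "desktop", "checkout-filled")
for label, path in paths.items():
    print(f"{label:10} {path.as_posix()}")
```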

DOM heuristics

The snap framework runs a custom set of DOM-heuristic checks on every captured page, in addition to axe-core. Each finding has a severity (error / warn / info), a stable kind, a human-readable message, and any extra context (selectors, measurements, source URLs):

horizontal-overflow — page is wider than the viewport.
wc-error / wc-info / wc-message / wc-validation-error — visible WC notices, captured verbatim.
php-debug-output — PHP notice/warning/fatal text leaked into the body.
raw-i18n-token — a literal __() token rendered (means a string was never translated).
broken-image — <img> failed to load.
img-missing-alt — visible image without alt.
img-oversized — natively > 4000px wide.
responsive-image-overserved — served > 3× the rendered slot.
responsive-image-blurry — served < 0.75× the rendered slot.
text-overflow-truncated — ellipsis is actively hiding content.
empty-landmark — <main>/<nav>/<aside> rendered with no visible text or media.
narrow-sidebar — a sidebar selector matched but rendered < 200px on a desktop viewport.
view-transition-name-collision — two or more elements share the same view-transition-name. Chrome aborts every transition with InvalidStateError on the next navigation when this happens; the heuristic catches it from the static DOM by walking computed style.
inspect-selector-missing — a selector listed in INSPECT_SELECTORS matched zero elements (likely time to update the config).
Network: any HTTP response ≥ 400 is captured into network_failures[], split into 4xx (warn) and 5xx (fail).
Console: console.error is captured into console[] (warn-tier) and pageerror into page_errors[] (fail-tier), both filtered against KNOWN_NOISE_SUBSTRINGS.
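
As an illustration of the two responsive-image thresholds listed above, here is a minimal sketch. It assumes the comparison is made on pixel widths, which the docs do not state; the real heuristic may also account for device pixel ratio or compare areas. The function name is hypothetical.

```python
def classify_served_image(served_width, rendered_width):
    """Return the finding kind for an image, or None when it is fine.

    Assumption: the 3x and 0.75x thresholds apply to the ratio of the
    served image width to the rendered slot width.
    """
    if rendered_width <= 0:
        return None  # not rendered, nothing to compare against
    ratio = served_width / rendered_width
    if ratio > 3:
        return "responsive-image-overserved"
    if ratio < 0.75:
        return "responsive-image-blurry"
    return None


print(classify_served_image(2000, 500))  # ratio 4.0
print(classify_served_image(300, 500))   # ratio 0.6
print(classify_served_image(800, 500))   # ratio 1.6
```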

The tiered gate

Every cell's findings are classified into one of three buckets:

fail (build-blocking, exit 1): heuristic error, uncaught JS (after noise filter), HTTP 5xx, axe critical/serious.
warn (loud banner, exit 0): heuristic warn/info, HTTP 4xx, console errors, axe moderate/minor, parity drift, perf-budget exceedances, interaction-failed.
pass: nothing flagged.

The verdict appears as a STATUS: PASS | WARN | FAIL line at the end of every report and check run. It also lives at the top of each per-theme review.md as a **GATE: …** badge so triage starts with the verdict, not the table.
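
The bucket rules above can be sketched as a small function. The input shape (a dict with heuristics, page_errors, console_errors, http_statuses, and axe_impacts keys) is hypothetical and simpler than the real *.findings.json; parity drift, budget exceedances, and interaction-failed would feed the warn tier in the same way.

```python
FAIL_AXE_IMPACTS = {"critical", "serious"}


def gate(cell):
    """Classify one cell as 'pass', 'warn', or 'fail'.

    `cell` is a hypothetical summary: lists of heuristic severities,
    HTTP status codes and axe impact levels, plus counts of uncaught
    page errors and console errors (both already noise-filtered).
    """
    heuristics = cell.get("heuristics", [])
    statuses = cell.get("http_statuses", [])
    axe = cell.get("axe_impacts", [])

    if (
        "error" in heuristics
        or cell.get("page_errors", 0) > 0
        or any(status >= 500 for status in statuses)
        or any(impact in FAIL_AXE_IMPACTS for impact in axe)
    ):
        return "fail"
    if (
        heuristics
        or axe
        or cell.get("console_errors", 0) > 0
        or any(status >= 400 for status in statuses)
    ):
        return "warn"
    return "pass"


def exit_code(verdict):
    """Only 'fail' blocks the build; 'warn' and 'pass' exit 0."""
    return 1 if verdict == "fail" else 0


print(gate({"http_statuses": [404]}))
print(gate({"http_statuses": [404, 503]}))
print(gate({}))
```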

Recommended loops

When you make ANY change that could affect rendered output (template, theme.json, CSS, pattern, blueprint), the loop is:

Make the change.
Run python3 bin/snap.py shoot <theme> --routes <route> --viewports <viewport> for the affected cell(s).
Read the PNG to verify.
Run python3 bin/snap.py report and read the STATUS: line; drill into the per-theme review.md if anything is non-pass.
If wider impact is possible: python3 bin/snap.py check --changed (smart, fast) or python3 bin/snap.py check (full sweep before a release).
If diffs are intentional: python3 bin/snap.py baseline --all and commit the updated baselines alongside the change.

Build-pipeline integration

Other build scripts grew matching --snap flags so the gate runs inline after a mutation:

python3 bin/clone.py <name> --snap — auto-baseline a freshly-cloned theme.
python3 bin/sync-playground.py --snap — re-shoot affected themes after blueprint sync.
python3 bin/append-wc-overrides.py --snap — re-shoot after appending WC override CSS.

WP-admin Themes-card screenshot

bin/build-theme-screenshots.py consumes the snap framework's home-route output to generate each theme's screenshot.png — the 1200×900 image WordPress shows on the Themes admin card. It looks for the home shot in this order: committed baseline (tests/visual-baseline/<theme>/desktop/home.png), then the freshest unbaselined shot (tmp/snaps/<theme>/desktop/home.png). If neither exists it tells you which snap.py shoot command to run. So the canonical inner loop after editing tokens that change the home page is:

python3 bin/snap.py shoot mybrand --routes home --viewports desktop
python3 bin/build-theme-screenshots.py mybrand

bin/check.py's check_theme_screenshots_distinct fails when two themes ship identical screenshot.png bytes (the failure mode bin/clone.py produces by copying Obel's placeholder verbatim). Re-running build-theme-screenshots.py is always the fix.
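
The home-shot lookup order described in this section (committed baseline first, then the freshest unbaselined shot) can be sketched as below. The function name find_home_shot is hypothetical, and this is not the code in bin/build-theme-screenshots.py.

```python
import tempfile
from pathlib import Path


def find_home_shot(theme, root="."):
    """Return the first existing home shot, or None if neither exists."""
    root = Path(root)
    candidates = [
        # 1. Committed baseline wins when present.
        root / "tests" / "visual-baseline" / theme / "desktop" / "home.png",
        # 2. Otherwise fall back to the latest unbaselined shot.
        root / "tmp" / "snaps" / theme / "desktop" / "home.png",
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None


# Demo in a throwaway directory: only the unbaselined shot exists.
with tempfile.TemporaryDirectory() as repo:
    fresh = Path(repo) / "tmp" / "snaps" / "mybrand" / "desktop" / "home.png"
    fresh.parent.mkdir(parents=True)
    fresh.write_bytes(b"")
    print(find_home_shot("mybrand", repo) == fresh)
```

When the function returns None, the caller would print the snap.py shoot command to run, as the script is described as doing.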

Configuration

bin/snap_config.py is the single config file:

ROUTES — every (slug, URL path) the framework visits. Add a route here and it appears in every theme's review.
VIEWPORTS — Playwright viewport sizes (mobile / tablet / desktop / wide). Same idea.
INSPECT_SELECTORS — per-route map of CSS selectors whose computed width, height, display, and grid-template-columns get captured into *.findings.json and rendered into the per-theme review.md "Inspector measurements" tables. This is how the cart/checkout sidebar regression got diagnosed without re-shooting — add an entry here when you find yourself running ad-hoc Playwright probes to measure layout issues, so the next regression is visible immediately.
INTERACTIONS — per-route list of scripted flows (menu-open, qty-increment, swatch-pick, line-remove, field-focus). Each flow renders an extra <route>.<flow>.png cell so the post-interaction state is reviewable side-by-side with the static one.
KNOWN_NOISE_SUBSTRINGS — substring filter for pre-confirmed-harmless console / page errors. Add to it only after investigation confirms upstream noise — never to silence a real theme bug.
BUDGETS — soft thresholds for console_warning_count, page_weight_kb, image_count, request_count. Exceedances become findings at the configured severity. Set max: None to disable a budget.
QUICK_* — subsets used when shoot is invoked with --quick.
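
To make the BUDGETS semantics concrete (a soft threshold per metric, a configured severity, and max: None to disable), here is a minimal sketch. The dictionary shape, the budget-exceeded kind, and the function name are assumptions for illustration; the real structure in bin/snap_config.py may differ.

```python
# Hypothetical shape: metric name -> {"max": threshold or None, "severity": ...}
BUDGETS = {
    "console_warning_count": {"max": 5, "severity": "info"},
    "page_weight_kb": {"max": 2000, "severity": "warn"},
    "image_count": {"max": None, "severity": "warn"},  # disabled
    "request_count": {"max": 80, "severity": "warn"},
}


def budget_findings(metrics, budgets=BUDGETS):
    """Turn measured metrics into findings for every exceeded budget."""
    findings = []
    for name, budget in budgets.items():
        limit = budget.get("max")
        if limit is None:
            continue  # max: None disables this budget
        value = metrics.get(name, 0)
        if value > limit:
            findings.append({
                "severity": budget.get("severity", "warn"),
                "kind": "budget-exceeded",  # hypothetical kind name
                "message": f"{name} is {value}, budget is {limit}",
            })
    return findings


measured = {"page_weight_kb": 2600, "image_count": 300, "request_count": 40}
for finding in budget_findings(measured):
    print(finding["severity"], finding["message"])
```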

First-time setup

python3 -m pip install --user playwright Pillow
playwright install chromium      # ~90 MB Chromium download
python3 bin/snap.py doctor       # verifies everything is wired up

@wp-playground/cli is fetched on demand by npx --yes; no global install required. First boot takes ~2 minutes (WordPress download, plugin install, content seeding); subsequent boots are ~30 seconds when the playground cache is warm.

Fifty on GitHub · Live demos · GPL-2.0-or-later · Block-only WooCommerce themes, zero CSS files, zero JS, zero build step
