Test Commander

An AI-assisted testing system and quality intelligence center. Test Commander helps teams move from requirements and exploration to BDD, automation, evidence, and reporting — with a continuous learning loop and a team-facing console.

It is built as a Claude Code plugin plus a small Python and TypeScript runtime. It is designed to be installed once and grown phase by phase.

Status: Phase 13 complete (2026-06-02). The project is complete (Phases 0–13).

What each skill ships:

tc-core: /tc:init, /tc:status, /tc:journal, /tc:next.
tc-requirements: /tc:review-requirements, /tc:review-user-stories, /tc:review-acceptance-criteria, /tc:requirements-coverage, /tc:requirements-to-tests.
tc-knowledge: /tc:learn-from-docs, /tc:learn-from-specs, /tc:learn-from-code, /tc:learn-from-api, /tc:learn-from-tests.
tc-explore: /tc:create-charter, /tc:explore (with the internal exploration-review sub-mode), /tc:session-summary, /tc:test-ideas.
tc-bdd: /tc:generate-bdd (with the internal review sub-mode) and /tc:review-bdd.
tc-traceability: /tc:traceability-map.
tc-build-framework: /tc:build-framework.
tc-automation-plan: /tc:automation-plan.
tc-automate: /tc:automate (with the internal automation-review sub-mode) and /tc:review-automation.
tc-test-data: /tc:generate-test-data.
tc-run: /tc:run (with the internal evidence-index sub-mode) and /tc:analyze-results.
tc-evidence: the internal evidence indexer.
tc-quality-report: /tc:report and /tc:quality-gate.
tc-learning: /tc:learn, /tc:learn-from-failures, /tc:learn-from-exploration, /tc:learn-from-feedback, /tc:review-lessons, /tc:promote-lessons.
tc-visualize: /tc:visualize, the eight /tc:diagram-* commands, /tc:generate-infographic, and /tc:render-visuals.
tc-web: /tc:web-init, /tc:web-start, /tc:web-sync, /tc:web-index-artifacts, and /tc:web-export. Together these make up the read-only web console.
tc-governance: no /tc:* commands. It ships the controlled-execution pipeline (intent → plan → policy → approval → bounded execution → validation → audit) behind the console's /api/execute.
tc-mcp: no /tc:* commands. It exposes the workspace through the expanded Runtime API (apps/api: the /api/runtime/ namespace) and a schema-first MCP server (apps/mcp: tc_status, tc_plan, tc_run_command). Both are alternative front-ends to the same governance pipeline, with the seven permission levels enforced server-side.
tc-sandbox: /tc:sandbox-init, /tc:sandbox-launch, /tc:sandbox-status, /tc:sandbox-sync, /tc:sandbox-stop, and /tc:sandbox-export. These provide on-demand, team-accessible Test Commander environments launched from GitHub Actions, governed by the same Phase 10.5 pipeline and safe by default (allow-listed hosts, blocked private ranges).
tc-continuous-quality: /tc:watch-changes, /tc:impact-analysis, /tc:coverage-gap-analysis, /tc:propose-tests, /tc:create-test-pr, and /tc:continuous-quality-check. Continuous quality mode watches changes, maps impact, finds coverage gaps, and opens clearly labeled test PRs when the configured autonomy level (0–4) allows, all through the same Phase 10.5 pipeline.

Guides:

planning/plan.md: the full roadmap.
docs/user-guide/workflow.md: Phase 1 walkthrough.
docs/user-guide/reviewing-requirements.md: Phase 2 walkthrough.
docs/user-guide/building-project-knowledge.md: Phase 3 walkthrough.
docs/user-guide/exploring-an-app.md: Phase 4 walkthrough.
docs/user-guide/generating-bdd.md: Phase 5 walkthrough.
docs/user-guide/automation.md: Phase 6 walkthrough.
docs/user-guide/running-tests.md: Phase 7 walkthrough.
docs/user-guide/learning-loop.md: Phase 8 walkthrough.
docs/user-guide/visuals.md: Phase 9 walkthrough.
docs/user-guide/web-console.md: Phase 10 walkthrough.
docs/user-guide/governance.md: Phase 10.5 governance guide.
docs/user-guide/integrating.md: Phase 11 integration guide (Runtime API + MCP server).
docs/user-guide/sandbox.md: Phase 12 sandbox guide.
docs/user-guide/continuous-quality.md: Phase 13 continuous-quality guide.
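
To make the sandbox's "allow-listed hosts, blocked private ranges" rule concrete, here is a minimal Python sketch of that kind of check. It is not the tc-sandbox implementation: the function name is_target_allowed and its parameters are hypothetical, and the sketch only inspects IP literals (it does not resolve hostnames through DNS).

```python
import ipaddress


def is_target_allowed(host: str, allowed_hosts: set[str]) -> bool:
    """Hypothetical check: allow-listed hosts only, never private IP literals."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        address = None  # not an IP literal, so treat it as a hostname

    # Block private, loopback, and link-local ranges even if they were allow-listed.
    if address is not None and (
        address.is_private or address.is_loopback or address.is_link_local
    ):
        return False

    # Everything else must appear on the allow-list (case-insensitive match).
    return host.lower() in {entry.lower() for entry in allowed_hosts}
```

A real deployment would likely also need to check the addresses a hostname resolves to, since an allow-listed name could still point at a private range.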

What Test Commander Is

A disciplined workflow that turns product context into testable artifacts: requirements reviews, exploration notes, test ideas, BDD specs, Playwright automation, evidence, and a live quality report.
A Claude Code plugin (test-commander) with skills that orchestrate each step.
A workspace convention (.test-commander/) that keeps every quality artifact in one place, versioned in git, with full traceability.
A continuous learning loop that captures lessons from failures, exploration, and human feedback — and applies them only after human review.

Universal by Design

Test Commander is product-domain-agnostic. It ships with universal English and software-engineering defaults only — no e-commerce, healthcare, finance, research, or other product-domain vocabulary in the shipped rubric, tags, methodology, fixtures, or examples. The tool does not assume what product your team is testing.

Consuming projects extend Test Commander for their own domain through four explicit hooks:

<workspace>/config.yaml extensions to rubric keyword sets (PCI, HIPAA, your role taxonomy, etc.).
Your project's own requirement and exploration documents under .test-commander/documents/uploaded/.
Project knowledge ingested in Phase 3 (/tc:learn-from-docs, /tc:learn-from-code, ...).
Project-defined values inside shipped tag namespaces (@area:<feature>, @risk:<class>, @persona:<role>).

See docs/user-guide/customizing-for-your-project.md for worked examples and the full extension model, and Decision D19 for the rationale.

What Test Commander Is Not

It is not a replacement for skilled testers.
It is not a fully autonomous QA system.
It is not a promise that AI can understand every product perfectly.
It is not a test automation silver bullet.
It is not a wrapper over third-party skill plugins — every skill is owned in-repo.

Who Benefits

Role	Value
Testers	Charter-based exploration, captured observations and risks, generated test ideas, BDD that's actually readable.
Automation engineers	Playwright framework scaffolded on demand, page objects and fixtures generated from BDD, test data kept out of code.
Developers	Requirements reviews catch ambiguity before code; impact analysis and proposed tests on PRs (later phases).
Product owners	Live quality report with release-readiness; coverage gaps and open questions visible.
Engineering leaders	Traceability from requirement to test result; risk register; learning loop that improves the test strategy over time.

How Test Commander Evolves

Test Commander is built in phases, numbered 0 through 13. Each phase produces a working, demonstrable increment. The capstone target is phases 0–8 and 10. Phases 9, 11, 12, and 13 follow.

See planning/plan.md for the full phased plan, including Decisions, Open Questions, and per-phase Definition of Done.

The roadmap summary:

Phase	Name
0	Repository foundation
1	Workspace and artifact model
2	Requirements and user story intelligence
3	Project knowledge ingestion
4	Exploratory testing
5	BDD generation and traceability
6	Playwright framework (lazy) and automation
7	Execution, evidence, and quality report
8	Continuous learning
9	Visual documentation and infographics
10	Web console MVP
11	Runtime API and MCP server
12	Sandboxed testing environment
13	Continuous quality agent

Getting Started

The full install guide lives in docs/install.md (filled out in Step 0.2). What follows is the short version.

Prerequisites that bootstrap.sh checks for you:

make
Python 3.12
PDM
Docker (any compatible runtime)
Git

Two-stage install:

./bootstrap.sh    # verifies prereqs; auto-installs the safe ones
make install      # provisions the project and registers the Claude Code plugin

Platforms supported: macOS, Linux, Windows via WSL2 or Git Bash. PowerShell is explicitly not supported.

Once installed, open Claude Code and confirm test-commander:tc-core appears in available skills.

Core Workflow

The eventual end-to-end flow (commands roll out across phases):

/tc:init
/tc:review-requirements
/tc:learn-from-code
/tc:create-charter --area <feature>
/tc:explore --target <url> --charter <feature>
/tc:test-ideas --area <feature>
/tc:generate-bdd --area <feature>
/tc:automation-plan --area <feature>
/tc:generate-test-data --area <feature>
/tc:automate --feature <feature>
/tc:run --suite smoke
/tc:report
/tc:learn
/tc:next

/tc:next always tells you what to do next based on the state of .test-commander/.

Repository Layout

test-commander/
  .claude-plugin/marketplace.json     # local marketplace
  plugins/test-commander/             # the Claude Code plugin
    .claude-plugin/plugin.json
    skills/
      tc-core/SKILL.md                # phase 0
      tc-requirements/SKILL.md        # phase 2
      tc-bdd/SKILL.md                 # phase 5
      ...
  docs/                               # vision, architecture, methodology, user guide
  planning/plan.md                    # the phased plan
  scripts/                            # verify_skills.py and friends
  bootstrap.sh                        # prereq checker
  Makefile                            # install / lint / test / build / run / verify
  pyproject.toml                      # PDM, Python 3.12+

The per-project quality workspace lives at .test-commander/ in consuming projects, not here.

Documentation

Most docs/ files are stubs in Phase 0 and get filled in by their owning phase.

For agents (Claude / automated operators)

AGENTS.md is the entry point an agent reads at the start of every session. It names the source of truth (planning/plan.md), lists the 19 settled decisions (D1–D19), enumerates the seven Per-Phase Conventions, documents the TDD micro-cycle and verify chain, describes the commit and phase sign-off pattern, and lists what NOT to do. Read it before touching code.

Contributing

See CONTRIBUTING.md. The short version: pick a phase step, build it small, test it, document it, raise a PR referencing the plan step.

License

MIT.
