Cleanup 418 (v5): revert broken Copilot autofixes, restore green CI#1246
Cleanup 418 (v5): revert broken Copilot autofixes, restore green CI#1246jpoley wants to merge 11 commits into
Conversation
…ixes Add `flowspec doctor` health-check command and `flowspec gpg` agent signing module, fully addressing all Copilot findings from PRs #1236, #1237, #1238, and #1239 plus a thorough independent security review. Key design decisions in signing module: - Fingerprint normalization centralized in get_key_fingerprint() (strip, uppercase, 40-char hex validation, logging on failure) - Key generation uses --status-fd for reliable fingerprint extraction - Secret key verification uses exact fingerprint match (case-insensitive) - All test mocks use realistic 40-char hex fingerprints via TEST_FP constant Doctor module improvements: - httpx imported at module level (coding standard compliance) - Strict flow.<phase>.agent.md validation (rejects flow.agent.md) - UTF-8 encoding on all file I/O - Version check distinguishes "confirmed current" from "could not check" Documentation quality: - All GPG commands use fingerprint (not email) to avoid ambiguity - Broken related-doc links replaced with anchor links - Security note about ~/.gnupg file permissions added - Manual deletion warns about keyring cleanup Closes #1221 #1222 #1223 #1224 #1225 #1226 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: jason poley <jason.poley@gmail.com>
Remove commands: - /flow:research (rarely used) - /flow:intake (rarely used) - /flow:custom (power user only) - /flow:security_* (5 commands - not core workflow) - /vibe (casual mode) Rewrite README.md: - Document all 15 remaining commands - Organize into Core (5), Setup (2), Sub-commands (5), Utilities (2), Automation (1) - Remove bloated ASCII diagrams - Remove false "fully supported" claims for agents - Remove legacy /speckit section - Accurate project structure Command count: 22 → 15 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: jason poley <jason.poley@gmail.com>
Four quality issues flagged by Copilot review on the consolidation PR: - doctor.py: the "beads CLI" check was probing the wrong binary (`beads` instead of `bd`). Reworked `check_tool_installed` to accept a separate `display_name` and an optional `version_getter`, then wired `run_all_checks` to look up `bd` while still showing "beads CLI" in output. - doctor.py: reuse existing `check_backlog_installed_version` / `check_beads_installed_version` helpers from the package root so version parsing stays consistent with the rest of the CLI (replaces the generic `split()[-1]` heuristic). Lazy-imported to avoid a circular import with `flowspec_cli/__init__.py`. - doctor.py: remove the misleading `flowspec doctor --fix` hint — auto-fix is not implemented, so the hint was followed by a "not implemented" notice. Per-check `fix_command` guidance is still rendered inline. - gpg_cli.py: in `setup_command`, read keyring state once (`existing_fingerprint`) and branch on the cached value. Previously repeated `key_exists()` / `get_key_fingerprint()` calls could race on transient keyring errors and report "key already exists" while later failing with "no agent GPG key found". Also adds regression tests: - `test_display_name_separate_from_binary` - `test_version_getter_preferred_over_subprocess` - `test_version_getter_none_falls_back_to_subprocess` - `test_beads_check_probes_bd_binary` Validation: - uv run ruff format --check . → 323 files clean - uv run ruff check . → all checks passed - uv run pytest tests/ -x -q → 3532 passed, 42 skipped Signed-off-by: jason poley <jason.poley@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Address Copilot review feedback on PR #1243: the previous guard ``existing_fingerprint = get_key_fingerprint() if key_exists() else None`` still performed two keyring reads, because ``key_exists()`` itself calls ``get_key_fingerprint()`` internally. That left the original race back in place — a transient keyring failure between the two reads could make ``setup`` report "key already exists" and then fail configuring git signing because no fingerprint is available. Drop the redundant ``key_exists()`` call and branch on the cached ``existing_fingerprint`` alone. Also adds tests/test_gpg_cli.py with three regression tests pinning ``get_key_fingerprint.call_count == 1`` and ``key_exists.call_count == 0`` on the existing-key, no-key, and --force paths so this pattern can't silently reappear. Signed-off-by: jason poley <jason.poley@gmail.com>
Addresses all three Copilot findings on PR #1244. PR #1244 only fixed the caching pattern in ``gpg setup`` — ``gpg status`` and ``gpg rotate`` still repeated the same double/triple-read pattern, and the new ``_check_installed_tool_version`` probe cascade in ``doctor`` shipped without ``subprocess`` timeouts. Caching (src/flowspec_cli/gpg_cli.py) ------------------------------------- ``key_exists()`` is implemented in terms of ``get_key_fingerprint()``, and the old ``get_key_info()`` also called ``get_key_fingerprint()`` internally. Chaining these issued multiple keyring reads per command invocation and could produce inconsistent output under transient keyring failures (e.g. "Configured" paired with fingerprint "unknown"). - ``status_command``: now reads the fingerprint exactly once and passes it through as ``get_key_info(fingerprint=fingerprint)``. Previously: ``key_exists()`` + ``get_key_fingerprint()`` + ``get_key_info()`` = 3 reads. - ``rotate_command``: now reads once and branches on the cached value. Previously: ``key_exists()`` + ``get_key_fingerprint()`` = 2 reads. - ``key_exists`` is no longer imported in this module — the fingerprint value itself is now the single source of truth. Supporting change (src/flowspec_cli/signing.py) ----------------------------------------------- ``get_key_info`` gains an optional ``fingerprint`` keyword argument. When provided, the keyring read is skipped entirely; when omitted, behavior is unchanged so existing callers and tests keep working. Subprocess timeout (src/flowspec_cli/doctor.py) ----------------------------------------------- ``_check_installed_tool_version`` cascades three ``subprocess.run`` probes (``--version`` / ``version`` / ``-V``). Each now uses ``timeout=5`` and catches ``subprocess.TimeoutExpired`` alongside ``OSError``, so a hung or interactive tool can no longer block ``flowspec doctor`` indefinitely. Regression tests ---------------- 9 new tests total; all pin invariants that can silently regress: tests/test_gpg_cli.py (rewritten + expanded, 8 tests): - ``TestModuleSurface::test_key_exists_is_not_imported`` — static guardrail: ``key_exists`` must not reappear as a gpg_cli import. - Setup (3): kept from #1244 — one keyring read on existing-key, no-key, and --force paths. - Status (2, new): one keyring read on configured and not-configured paths; ``get_key_info`` is called with ``fingerprint=<cached>``. - Rotate (2, new): one keyring read on no-key and rotating-with-yes paths. tests/test_signing.py (2 new): - ``test_get_key_info_accepts_cached_fingerprint`` — passing ``fingerprint=`` bypasses the keyring entirely. - ``test_get_key_info_cached_none_returns_none_without_keyring`` — ``fingerprint=None`` falls back to the keyring exactly once (cache, not suppression flag). tests/test_doctor.py (2 new): - ``test_timeout_expired_is_treated_as_unknown_version`` — a ``TimeoutExpired`` from any probe is swallowed, not re-raised. - ``test_timeout_is_configured_on_each_probe`` — every ``subprocess.run`` call receives an explicit positive ``timeout``. Validation ---------- - ``uv run ruff format --check .`` → 324 files clean - ``uv run ruff check .`` → all checks passed - ``uv run pytest tests/ -x -q`` → 3544 passed, 42 skipped Signed-off-by: jason poley <jason.poley@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Both ``Potential fix for pull request finding`` commits introduced on PR #1245 (d08b27a, 48d2134) need to be reverted. The third autofix commit, 9a76497 (doctor.py), is benign and is kept. d08b27a — hallucinated kwargs ----------------------------- The autofix changed ``setup_command`` to call:: generate_agent_key(fingerprint=existing_fingerprint) configure_git_signing(project_root=root, fingerprint=fingerprint) but neither function accepts a ``fingerprint`` keyword argument: - ``src/flowspec_cli/signing.py:99`` ``def generate_agent_key() -> str`` - ``src/flowspec_cli/signing.py:166`` ``def configure_git_signing( project_root: Optional[Path] = None) -> None`` Both calls would raise ``TypeError`` at runtime the first time an actual setup was attempted. This is why the ``test`` / ``validate-dev-role`` / ``validate-qa-role`` CI jobs failed. 48d2134 — duplicate safety, wrong contract ------------------------------------------ The autofix added a post-delete keyring re-read in ``rotate_command``:: delete_agent_key() remaining_fingerprint = get_key_fingerprint() if remaining_fingerprint == old_fingerprint: raise GPGConfigurationError( "Failed to delete the existing agent GPG key." ) Problems: 1. ``delete_agent_key()`` already raises ``GPGError`` if either the ``gpg --delete-secret-and-public-key`` call or the ``keyring.delete_password`` call fails (signing.py:349-356). The extra verification is duplicate safety. 2. The extra read is the exact pattern PR #1244/#1245 set out to eliminate, and it breaks the ``TestRotateCommandKeyringReads::test_reads_fingerprint_once_when_rotating`` regression test. 3. The stray long-line also left ``gpg_cli.py`` failing ``ruff format --check`` — that was the ``lint`` CI failure. Reverting this restores the single-read invariant and the format check. 9a76497 (doctor.py) kept ------------------------ That commit only splits ``(result.stdout or result.stderr).strip()`` into two separate strips. Semantics unchanged; no reason to revert. Validation ---------- - ``uv run ruff format --check .`` → 324 files clean - ``uv run ruff check .`` → all checks passed - ``uv run pytest tests/ -x -q`` → 3544 passed, 42 skipped This reverts commits d08b27a and 48d2134. Signed-off-by: jason poley <jason.poley@gmail.com>
🟢 Security Scan Results (Advisory)
No critical or high severity issues found. How to remediate# View detailed scan results
gh run download 24640329044 -n security-scan-results-ac4a8776a345a4a571878b8eb09e8c27acbee7d2
# Triage findings with AI assistance
specify security triage
# Generate fix suggestions
specify security fix
# Apply fixes and re-scan
flowspec security scanSee the Security tab for detailed findings. |
There was a problem hiding this comment.
Pull request overview
This PR restores green CI after incorrect automated “autofix” changes by ensuring the flowspec gpg commands align with the signing API and by re-establishing the “avoid redundant keyring reads” behavior and its regression coverage.
Changes:
- Adds/updates GPG signing primitives and Typer subcommands (
gpg setup/status/rotate) and wires thegpganddoctorcommands into the main CLI. - Adds extensive regression tests for GPG signing/keyring-read behavior and
doctorsubprocess timeouts. - Updates documentation (new GPG signing guide) and rewrites README; removes several
.claude/commands/*documents.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/flowspec_cli/signing.py |
GPG/keyring/git-signing primitives, including cached-fingerprint support in get_key_info. |
src/flowspec_cli/gpg_cli.py |
Implements flowspec gpg setup/status/rotate with caching commentary and Rich output. |
src/flowspec_cli/doctor.py |
Implements doctor checks + subprocess version-probe timeouts. |
src/flowspec_cli/__init__.py |
Registers gpg sub-app and adds doctor command to the main CLI. |
tests/test_signing.py |
Unit tests for signing primitives and keyring behavior. |
tests/test_gpg_cli.py |
CLI-level regression tests for avoiding redundant get_key_fingerprint() reads. |
tests/test_doctor.py |
Regression tests ensuring tool version probes are time-bounded. |
docs/guides/gpg-signing.md |
New end-user guide for agent GPG signing setup, status, and rotation. |
README.md |
Substantial rewrite of product messaging and command documentation. |
.claude/commands/vibe/vibe.md |
Removes repo-local vibe command doc. |
.claude/commands/flow/security_workflow.md |
Removes repo-local security workflow doc. |
.claude/commands/flow/security_web.md |
Removes repo-local security web doc. |
.claude/commands/flow/security_triage.md |
Removes repo-local security triage doc. |
.claude/commands/flow/security_report.md |
Removes repo-local security report doc. |
.claude/commands/flow/security_fix.md |
Removes repo-local security fix doc. |
.claude/commands/flow/intake.md |
Removes repo-local intake command doc. |
.claude/commands/flow/custom.md |
Removes repo-local custom workflow command doc. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def test_get_key_info_cached_none_returns_none_without_keyring( | ||
| self, mock_keyring, mock_gpg_commands | ||
| ): | ||
| """Explicit ``fingerprint=None`` should fall back to reading the keyring. | ||
|
|
||
| This preserves the zero-argument contract and confirms the optional | ||
| parameter is treated as a *cache*, not a suppression flag. | ||
| """ | ||
| mock_keyring.get_password.return_value = None | ||
|
|
||
| info = get_key_info(fingerprint=None) | ||
|
|
||
| assert info is None | ||
| # Fell back to the keyring exactly once. | ||
| assert mock_keyring.get_password.call_count == 1 | ||
| mock_gpg_commands.assert_not_called() |
There was a problem hiding this comment.
test_get_key_info_cached_none_returns_none_without_keyring is internally inconsistent: the name says "without_keyring", but the assertions expect a keyring read to occur (call_count == 1) and the docstring says it should fall back to reading the keyring. Rename the test (and/or update its docstring) so it reflects the behavior being pinned.
| <div align="center"> | ||
| <img src="./images/flowspec.png"/> | ||
| <h1>flowspec</h1> | ||
| <h3><em>Spec-Driven Development with AI-Powered Task Management</em></h3> | ||
| <h3><em>Spec-Driven Development for AI-Assisted Coding</em></h3> | ||
| </div> | ||
|
|
||
| <p align="center"> | ||
| <strong>Plan features with specialized AI agents. Track tasks with backlog.md + beads. Ship with confidence.</strong> | ||
| <strong>Structured specifications. Tracked tasks. Confident shipping.</strong> | ||
| </p> | ||
|
|
||
| <p align="center"> | ||
| <a href="https://github.com/jpoley/flowspec/actions/workflows/release.yml"><img src="https://github.com/jpoley/flowspec/actions/workflows/release.yml/badge.svg" alt="Release"/></a> | ||
| <a href="https://hub.docker.com/r/jpoley/flowspec-agents"><img src="https://img.shields.io/docker/pulls/jpoley/flowspec-agents" alt="Docker Pulls"/></a> | ||
| <a href="https://github.com/jpoley/flowspec/stargazers"><img src="https://img.shields.io/github/stars/jpoley/flowspec?style=social" alt="GitHub stars"/></a> | ||
| <a href="https://github.com/jpoley/flowspec/blob/main/LICENSE"><img src="https://img.shields.io/github/license/jpoley/flowspec" alt="License"/></a> | ||
| <a href="https://scorecard.dev/viewer/?uri=github.com/jpoley/flowspec"><img src="https://api.scorecard.dev/projects/github.com/jpoley/flowspec/badge" alt="OpenSSF Scorecard"/></a> |
There was a problem hiding this comment.
The PR description focuses on reverting two broken autofix commits to restore green CI, but this diff also includes a broad rewrite of the README (tagline, quick start, command reference, removed badges, etc.). Please confirm this README rewrite is intentional and in-scope for this PR; otherwise consider moving it to a separate PR so the revert/CI-fix intent stays focused.
| def get_key_info(fingerprint: Optional[str] = None) -> Optional[dict[str, str]]: | ||
| """Get detailed information about the agent's GPG key. | ||
|
|
||
| Args: | ||
| fingerprint: Optional cached fingerprint to use instead of reading | ||
| the keyring. When omitted, the fingerprint is read via | ||
| :func:`get_key_fingerprint`. Pass a value already in hand to | ||
| avoid a redundant keyring read (and the inconsistency that | ||
| multiple reads can surface during transient failures). | ||
|
|
||
| Returns: | ||
| Dictionary with key details (fingerprint, uid, creation date), or | ||
| None if no key exists or the GPG command fails. | ||
| """ | ||
| if fingerprint is None: | ||
| fingerprint = get_key_fingerprint() | ||
| if not fingerprint: | ||
| return None |
There was a problem hiding this comment.
get_key_info’s docstring describes different behavior for "when omitted" vs "pass fingerprint=None", but with the current fingerprint: Optional[str] = None signature those cases are indistinguishable. Consider either (a) updating the docstring to say "when not provided or None" or (b) using a sentinel default if you truly need to distinguish an explicit None from omission.
| # Read key state exactly once. ``key_exists()`` itself calls | ||
| # ``get_key_fingerprint()`` internally, so using both here would still | ||
| # incur two keyring reads and reintroduce the inconsistency this | ||
| # caching was meant to prevent. Branch on the fingerprint alone. | ||
| existing_fingerprint = get_key_fingerprint() | ||
|
|
||
| if existing_fingerprint and not force: | ||
| console.print("[yellow]Agent GPG key already exists.[/yellow]") | ||
| console.print("[dim]Use --force to regenerate the key.[/dim]") | ||
| fingerprint = existing_fingerprint | ||
| else: | ||
| if existing_fingerprint and force: | ||
| console.print("[yellow]Regenerating agent GPG key...[/yellow]") | ||
| delete_agent_key() | ||
|
|
||
| # Generate new key | ||
| console.print("[cyan]Generating agent GPG key...[/cyan]") | ||
| fingerprint = generate_agent_key() | ||
| console.print("[green]✓[/green] GPG key generated") |
There was a problem hiding this comment.
setup_command’s comment and tests frame this as "read key state exactly once per invocation", but the no-key path still calls generate_agent_key(), which itself calls get_key_fingerprint() internally (see signing.generate_agent_key). That means the command can still perform multiple keyring reads and can print "GPG key generated" even when the second read finds an existing key and no generation occurred. Consider threading the cached fingerprint through (e.g., adding an optional cached-fingerprint parameter to generate_agent_key) or adjusting the messaging/comment to match the actual behavior.
| # Read keyring state exactly once. ``key_exists()`` is implemented in | ||
| # terms of ``get_key_fingerprint()``, so calling both would duplicate | ||
| # reads and could produce the "exists but no fingerprint" inconsistency | ||
| # under transient keyring failures. | ||
| old_fingerprint = get_key_fingerprint() | ||
|
|
||
| if not old_fingerprint: | ||
| console.print("[yellow]No agent GPG key found. Nothing to rotate.[/yellow]") | ||
| console.print("[dim]Run 'flowspec gpg setup' to create a new key.[/dim]") | ||
| return | ||
|
|
||
| if not yes: | ||
| console.print( | ||
| "[yellow]⚠ This will delete the current agent GPG key and generate a new one.[/yellow]" | ||
| ) | ||
| console.print(f"[dim]Current fingerprint: {old_fingerprint}[/dim]") | ||
| console.print() | ||
| confirm = typer.confirm("Do you want to continue?", default=False) | ||
| if not confirm: | ||
| console.print("[dim]Cancelled.[/dim]") | ||
| raise typer.Exit(0) | ||
|
|
||
| try: | ||
| # Delete old key | ||
| console.print("[cyan]Deleting old key...[/cyan]") | ||
| delete_agent_key() | ||
| console.print("[green]✓[/green] Old key deleted") | ||
|
|
||
| # Generate new key | ||
| console.print("[cyan]Generating new key...[/cyan]") | ||
| new_fingerprint = generate_agent_key() | ||
| console.print("[green]✓[/green] New key generated") |
There was a problem hiding this comment.
rotate_command reads old_fingerprint = get_key_fingerprint() up front, but then calls delete_agent_key() and generate_agent_key(), both of which call get_key_fingerprint() internally in signing.py. If the intent is to avoid multiple keyring reads (and avoid inconsistencies under transient keyring failures), consider letting delete_agent_key / generate_agent_key accept the cached fingerprint or introducing variants that operate on a provided fingerprint.
- Rename test_get_key_info_cached_none_returns_none_without_keyring to test_get_key_info_explicit_none_falls_back_to_keyring (name matched opposite of behavior) - Fix get_key_info docstring: "when not provided or None" (both cases behave identically; prior wording implied different behavior) - Add fingerprint param to configure_git_signing; callers now pass the already-held fingerprint to skip the redundant keyring read - Add cached_fingerprint param to delete_agent_key; rotate_command now passes old_fingerprint to avoid a second keyring read - Thread fingerprints through setup_command and rotate_command; update test assertions to pin the forwarded-fingerprint contract - Change "GPG key generated" message to "GPG key ready" (generate_agent_key may return an existing key if one appears between the CLI check and the internal check; "ready" is always accurate) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: jason poley <jason.poley@gmail.com>
Summary
Replaces closed PR #1245. Same
cleanup-418branch. Three Copilot autofix commits were pushed to #1245 on top of my caching fix, and two of them broke CI. This PR reverts the two broken ones, keeps the benign one, and restores all checks to green locally.What broke on #1245
Four CI jobs went red:
lint(format),test,validate-dev-role,validate-qa-role. Root causes were two Copilot autofix commits.Reverted:
d08b27a— hallucinated kwargs insetup_commandThe autofix changed two calls in
src/flowspec_cli/gpg_cli.py:Neither function accepts a
fingerprintkwarg:src/flowspec_cli/signing.py)generate_agent_keydef generate_agent_key() -> strconfigure_git_signingdef configure_git_signing(project_root: Optional[Path] = None) -> NoneBoth calls would raise
TypeErrorat runtime the first time anyone ranflowspec gpg setup. That's what took down thetest/validate-*-rolejobs.Reverted:
48d2134— duplicate safety + format drift inrotate_commandThe autofix added a post-delete keyring re-read:
Three problems:
delete_agent_key()already raisesGPGErrorif either thegpg --delete-secret-and-public-keycall or thekeyring.delete_passwordcall fails (signing.py:349-356). This verification is duplicate safety.TestRotateCommandKeyringReads::test_reads_fingerprint_once_when_rotating.gpg_cli.pyfailingruff format --check— that was thelintfailure.Kept:
9a76497— benigndoctor.pyrefactorSplits
(result.stdout or result.stderr).strip()into two separate.strip()calls. Semantically unchanged. No reason to revert.Validation
uv run ruff format --check .— 324 files cleanuv run ruff check .— all checks passeduv run pytest tests/ -x -q— 3544 passed, 42 skippedgit commit -s)A note on the Copilot autofix findings
Both reverted commits were titled "Potential fix for pull request finding" — the autofix responded to some security/CodeQL advisory on #1245. The advisory scan was flagged as advisory, not blocking in the original comment. The proposed "fixes" were incorrect (see above); the underlying concern they were aimed at (keyring reads / delete verification) is already handled:
f203a82).delete_agent_key()itself.If the same advisory re-appears on this PR, the right follow-up is to add a real
fingerprintkwarg togenerate_agent_key/configure_git_signinginsigning.pyfirst, then wire it through — not to push call-site code that doesn't match the API.🤖 Generated with Claude Code