Closed
75 commits
45f9785
Add channel and thread filtering to message broker (#293)
ptone Jun 2, 2026
d54278d
Fix test suite leaking Hub credentials, corrupting agent state (#296)
ptone Jun 2, 2026
e64a1a7
docs: document REGION and ZONE overrides in starter-hub README
ptone Jun 3, 2026
b2eaa59
fix: use sudo to check repository path existence in gce-demo-setup-re…
ptone Jun 3, 2026
5d21b25
cmd: fix nil-pointer panic in harness-config when Hub is disabled
ptone Jun 3, 2026
26caeb9
scripts/starter-hub: add MACHINE_TYPE override support to provision s…
ptone Jun 3, 2026
e4585dc
docs: add external channels, GCE hub setup, and multi-broker guides (…
zeroasterisk Jun 3, 2026
c5143bc
Add sort and filter to agent list view (#297)
ptone Jun 3, 2026
0ede010
Add prominent disconnected overlay to web terminal (#295)
ptone Jun 3, 2026
f7e0109
feat(store): Postgres backend — pgx driver, Ent schema parity, full s…
ptone Jun 5, 2026
b1f08a0
Restore contents of .scion as before the recent pull
ptone Jun 5, 2026
d1a01c7
Remove scratchpad markdown files as requested
ptone Jun 5, 2026
afaae9a
Organize developer tools into hack and fix build config
ptone Jun 5, 2026
36efc16
fix(sciontool/hub): prevent token-file test from clobbering live agen…
ptone Jun 5, 2026
eeee331
feat(hub): multi-node broker dispatch — affinity, durable intent, lif…
ptone Jun 5, 2026
02efd44
feat(runtime): NFS-coordinated workspace sharing across nodes (Model …
ptone Jun 5, 2026
3aab748
fix: prevent stale disconnect from marking reconnected broker offline…
ptone Jun 5, 2026
e04d431
feat: resource clone & delete with authz hardening (#301)
ptone Jun 5, 2026
0c5ba2d
fix: multi-node session fixes + Cloud Run deployment (#309)
ptone Jun 5, 2026
05ef2ba
Guard hub agent phase transitions against spurious session lifecycle …
ptone Jun 5, 2026
c177a86
Fix TestDeleteStopped_RequiresGroveContext failing in hub environment…
ptone Jun 5, 2026
2c3d45c
Fix agent list task overflow and unify action buttons (#311)
ptone Jun 5, 2026
ca7701b
feat(hub): auth proxy mode (Google IAP) (#310)
ptone Jun 5, 2026
6d2e868
fix: resolve workspace file browser to groves/ instead of projects/ (…
ptone Jun 5, 2026
59e7304
fix(hub): allow agents to create sub-agents (drop User FK on created_…
ptone Jun 6, 2026
fe5bc87
Add health-debug to sciontool and a feature to repair-inject auth int…
ptone Jun 6, 2026
41a40f1
Add Reset Auth button to agent detail page UI
Jun 6, 2026
9c6a131
fix: add more of the triage-remediation for broken auth during upgrad…
ptone Jun 6, 2026
87c0487
fix(hub): use deterministic UUID for plugin broker IDs to match α mig…
ptone Jun 6, 2026
a3a7530
fix(hub): address PR #319 review feedback (#319)
ptone Jun 6, 2026
a204880
fix(hub): multi-node session fixes — OAuth state_mismatch + shared si…
ptone Jun 6, 2026
e0d8fcb
build(deps-dev): bump postcss in /extras/agent-viz/web (#203)
dependabot[bot] Jun 6, 2026
6016dcd
fix(ci): apply gofmt to fix CI format check (#324)
ptone Jun 6, 2026
a12d276
feat: enforce unique agent slugs within a project (#325)
ptone Jun 6, 2026
d36ec50
feat: add project rename support (#326)
ptone Jun 6, 2026
69ca824
chore: remove spurious files and add commit guidelines (#329)
ptone Jun 6, 2026
1c74f56
fix: resolve hub auth token expiry deadlock after signing-key rotatio…
ptone Jun 6, 2026
a7aeb08
fix: update scion-env on GitHub token refresh to prevent gh CLI 401s …
ptone Jun 6, 2026
d9eb95c
docs: generate daily release notes for May 29 - June 5, 2026 (#331)
ptone Jun 6, 2026
056e8f6
Project visibility via membership (subtractive model) (#332)
ptone Jun 6, 2026
b232fbb
fix(ci): add missing !no_sqlite build tag to signing_key_shared_test.…
ptone Jun 6, 2026
2b71d8c
automation: add cloud run deploy pattern for hub (#330)
ptone Jun 7, 2026
003d2a0
Discord chat (#333)
ptone Jun 7, 2026
e01216e
feat: auto-prefix bare email recipients with user: in scion message (…
ptone Jun 7, 2026
ce31734
Scion/fix auth reset self heal (#337)
ptone Jun 7, 2026
5122e03
fix: align hubManagedProjectPath to prefer projects/ over groves/ (#338)
ptone Jun 7, 2026
050c65f
fix(scion-telegram): update group_links on supergroup migration inste…
ptone Jun 7, 2026
7e3c1db
Scion/harness refactor (#336)
ptone Jun 7, 2026
2c2db6f
fix: remove unused _linkedDiscordId property to fix typecheck CI (#340)
ptone Jun 7, 2026
b2b2ba2
fix(hub): carry channel and thread_id through outbound message API (#…
ptone Jun 7, 2026
c92d272
feat(agent-viz): add Agent Communications transcript panel (#342)
ngtanthanh-qc Jun 7, 2026
44b492f
Extract standalone provisionShared from nfsBackend.Provision
ptone Jun 7, 2026
3c4c97a
Remove Provision from WorkspaceBackend interface
ptone Jun 7, 2026
7928d2b
Retarget provisioning tests to call provisionShared directly
ptone Jun 7, 2026
f6f0913
Document Tier-3 mount-type seam for future vendor types
ptone Jun 7, 2026
7918a16
docs(runtime): drop stale Provision reference from localBackend comment
ptone Jun 7, 2026
3cee66c
Export provisionShared → ProvisionShared for CLI subcommand access
ptone Jun 7, 2026
552b52b
Add SentinelDir field to ProvisionInput for k8s mount constraint
ptone Jun 7, 2026
e5d440f
Add sciontool provision subcommand for k8s init container
ptone Jun 7, 2026
a719bae
Rewire buildPod init container to invoke sciontool provision
ptone Jun 7, 2026
d638b67
Delete retired shell scripts, update init-container tests
ptone Jun 7, 2026
6b7c6b3
Extract Tier-1 provisioning to config-free pkg/provision leaf package
ptone Jun 7, 2026
e28abcb
fix(provision): guard chown target against filesystem root in init-co…
ptone Jun 7, 2026
9db77e6
style(provision): rename stale 'nfsBackend.Provision:' log prefixes t…
ptone Jun 7, 2026
50c029e
fix(provision): propagate context.Context through provisioning path
ptone Jun 7, 2026
6029c60
fix(ci): resolve golangci-lint errcheck and gofmt failures (#346)
ptone Jun 7, 2026
0c1353e
Add cloudrun-volume and gke-shared-volume mount types (#169)
ptone Jun 7, 2026
6ecd613
Add Realize variants and backends for cloudrun-volume and gke-shared-…
ptone Jun 7, 2026
21d3743
Add Cloud Run runtime with broker-side direct provisioning (#169)
ptone Jun 7, 2026
4980b3b
docs(cloudrun): explain why SharingModeSharedPlain is fixed in provis…
ptone Jun 7, 2026
ba79900
fix: prevent test suite from leaking hub status updates and telemetry…
ptone Jun 7, 2026
00ddd69
feat(worktree): adapt Phase 1 onto pkg/provision (post-#169) (#350)
ptone Jun 7, 2026
0346899
feat: add hub-managed port pool for agent containers
zeroasterisk Jun 2, 2026
04e8d05
fix(port-pool): release ports on restart + trim trailing slash
zeroasterisk Jun 6, 2026
a0d14ea
fix(port-pool): add input validation + comprehensive tests from debug…
zeroasterisk Jun 6, 2026
48 changes: 0 additions & 48 deletions .coordinator-state.md

This file was deleted.

502 changes: 502 additions & 0 deletions .design/auth-proxy-mode.md

Large diffs are not rendered by default.

772 changes: 772 additions & 0 deletions .design/broker-dispatch.md

Large diffs are not rendered by default.

7 changes: 7 additions & 0 deletions .design/decoupled-harness-implementation.md
@@ -1,5 +1,12 @@
# Decoupled Harness Implementation: Script-Based Provisioning

> **Packaging follow-on complete.** The harness-config decoupling work
> ([`harness-config-decoupling.md`](./harness-config-decoupling.md)) relocated
> OpenCode, Codex, and Antigravity bundles to `harnesses/<name>/`, removed their
> Go embed/built-in implementations, and shrunk the default-install set to
> `{claude, gemini}`. Each bundle is now self-contained (config + provisioner +
> Dockerfile + Cloud Build config) under [`harnesses/`](../harnesses/README.md).

## Motivation

Today, every harness implementation lives as compiled Go code inside the scion binary (`pkg/harness/`). Each harness performs a similar set of operations — writing config files, injecting auth credentials, rewriting settings JSON/YAML/TOML — but the specifics are unique per harness. This means:
267 changes: 267 additions & 0 deletions .design/harness-config-decoupling.md
@@ -0,0 +1,267 @@
# Harness-Config Decoupling: Top-Level Bundle Directory & Opt-In Install

**Status:** Draft plan — 2026-06-06
**Owner:** harness-refactor agent (for ptone@google.com)
**Related:** [`decoupled-harness-implementation.md`](./decoupled-harness-implementation.md) (the container-script provisioning work this builds on)

## Motivation

Today every harness ships compiled into the scion binary and is **installed by
default**. `harness.All()` returns `{gemini, claude, opencode, codex}`, and
`scion init` / `scion server` startup seed each one's embedded config into
`~/.scion/harness-configs/<name>/` from `pkg/harness/<name>/embeds/`.

We want to move to a model where **harnesses and their configs are not all
installed by default**. The first step is to:

1. Establish a **new top-level harness-config directory at the repo root** that
holds harness bundles as plain on-disk artifacts (not Go embeds).
2. **Refactor OpenCode and Codex** out of `pkg/harness/*/embeds/` into that
directory.
3. **Port the Antigravity harness config** (from
[`ptone/scion-antigravity`](https://github.com/ptone/scion-antigravity)) into
that directory.

The container-script migration (`decoupled-harness-implementation.md`, Phases
0–5) already did the hard part: Codex and OpenCode are fully declarative
(`config.yaml` + `provision.py`) and run their provisioning inside the agent
container. This plan is the **packaging/distribution** follow-on — it changes
*where the bundles live* and *whether they are installed automatically*, not how
they provision.

## Current State (verified)

| Concern | Where it lives today |
|---|---|
| Default-install set | `pkg/harness/harness.go::All()` → gemini, claude, opencode, codex |
| Default-install call sites | `cmd/project.go` (`scion init`), `cmd/templates.go` (`templates update-default`), `cmd/server_foreground.go` (`scion server`) — all call `harness.All()` |
| Seeding from embeds | `pkg/config/harness_config.go::SeedHarnessConfig()` walks `h.GetHarnessEmbedsFS()` |
| OpenCode bundle (embedded) | `pkg/harness/opencode/embeds/{config.yaml,opencode.json,provision.py}` + `pkg/harness/opencode/embeds.go` (`//go:embed`) |
| Codex bundle (embedded) | `pkg/harness/codex/embeds/{config.yaml,config.toml,scion_notify.sh,bashrc,provision.py}` + `pkg/harness/codex/embeds.go` |
| Built-in Go fallbacks | `pkg/harness/opencode.go`, `pkg/harness/codex.go` (+ `codex_config.go`), selected by `harness.New()` / `harness.Resolve()` |
| Opt-in install (already exists!) | `cmd/harness_config_install.go` → `scion harness-config install <source>` supports local dir, `github.com/...` shorthand, `file://`, `:gcs:`, and `.tgz`/`.zip` archives |
| Image builds | `image-build/{opencode,codex,claude,gemini}/Dockerfile`; DAG in `image-build/scripts/lib/targets.sh`; `image-build/cloudbuild-harnesses.yaml` |
| Antigravity source layout | `antigravity/{config.yaml,provision.py,dialect.yaml,skills/}` + root `Dockerfile` + `cloudbuild.yaml` |

Key insight: **the opt-in install command already exists.** The work is mostly
about (a) relocating the bundles, (b) shrinking the default set, and (c) deciding
the fate of the Go embeds and built-in fallbacks.

## Target State

```
<repo root>/
  harnesses/                          # NEW top-level harness-config directory
    opencode/
      config.yaml
      provision.py
      Dockerfile                      # moved from image-build/opencode/
      cloudbuild.yaml                 # per-bundle image build
      home/
        .config/opencode/opencode.json
      README.md
    codex/
      config.yaml
      provision.py
      Dockerfile                      # moved from image-build/codex/
      cloudbuild.yaml
      home/
        .codex/config.toml
        .codex/scion_notify.sh
        .bashrc
      README.md
    antigravity/
      config.yaml
      provision.py
      dialect.yaml
      Dockerfile                      # ported from ptone/scion-antigravity
      cloudbuild.yaml
      skills/.gitkeep
      home/
        .gemini/...
      README.md
```

Note: the bundle root now carries non-harness-config files (`Dockerfile`,
`cloudbuild.yaml`). `scion harness-config install` copies the whole directory, so
these get copied into `~/.scion/harness-configs/<name>/` too. That is harmless
(they're ignored at provision time) but the install/seed allowlist and
`ComputeHarnessConfigRevision` should be reviewed so image-build files don't
perturb the config revision hash — see Phase D.4.
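
One way to keep image-build files from perturbing the revision is to skip them when hashing. The sketch below is illustrative only, written in Python for brevity: `bundle_revision` and `IMAGE_BUILD_FILES` are hypothetical names, and the real `ComputeHarnessConfigRevision` may hash different inputs in a different way.

```python
import hashlib
from pathlib import Path

# Assumption: only these two root-level files are image-build artifacts.
IMAGE_BUILD_FILES = frozenset({"Dockerfile", "cloudbuild.yaml"})


def bundle_revision(bundle_dir: str, exclude: frozenset = IMAGE_BUILD_FILES) -> str:
    """Hash relative paths and contents of a bundle, skipping excluded files.

    Paths in `exclude` are relative to the bundle root, so a nested file
    that happens to be named Dockerfile would still be hashed.
    """
    root = Path(bundle_dir)
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    digest = hashlib.sha256()
    for path in files:
        rel = path.relative_to(root).as_posix()
        if rel in exclude:
            continue
        # NUL separators keep (path, content) pairs unambiguous.
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Passing `exclude=frozenset()` models the alternative of deliberately including the image-build files in the hash.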

- `harness.All()` (default-install set) shrinks to **`{gemini, claude}`** (see
  Decision 2).
- OpenCode / Codex / Antigravity become **opt-in**, installed with:
  ```
  scion harness-config install harnesses/opencode   # from a repo checkout
  scion harness-config install github.com/GoogleCloudPlatform/scion/tree/main/harnesses/codex
  ```
- The `harnesses/` bundles are the **single source of truth** for these configs.
  No duplicate copies remain under `pkg/harness/*/embeds/`.

## Decisions (locked — ptone, 2026-06-06)

1. **Directory name: `harnesses/`** at the repo root.
2. **Default-install set shrinks to `{claude, gemini}`.** OpenCode, Codex, and
Antigravity become opt-in bundles.
3. **Drop the Go entirely.** Remove both the embeds
(`pkg/harness/{opencode,codex}/embeds*`) **and** the built-in Go
implementations (`opencode.go`, `codex.go`, `codex_config.go`). The
`harnesses/` bundles become the sole source; OpenCode/Codex resolve purely as
container-script harnesses from an installed bundle. No built-in fallback is
retained. (This is more aggressive than the prior design's "keep fallback one
release" guidance — the parity oracle goes away, so the relocated bundles must
be locked down with golden/install tests first; see Phase A.4 and Risks.)
4. **Co-locate `Dockerfile` + cloudbuild file inside each bundle.** Each
`harnesses/<name>/` is self-contained (config + provisioner + image build),
matching the antigravity repo layout. The centralized `image-build/{opencode,
codex}/` dirs are removed and the build DAG/cloudbuild wiring is repointed at
the bundle dirs.
5. **Keep first-party bundles in this repo** under `harnesses/` for now (no split
into separate repos this phase).

## Implementation Plan

Decisions locked above. Steps are ordered to keep the tree green at each commit;
the destructive Go removal (Phase D) lands only after the relocated bundles are
proven (Phase A.4).

### Phase A — Establish `harnesses/` and relocate the OpenCode/Codex bundles

1. Create top-level `harnesses/` with `opencode/` and `codex/` subdirs.
2. Move the embedded bundle files into the new layout, converting the implicit
   `mapEmbedFileToHomePath` placement into an **explicit `home/**`** layout
   (the prior design's preferred end state, §"File Seeding and Packaging
   Changes"):
   - OpenCode: `opencode.json` → `harnesses/opencode/home/.config/opencode/opencode.json`; `config.yaml`, `provision.py` at bundle root.
   - Codex: `config.toml` → `home/.codex/config.toml`; `scion_notify.sh` → `home/.codex/scion_notify.sh`; `bashrc` → `home/.bashrc`; `config.yaml`, `provision.py` at root.
3. Move the image build into each bundle (Decision 4): `image-build/opencode/Dockerfile`
→ `harnesses/opencode/Dockerfile`, same for codex; add a per-bundle
`cloudbuild.yaml` (extract the opencode/codex steps from
`image-build/cloudbuild-harnesses.yaml`, threading `BASE_IMAGE` from
`scion-base`).
4. **Lock down behavior before deleting the Go oracle.** Capture golden output
from the existing built-in + container-script paths (command construction,
seeded file layout, provision staging) as fixtures, and add a CI smoke test:
`scion harness-config install harnesses/<name> --name <name>-test` →
`scion harness-config show <name>-test` → assert config parses and a dry
provision stages the expected bundle. This replaces the parity oracle that
Decision 3 removes.
5. Add a `README.md` per bundle (purpose, `install` command, auth modes, image
build) — mirror the antigravity repo's README.
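
The relocation in step 2 amounts to a fixed lookup from legacy embed file names to bundle-relative paths. A minimal sketch of that table follows, using only the mappings listed above; `HOME_LAYOUT`, `ROOT_FILES`, and `relocated_path` are hypothetical names, not identifiers from the codebase.

```python
from pathlib import PurePosixPath

# Files that stay at the bundle root for every harness.
ROOT_FILES = frozenset({"config.yaml", "provision.py"})

# Legacy embed name -> explicit home/ path, per harness (from Phase A, step 2).
HOME_LAYOUT = {
    "opencode": {
        "opencode.json": "home/.config/opencode/opencode.json",
    },
    "codex": {
        "config.toml": "home/.codex/config.toml",
        "scion_notify.sh": "home/.codex/scion_notify.sh",
        "bashrc": "home/.bashrc",
    },
}


def relocated_path(harness: str, embed_name: str) -> str:
    """Return the repo-relative destination for a formerly embedded file.

    Raises KeyError for a harness or file that has no documented mapping,
    so an unexpected embed is surfaced rather than silently misplaced.
    """
    if embed_name in ROOT_FILES:
        rel = embed_name
    else:
        rel = HOME_LAYOUT[harness][embed_name]
    return str(PurePosixPath("harnesses") / harness / rel)
```

A golden test for Phase A.4 could iterate over such a table and assert that each destination exists in the relocated bundle.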

### Phase B — Port Antigravity

1. Copy `antigravity/{config.yaml,provision.py,dialect.yaml,skills/}` plus the
root `Dockerfile` and `cloudbuild.yaml` from `ptone/scion-antigravity` into
`harnesses/antigravity/` (Decision 4 keeps image build in-bundle).
2. Reconcile `config.yaml` against the current `HarnessConfigEntry` schema and
`ValidateHarnessConfig`. The antigravity config exercises fields a relocated
first-party bundle may not have: the top-level `mcp:` global-config mapping
block, `dialect.yaml`, and `oauth-token` / `vertex-ai` auth types (the latter
with an empty `vertex-ai: {}` body). Confirm the in-repo schema accepts all of
them; add schema support for any rejected field before merging.
3. The antigravity image needs keyring packages (`gnome-keyring`, `libsecret`)
not in `scion-base` — its `Dockerfile`/`cloudbuild.yaml` already encode the
`core-base → scion-base → antigravity` chain; verify they reference the
in-repo base image tags rather than the external repo's registry.
4. Confirm `ContainerScriptHarness.Provision` stages `dialect.yaml` (it does,
`container_script_harness.go:342`).

### Phase C — Shrink the default-install set

1. Change `harness.All()` to return `{GeminiCLI, ClaudeCode}` (Decision 2).
2. Audit the three call sites (`cmd/project.go`, `cmd/templates.go`,
`cmd/server_foreground.go`) — confirm none assume opencode/codex presence.
3. Update tests that assert the 4-harness default (e.g.
`pkg/config/init_test.go`, `templates_test.go`).
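
The intended seeding behavior after the shrink can be sketched as below. This is a Python illustration of the rule, not the Go implementation: `DEFAULT_HARNESSES`, `OPT_IN_HARNESSES`, and `harnesses_to_seed` are hypothetical names, and the only assumption is that seeding is additive and never touches an already-installed config.

```python
# Default-install set after Decision 2; everything else is opt-in.
DEFAULT_HARNESSES = ("gemini", "claude")
OPT_IN_HARNESSES = ("opencode", "codex", "antigravity")


def harnesses_to_seed(installed: set) -> list:
    """Return the default harnesses that still need seeding.

    Opt-in harnesses are never seeded automatically, and nothing already
    present in `installed` is returned, so existing configs are left alone.
    """
    return [name for name in DEFAULT_HARNESSES if name not in installed]
```

For a machine that already has `codex` and `gemini` installed, only `claude` would be seeded, and the existing `codex` config stays in place.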

### Phase D — Drop the Go (embeds + built-in implementations)

Decision 3 — remove entirely, no fallback. Land this after Phase A.4 proves the
relocated bundles.

1. Delete `pkg/harness/opencode/` (embeds + `embeds.go`), `pkg/harness/codex/`
(embeds + `embeds.go`).
2. Delete `pkg/harness/opencode.go`, `pkg/harness/codex.go`,
`pkg/harness/codex_config.go`, and their `_test.go` + `*_parity_test.go`
files (the parity tests compared against the now-removed built-in oracle;
their coverage moves to the Phase A.4 install/golden tests).
3. Remove the `codex`/`opencode` cases from `harness.New()` and
`harness.newBuiltin()` so resolution flows: container-script (installed
bundle) → declarative-generic. With no bundle installed, `--harness codex`
falls back to `Generic`, which is acceptable now that they're opt-in (surface
a clear "not installed; run scion harness-config install" hint where practical).
4. Review the install/seed allowlist and `ComputeHarnessConfigRevision` so the
newly co-located `Dockerfile`/`cloudbuild.yaml` in each bundle don't break
provisioning or destabilize the revision hash (either exclude them, or accept
them as part of the hash deliberately).
5. `scion harness-config reset codex` currently restores *embedded* defaults via
`harness.New` — with embeds gone it must change. Repoint `reset` to fail
clearly with "reinstall from bundle: scion harness-config install
harnesses/codex" guidance (and update its tests).
6. Remove `image-build/opencode/` and `image-build/codex/` and repoint the build
DAG (`image-build/scripts/lib/targets.sh`) + `cloudbuild-harnesses.yaml` at
the bundle dirs (or split codex/opencode out of the centralized `harnesses`
target entirely, since their images are now bundle-local).
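
The resolution order from step 3 can be sketched as follows. This is a simplified Python model, assuming an installed bundle is detected by the presence of `<config_root>/<name>/config.yaml`; `resolve_harness` and its return shape are hypothetical, and the real `harness.Resolve()` logic may check more than this.

```python
from pathlib import Path


def resolve_harness(name: str, config_root: str) -> tuple:
    """Resolve a harness name to (kind, hint).

    kind is "container-script" when an installed bundle is found, otherwise
    "generic". hint is empty on success and carries an install suggestion
    when the bundle is missing.
    """
    bundle = Path(config_root) / name
    if (bundle / "config.yaml").is_file():
        return ("container-script", "")
    hint = (
        f"harness {name!r} is not installed; "
        f"run: scion harness-config install harnesses/{name}"
    )
    return ("generic", hint)
```

Returning the hint alongside the fallback lets the caller decide whether to print it, which matches the "where practical" wording in step 3.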

### Phase E — Discoverability & docs ✓

1. [x] Add `harnesses/README.md` indexing available bundles + install commands.
2. [x] Update `image-build/README.md` (image hierarchy no longer lists
opencode/codex centrally), top-level `README.md`, and
`decoupled-harness-implementation.md` cross-references.
3. [x] Verified web UI harness fallback lists in `agent-create.ts` and
`project-settings.ts` — they enumerate known/installable harnesses (incl.
opt-in ones), not the default-install set; left as-is with clarifying
comments.
4. `scion harness-config list --available` deferred — out of scope for this PR;
noted as follow-up in `harnesses/README.md`.

### Phase F — Migration for existing installs

Existing machines already have `~/.scion/harness-configs/{opencode,codex}/`
seeded. Shrinking defaults and dropping embeds must **not** delete a user's
installed config.

1. `scion init`/upgrade must leave existing installed configs untouched
(additive-only upgrade is already the contract —
`decoupled-harness-implementation.md` §"Existing Installation Upgrade Plan").
2. Existing codex/opencode configs keep resolving as container-script harnesses
from their on-disk dir (they already declare `provisioner.type:
container-script`), so removing the Go built-in does not break them — **but**
any legacy config still on `provisioner.type: builtin` would break. Add an
upgrade check that flags/auto-activates such configs (`--activate-script`)
before the built-in is removed.
3. Document that fresh installs no longer get opencode/codex automatically, plus
the one-line `harness-config install` to restore them. No agent-home
rewrites; already-created agents keep working.
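
The upgrade check in step 2 reduces to scanning installed configs for an explicit `provisioner.type: builtin`. The sketch below operates on already-parsed config mappings to stay dependency-free; `legacy_builtin_configs` is a hypothetical name, and treating a config with no `provisioner` block as not legacy is an assumption, not something this plan specifies.

```python
def legacy_builtin_configs(configs: dict) -> list:
    """Return sorted names of installed configs still on the builtin provisioner.

    `configs` maps a harness-config name to its parsed config.yaml content.
    Only an explicit provisioner.type of "builtin" is flagged.
    """
    flagged = []
    for name, cfg in configs.items():
        provisioner = cfg.get("provisioner") or {}
        if provisioner.get("type") == "builtin":
            flagged.append(name)
    return sorted(flagged)
```

An upgrade path could run this before the built-in is removed and either warn about each flagged name or activate its script provisioner.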

## Risks & Open Questions

- **No more parity oracle (Decision 3).** Removing the built-in Go
implementations deletes the reference behavior the parity tests checked
against. Phase A.4 golden + install tests must land *first* and be trusted.
- **Legacy `provisioner.type: builtin` configs break** once the Go built-in is
gone (Phase F.2). Needs an upgrade/auto-activate safety net.
- **`reset` semantics change** (Phase D.5) — agree on the replacement
(reinstall-from-bundle hint).
- **Image-build files inside config bundles** (Decision 4) mean
`harness-config install`/sync copies `Dockerfile`/`cloudbuild.yaml` into the
installed config dir and into Hub artifacts. Confirm that's acceptable and
doesn't perturb `ComputeHarnessConfigRevision` (Phase D.4).
- **Hub-distributed configs**: brokers install on demand, so they are
unaffected, but the Hub's own seed/catalog may assume the 4-harness set. Audit
`pkg/runtimebroker` and the Hub harness-config endpoints.
- **Antigravity schema gaps**: the ported `config.yaml` may use fields that no
first-party bundle has yet exercised against the in-repo validator (MCP mapping
block, empty `vertex-ai` type). Phase B.2 must validate them before merging.
- **Web UI / templates** that list harnesses (`web/`, `cmd/templates.go`
template harness-configs) may hard-code the 4 names — grep before shipping.

## Out of Scope (for this phase)

- Migrating Claude/Gemini to container-script bundles (that's
`decoupled-harness-implementation.md` Phase 6).
- Splitting first-party bundles into standalone repos (Decision 5, deferred).
- A full remote harness catalog / marketplace.