feat(deploy): host prod + dev backends on one box via Cloudflare Tunnel with Watchtower CD #47

Draft
torrid-fish wants to merge 10 commits into feat/api-tools from feat/two-env-cd-deploy

@torrid-fish commented May 23, 2026

Member

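Purpose

Upgrade this machine into a self-hosted node that can run two jpcorrect-backend instances at once, and move all deploy-related files into deploy/.

  • Prod (api.sessatakuma.dev): publicly reachable, application-layer auth (CLIENT_API_KEY + JWT), GIN_MODE=release, Swagger hidden. The image is pinned to :stable and auto-updates only when a new v*.*.* git tag is pushed.
  • Dev (api-dev.sessatakuma.dev): for internal developers only, gated at the edge by Cloudflare Access (Zero Trust), GIN_MODE=debug, Swagger UI visible. The image tracks :latest and auto-updates on every merge to main.

Watchtower (a single container) polls GHCR every 5 minutes. When it sees a new image it replaces the matching container (the two backends + api-tools) with a zero-downtime restart.

This PR is stacked on #38 (feat/api-tools). #38 took self-hosting from 0 to 1 (a single backend + api-tools + auth hardening); this PR takes it from 1 to 2 (prod/dev split) and reorganizes the deploy files into deploy/.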

Approach / implementation notes

Six conventional commits, each reviewable on its own:

  1. feat(api): skip auth middlewares when GIN_MODE=debug. internal/api/api.go wraps APIKeyMiddleware / AuthMiddleware in if !gin.IsDebugging(); internal/cmd/api.go skips JWKS init in debug mode and tolerates an empty JWKS_URL. Side effect: local make air no longer requires pasting CLIENT_API_KEY or a JWT.
  2. build(deploy): split compose stack into prod + dev with watchtower CD. Splits the compose stack into two backends, two Postgres instances, three networks, cloudflared, and watchtower.
  3. ci(cd): publish :stable tag on semver releases. docker/metadata-action gets one extra raw tag rule that emits :stable only when a git tag (v*.*.*) is pushed.
  4. docs(agents): document two-env deploy stack and debug-mode middleware skip. Rewrites the AGENTS.md deployment section and syncs the env-var table and Common Gotchas.
  5. refactor(deploy): consolidate stack under deploy/ and pull api-tools from GHCR. See "File reorganization" below.
  6. docs: document deploy/ layout, api-tools GHCR service, raw compose commands. Aligns AGENTS.md + README with the new layout.

File reorganization (commit 5)

All deploy-only files move into deploy/, separate from the dev/build files at the repo root:

```
deploy/
├── compose.yml # the stack (compose default name; auto-detected, no -f needed)
├── Makefile # make -C deploy up / up-prod / up-dev / pull-* / down
├── env/{prod,dev,api-tools}.example # tracked templates; the real files are gitignored and host-only
└── cloudflared/{config.yml, creds/} # creds are gitignored and host-only
```

  • compose.deploy.yml → deploy/compose.yml; the root keeps a compose.yml (local-dev Postgres, formerly compose.local.yml).
  • deploy/compose.yml pins the top-level name: jpcorrect-backend, so volume and network names stay jpcorrect-backend_* even though the file moved. ./ bind mounts now resolve relative to deploy/.
  • .env.deploy.{prod,dev}.example → deploy/env/{prod,dev}.example
  • api-tools is now pulled from GHCR: instead of being cloned and run on the host, it is a service inside deploy/compose.yml (ghcr.io/sessatakuma/api-tools, default :stable, overridable via API_TOOLS_IMAGE). A single shared instance serves both backends, and Watchtower watches it as well. Adds deploy/env/api-tools.example (YAHOO_API_KEY).
  • The deploy Make targets move to deploy/Makefile (make -C deploy <target>); the root Makefile is trimmed to air + swag, and local Postgres is driven with raw docker compose up -d / logs -f / stop.

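Key implementation details

  • Network isolation: postgres-dev and postgres-prod sit on different Docker bridge networks, and DNS does not resolve across them, so even a compromised backend-dev cannot reach the prod DB. cloudflared and api-tools are attached only to jpcorrect-shared.

    | Container | prod-net | dev-net | jpcorrect-shared |
    | --- | --- | --- | --- |
    | postgres-prod | ✓ | | |
    | postgres-dev | | ✓ | |
    | backend-prod | ✓ | | ✓ |
    | backend-dev | | ✓ | ✓ |
    | api-tools | | | ✓ |
    | cloudflared | | | ✓ |

  • Debug-mode middleware skip: the single signal is gin.IsDebugging(). Prod enables every middleware; dev skips them all and relies on the CF Access gate at the edge.

  • Watchtower label-enable: only backend-prod / backend-dev / api-tools carry the label; postgres, cloudflared, and watchtower itself are never auto-updated. ${HOME}/.docker/config.json is mounted into the container to supply GHCR credentials.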

Related issue

None (internal infrastructure work).

Notes

One-time host setup (run once after this PR merges)

```bash
# Create the shared network (a no-op if it already exists)
docker network create jpcorrect-shared 2>/dev/null || true

# api-tools is now a service inside deploy/compose.yml (pulled from GHCR);
# it no longer needs to be started separately.
# If the old standalone jpcorrect-api-tools is still running, remove it first
# to free the container name:
docker rm -f jpcorrect-api-tools 2>/dev/null || true

# Cloudflare tunnel jb already exists under deploy/cloudflared/creds/.
# Register the DNS routes:
docker run --rm -v "$PWD/deploy/cloudflared/creds:/home/nonroot/.cloudflared" \
  cloudflare/cloudflared:latest tunnel route dns jb api.sessatakuma.dev
docker run --rm -v "$PWD/deploy/cloudflared/creds:/home/nonroot/.cloudflared" \
  cloudflare/cloudflared:latest tunnel route dns jb api-dev.sessatakuma.dev

# Cloudflare Zero Trust dashboard (no IaC):
#   Access → Applications → Add → Self-hosted
#   Application domain: api-dev.sessatakuma.dev
#   Policy: e.g. include emails @sessatakuma.dev
#   Do NOT add api.sessatakuma.dev; prod stays public.

# (Only needed if the image is private) Log in to GHCR so Watchtower can pick
# up the credentials. Replace the placeholder with your GitHub username and
# use a PAT with read:packages as the password.
docker login ghcr.io -u <github-username>

# Bring up the stack
cp deploy/env/prod.example deploy/env/prod # fill in prod CLIENT_API_KEY / JWKS_URL / ALLOWED_ORIGINS
cp deploy/env/dev.example deploy/env/dev # dev: leave CLIENT_API_KEY / JWKS_URL empty
cp deploy/env/api-tools.example deploy/env/api-tools # fill in YAHOO_API_KEY
make -C deploy up
```

Verification summary

  • curl https://api.sessatakuma.dev/healthz → ok
  • curl https://api.sessatakuma.dev/swagger/index.html → 404 (hidden in release mode)
  • curl -X POST https://api.sessatakuma.dev/v1/dict-query -d '{"word":"先生"}' → 401 (no API key); add -H 'X-API-Key: <key>' → 200
  • In a browser, https://api-dev.sessatakuma.dev/swagger/index.html → passes through CF Access first → Swagger UI is shown; "Try it out" on /v1/dict-query without Authorize → 200 (middleware skipped)
  • docker logs jpcorrect-backend-dev → shows ⚠️ GIN_MODE=debug — APIKeyMiddleware and AuthMiddleware are DISABLED...
  • Network isolation was verified with a throwaway curl container (the dev network cannot reach postgres-prod)
  • Push a commit to main → the backend-dev image is updated within 5 minutes; push a v0.0.0-test tag → backend-prod is updated within 5 minutes (delete the test tag afterwards)

Everything was verified end to end locally in local-only mode (without cloudflared): release/debug middleware behavior, Swagger visibility, network isolation, api-tools reachability, and unchanged volume names after the name: pin all passed.

✅ api-dev verified on the real host (2026-05-27): the end-to-end test through the real Cloudflare Tunnel jb + Zero Trust Access passed. After the api-dev.sessatakuma.dev DNS route was registered with cloudflared tunnel route dns jb api-dev.sessatakuma.dev, curl /healthz → 302 redirect to sessatakuma.cloudflareaccess.com/cdn-cgi/access/login/…, confirming that the route works and the CF Access gate is in effect.

  • TODO:
    • Bring the api-dev tunnel route online and verify it on the real host (done 2026-05-27)
    • Add a publish workflow to the api-tools repo (push :latest/:stable to GHCR). Until the :stable tag exists, api-tools temporarily uses API_TOOLS_IMAGE=ghcr.io/sessatakuma/api-tools:latest
    • Run the one-time setup above after merging
    • Push v0.0.0-test once to verify CD, then delete it
    • After #38 (feat(deploy): self-host stack, api-tools integration, auth hardening) merges, rebase this branch onto main and change the PR base

🤖 Generated with Claude Code

torrid-fish and others added 4 commits May 23, 2026 03:23
Wraps APIKeyMiddleware (on the 7 api-tools routes) and AuthMiddleware
(on the /v1/* user-scoped routes) in `if !gin.IsDebugging()` so that
debug builds skip both. Intended use cases:

- Local dev (`make air`): devs no longer need to paste CLIENT_API_KEY
  into Swagger UI's Authorize dialog or carry a JWT through every
  request to test handlers.
- Internal-dev deploy (api-dev.sessatakuma.dev): the dev backend sits
  behind Cloudflare Access at the edge; once a request reaches the
  app, the user has already been authenticated by Zero Trust.

A single warning is logged at startup when the skip is active so the
behavior is loud in container logs.

Also tolerates an empty JWKS_URL in debug mode — InitializeJWKS is
skipped entirely, since AuthMiddleware (the only JWKS consumer) is
not registered. The release-mode fatal-on-empty stays unchanged so
production misconfiguration is still caught early.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites the deploy compose stack to host two backend instances on a
single machine, each in its own isolated network, both exposed through
a single Cloudflare Tunnel and kept up-to-date by Watchtower.

Six services in compose.deploy.yml:
- postgres-prod / postgres-dev — one Postgres per env, no host bind,
  separate named volumes (postgres_prod_data, postgres_dev_data)
- backend-prod (image: :stable) — public, full auth, GIN_MODE=release
- backend-dev  (image: :latest) — Cloudflare Access-gated, GIN_MODE=debug
- cloudflared  — single tunnel `jb`, ingress driven by
  deploy/cloudflared/config.yml (two hostnames → two backends)
- watchtower   — label-enable mode, polls GHCR every 5 min, mounts
  ${HOME}/.docker/config.json for private-registry auth

Network isolation:
                     prod-net  dev-net  jpcorrect-shared
  postgres-prod        ✓
  postgres-dev                   ✓
  backend-prod         ✓                       ✓
  backend-dev                    ✓             ✓
  cloudflared                                  ✓

So backend-dev cannot reach postgres-prod (different network, no DNS).
Both backends reach the sibling jpcorrect-api-tools container via the
external jpcorrect-shared bridge.
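
The isolation argument is that two containers can talk only if they share at least one network. A toy model of the membership table above makes that checkable. It is an illustration only: the `networks` map simply transcribes the table, `canReach` is a hypothetical helper, and real reachability is enforced by Docker, not by this code.

```go
package main

import "fmt"

// networks transcribes the membership table from the commit message.
var networks = map[string][]string{
	"postgres-prod": {"prod-net"},
	"postgres-dev":  {"dev-net"},
	"backend-prod":  {"prod-net", "jpcorrect-shared"},
	"backend-dev":   {"dev-net", "jpcorrect-shared"},
	"cloudflared":   {"jpcorrect-shared"},
}

// canReach reports whether containers a and b share at least one network.
// Unknown container names have no networks and are therefore unreachable.
func canReach(a, b string) bool {
	for _, na := range networks[a] {
		for _, nb := range networks[b] {
			if na == nb {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println("backend-dev  -> postgres-prod:", canReach("backend-dev", "postgres-prod"))
	fmt.Println("backend-dev  -> postgres-dev: ", canReach("backend-dev", "postgres-dev"))
	fmt.Println("cloudflared  -> backend-prod: ", canReach("cloudflared", "backend-prod"))
	fmt.Println("cloudflared  -> postgres-prod:", canReach("cloudflared", "postgres-prod"))
}
```

Note that both backends share jpcorrect-shared, so in this model they can reach each other; the isolation guarantee applies to the databases, which are attached to neither the shared network nor each other's.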

Other artifacts:
- compose.local.yml: a dedicated local-dev Postgres bound to 127.0.0.1
  (separate postgres_local_data volume) so `make air` keeps working
  without touching the deploy postgreses
- deploy/cloudflared/config.yml: tunnel ingress config, committed (no
  secrets); bind-mounted on top of ./.cloudflared/ so the credentials
  dir stays a pure secrets-only location
- .env.deploy.{prod,dev}.example: split from the old single
  .env.deploy.example (deleted in the previous commit). Prod sets
  GIN_MODE=release with full auth; dev sets GIN_MODE=debug and leaves
  CLIENT_API_KEY / JWKS_URL empty (middlewares are skipped in debug)
- Makefile: deploy-up-{prod,dev}, deploy-pull-{prod,dev},
  deploy-up-infra targets. db-up now points at compose.local.yml
- .gitignore: track *.example files explicitly while ignoring real
  .env.deploy.{prod,dev} env files

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a `:stable` raw tag to the docker/metadata-action output that
fires only on git tag pushes matching `v*.*.*`. Together with the
existing `:latest` (default-branch only) and semver tags, this lets
the production backend track `:stable` so it only auto-updates when
a release is explicitly cut, while the dev backend tracks `:latest`
and gets every main merge.

Trade-off vs reusing semver tags directly: `:stable` is a moving
pointer (always the most recent semver release), so prod's
Watchtower instance can poll a single tag rather than needing the
operator to bump the BACKEND_PROD_IMAGE env each release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… skip

Rewrites the Deployment stack section of AGENTS.md (CLAUDE.md is a
symlink) to cover the new two-env layout: prod vs dev posture, the
network membership table, the one-time host setup (jpcorrect-shared
network, tunnel route dns, Cloudflare Access policy, GHCR docker
login), and the day-to-day make targets.

Updates the environment-variable table to reflect that:
- CLIENT_API_KEY is release-mode only (debug skips APIKeyMiddleware)
- JWKS_URL is required only in release mode
- GIN_MODE doubles as the auth-skip signal — debug builds disable
  app middlewares and require an edge gateway

Updates Common Gotchas #3, #8, #9, #11 to match the new behavior.
Replaces the old single-image BACKEND_IMAGE env-var row with the
prod/dev pair (BACKEND_PROD_IMAGE → :stable, BACKEND_DEV_IMAGE
→ :latest) and notes that POSTGRES_PORT now belongs to the local
compose stack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@torrid-fish

Member Author

Rebased onto new feat/api-tools tip (2026-05-23)

PR #38 had its history rewritten to drop the api-tools git submodule (see the comment on that PR). This branch was rebased on top to keep PR #47's diff clean.

Same 4 commits, new SHAs:

| Old | New | Commit |
| --- | --- | --- |
| 99bec5e | ace155f | feat(api): skip auth middlewares when GIN_MODE=debug |
| ba999ee | 2a752f3 | build(deploy): split compose stack into prod + dev with watchtower CD |
| 1dfa5a3 | 47cffd1 | ci(cd): publish :stable tag on semver releases |
| 660f17a | 5d02954 | docs(agents): document two-env deploy stack and debug-mode middleware skip |

Content unchanged. Verified locally: go build, go vet, go test, docker compose -f compose.deploy.yml config all clean.

🤖 Generated with Claude Code

@torrid-fish torrid-fish marked this pull request as draft May 27, 2026 11:28
@sessatakuma sessatakuma deleted a comment from chatgpt-codex-connector Bot May 27, 2026
torrid-fish and others added 6 commits May 27, 2026 11:34
…from GHCR

Move all deploy-only files under deploy/ to separate them from dev/build
files at the repo root:

- compose.deploy.yml      -> deploy/compose.yml   (pins top-level name:
  jpcorrect-backend so volume/network names are unaffected by the move;
  ./ bind mounts now resolve relative to deploy/)
- compose.local.yml       -> compose.yml          (root, local-dev Postgres)
- .env.deploy.{prod,dev}.example -> deploy/env/{prod,dev}.example
- cloudflared creds/config now under deploy/cloudflared/

Add api-tools as a service in the deploy stack, pulled from GHCR
(ghcr.io/sessatakuma/api-tools, default :stable, override via
API_TOOLS_IMAGE) instead of cloned/run locally. A single shared instance
serves both backends over the jpcorrect-shared bridge; watchtower watches
it alongside the two backends. New deploy/env/api-tools.example holds
YAHOO_API_KEY.

Split deploy Make targets into deploy/Makefile (run via make -C deploy
<target>; docker compose auto-detects deploy/compose.yml, no -f needed).
Root Makefile trimmed to air + swag; local Postgres is driven with raw
docker compose up -d / logs -f / stop.

.gitignore now tracks deploy/env/*.example and ignores deploy/env/{prod,
dev,api-tools} + deploy/cloudflared/creds/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mmands

Sync AGENTS.md + README.md with the consolidated deploy/ stack:
- deploy/ file tree, project-name pinning note, network table (+api-tools row)
- api-tools served from GHCR in the deploy stack (cloned via uv only for dev)
- one-time host setup drops the sibling api-tools launch, adds a
  docker rm -f jpcorrect-api-tools migration note
- day-to-day uses make -C deploy <target> + cp api-tools.example
- env tables add API_TOOLS_IMAGE / api-tools.example
- local-dev commands rewritten to raw docker compose up -d / logs -f / stop
  and the raw uv run uvicorn invocation (removed make db-*/api-tools targets)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The local-unidic api-tools image bundles the UniDic dictionary and needs
no env vars (accent/furigana no longer come from the Yahoo MA API).

- compose.yml: remove the api-tools env_file (env/api-tools)
- delete deploy/env/api-tools + deploy/env/api-tools.example
- strip YAHOO_API_KEY from .env.example, README, AGENTS.md
- update AGENTS.md deploy tree, env table, gotcha #13 accordingly

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lets a feature branch be deployed to the dev env without touching the
:latest/:stable tags. The tag is emitted only for workflow_dispatch runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A baked-in host=localhost:8080 made Swagger UI's Execute fire requests at
localhost regardless of where the UI is served, so Try-it-out failed on
api-dev.sessatakuma.dev. Omitting @host lets Swagger UI fall back to the
page origin, working across local/dev/prod without per-env regen.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n daemons

containrrr/watchtower:latest is essentially dormant and defaults to
Docker Engine API v1.25, which Docker daemons >= v28 (API >= 1.40)
reject as too old. Without this, the container comes up "healthy" but
every monitoring tick errors out with "client version 1.25 is too
old", and images silently never auto-update. Pin to 1.41 - high enough
for modern daemons to accept, low enough for the stale watchtower
client to honor.
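
The failure mode is a version floor: the daemon refuses any client whose API version is below its minimum. The comparison can be sketched as follows, using the version numbers from this commit message as inputs. `parseAPIVersion` and `tooOld` are hypothetical helpers for illustration and are not part of Docker or Watchtower.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseAPIVersion splits a version such as "1.41" into its numeric parts.
func parseAPIVersion(v string) (major, minor int, err error) {
	maj, min, ok := strings.Cut(v, ".")
	if !ok {
		return 0, 0, fmt.Errorf("malformed API version %q", v)
	}
	if major, err = strconv.Atoi(maj); err != nil {
		return 0, 0, fmt.Errorf("malformed API version %q", v)
	}
	if minor, err = strconv.Atoi(min); err != nil {
		return 0, 0, fmt.Errorf("malformed API version %q", v)
	}
	return major, minor, nil
}

// tooOld reports whether client is below the daemon's minimum version.
// The parts are compared numerically, so "1.9" sorts below "1.25".
func tooOld(client, minimum string) (bool, error) {
	cMaj, cMin, err := parseAPIVersion(client)
	if err != nil {
		return false, err
	}
	mMaj, mMin, err := parseAPIVersion(minimum)
	if err != nil {
		return false, err
	}
	if cMaj != mMaj {
		return cMaj < mMaj, nil
	}
	return cMin < mMin, nil
}

func main() {
	// Figures taken from the commit message: default 1.25, floor 1.40, pin 1.41.
	for _, client := range []string{"1.25", "1.41"} {
		old, err := tooOld(client, "1.40")
		fmt.Printf("client %s vs minimum 1.40: tooOld=%v err=%v\n", client, old, err)
	}
}
```

The numeric comparison matters because API versions are not decimals: a plain string or float comparison would rank 1.9 above 1.25.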

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>