feat(web): edge cache HTML + metadata routes at Cloudflare #12

Merged: caio-pizzol merged 1 commit into main from caio-pizzol/edge-cache-html-and-metadata on May 19, 2026

@caio-pizzol

Stops paying for an R2 GetObject operation on every page view for HTML and the three metadata files. This is a two-part change, and only one part lives in this repo.

Out-of-repo (already live, recorded in apps/web/README.md): a zone-level Cloudflare Cache Rule that enables caching for the homepage, the four content pages, the type and topic indexes plus the per-type and per-topic pages, and `/sitemap.xml`, `/robots.txt`, `/llms.txt`. The action is `cache: true` with `edge_ttl.mode: bypass_by_default`, so the origin Cache-Control header drives the TTL.

In-repo: `deploy-site.yml` now adds `Cache-Control: public, max-age=300, stale-while-revalidate=3600` to HTML uploads and to the three metadata files. Non-HTML / non-metadata files (favicons, og-image, logo) keep their previous upload behavior so default static-asset caching is unchanged.
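The header selection described above can be sketched as a small shell helper. This is an illustration only, not the actual contents of `deploy-site.yml`: the function name `cache_control_for` and the idea of passing a path relative to the build output are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not taken from deploy-site.yml.
# Header value as stated in this PR.
CACHE_HEADER="public, max-age=300, stale-while-revalidate=3600"

# Print the Cache-Control value for a path relative to the build output.
# Prints nothing for files that should keep their previous upload behavior.
cache_control_for() {
  case "$1" in
    *.html|sitemap.xml|robots.txt|llms.txt)
      printf '%s\n' "$CACHE_HEADER"
      ;;
    *)
      # favicons, og-image, logo, and other static assets: no explicit header
      ;;
  esac
}
```

An upload step could call this once per file and pass the header to the upload command only when the output is non-empty, which leaves static assets on Cloudflare's default caching.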

Verified before merge:

  • Cache Rule created via API: ruleset `bec0ea98e0444bcaaf518d626b4c47e2`, rule `b85abcfdb1fa4d07a160e88f7cc4fafd`, enabled
  • Expression narrowed to known cacheable paths only; future routes need explicit addition (intentional)
  • X-Robots-Tag transform rule (`http_response_headers_transform` ruleset) unchanged; `/documents/` and `/extracted/` not in the cache expression
  • Lefthook format/lint/typecheck green

Smoke test after merge auto-deploys:
```bash
URL="https://docxcorp.us/dataset"
curl -sI "$URL" | grep -iE "cf-cache-status|cache-control" # first request: expect MISS, Cache-Control set by origin
curl -sI "$URL" | grep -i "cf-cache-status" # second request within 5 min: expect HIT
curl -sI "https://docxcorp.us/documents/.docx" | grep -i "x-robots-tag" # noindex preserved
```

Rollback if needed: revert this PR (origin headers go away) and disable the Cache Rule in the Cloudflare dashboard.

Cloudflare doesn't cache HTML by default. After the Astro migration,
HTML pages were returning cf-cache-status: DYNAMIC, hitting R2 on
every request. Two-part fix:

1. A Cloudflare zone-level Cache Rule (set outside the repo, recorded
   in apps/web/README.md) enables caching for: '/', the four content
   pages, the two index pages, /types/* and /topics/*, and the three
   metadata files. Static assets keep Cloudflare's default behavior.
   Action: cache=true, edge_ttl.mode=bypass_by_default (TTL driven by
   origin Cache-Control).

2. deploy-site.yml sets Cache-Control: public, max-age=300,
   stale-while-revalidate=3600 on HTML uploads and on sitemap.xml,
   robots.txt, llms.txt. Non-HTML/non-metadata files keep their
   previous upload behavior so favicon/og-image/logo caching is
   unchanged.

After this deploys, two same-URL requests in a row should show
cf-cache-status: MISS then HIT. /documents/* and /extracted/* are not
in the Cache Rule expression, so the noindex transform on raw files
is unaffected.
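
The MISS-then-HIT expectation above could be asserted in a script instead of read by eye. A minimal sketch, assuming response headers arrive on stdin (for example from `curl -sI`); the function name `cache_status` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the cf-cache-status value from response
# headers on stdin. Header names are matched case-insensitively and
# carriage returns from the HTTP response are stripped.
cache_status() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "cf-cache-status" { print toupper($2) }'
}

# Example use against a live URL (not executed here):
#   first="$(curl -sI "$URL" | cache_status)"    # expected: MISS
#   second="$(curl -sI "$URL" | cache_status)"   # expected: HIT
```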
@codecov

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@caio-pizzol merged commit b4bef84 into main on May 19, 2026 (2 checks passed)
@caio-pizzol deleted the caio-pizzol/edge-cache-html-and-metadata branch on May 19, 2026, 18:00
@caio-pizzol pushed a commit that referenced this pull request on May 19, 2026:

After PR #12 deployed, live Cache-Control returned max-age=14400 from
Cloudflare's zone-level Browser Cache TTL default, overriding origin
max-age=300. Patched the existing rule (id b85abcfdb1fa4d07a160e88f7cc4fafd)
via API to add browser_ttl.mode: respect_origin. Live now serves origin
max-age on HTML/metadata paths while keeping the 4-hour zone default for
static assets and content-addressed raw files where it's actually desired.

This commit keeps the README accurate; the Cloudflare rule itself was
updated out-of-band via API.
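
The regression described above (live `max-age=14400` instead of the origin's `max-age=300`) is the kind of thing a header check can catch. A minimal sketch, assuming response headers on stdin; the function name `max_age_of` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the max-age value (in seconds) from the
# Cache-Control header in a set of response headers read from stdin.
# Prints nothing if there is no Cache-Control header or no max-age directive.
max_age_of() {
  tr -d '\r' \
    | grep -i '^cache-control:' \
    | grep -oE 'max-age=[0-9]+' \
    | head -n 1 \
    | cut -d= -f2
}

# Example use against a live URL (not executed here):
#   [ "$(curl -sI "$URL" | max_age_of)" = "300" ] || echo "browser TTL is not the origin value"
```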