test: [YW-266] stabilize native-DB-concurrency test flakes (WsManager + mapping-store)#147
Conversation
📝 WalkthroughWalkthroughUpdates the ChangesCI and Test Configuration
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
c3854eb to
f4d8486
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c3854eb69b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
f4d8486 to
582c24f
Compare
… + mapping-store) Two native-DB (PGlite/WASM) test races that ate CI re-runs across unrelated PRs this cycle. Fixed at the mechanism level — no test.retry(), no bare timeout bumps to mask the races. server-core (mapping-store flake): vitest 4.x runs all five server-core test files in parallel forks. Every file spins up PGlite (WASM Postgres), so they cold-start the WASM module simultaneously and contend for file descriptors and memory on a loaded CI runner. The first test in each file — "mapping-store > get returns the record written by set" — was the most frequent victim because it runs before any warm-up query. Serialize the suite (maxWorkers/minWorkers 1) so the files cold-start one at a time. The tests are I/O-bound, so the wall-clock cost is small. client / WsManager (the original report) is fixed in the wystack submodule bump: ws.test.ts leaked ~12 PGlite WASM instances (sync afterEach never closed them), squeezing the 2500ms-sleep no-reconnect tests against the 5s bun timeout. The submodule commit adds a tracked-handle registry drained in an async afterEach plus timeout headroom. See wystack PR for detail. De-flake verified locally, each file run 20x with 0 failures: - server-core mapping-store.test.ts: 20/20 pass - client ws.test.ts (submodule): 20/20 pass NOTE: submodule points at the wystack YW-266 branch commit; re-point to the merged wystack main commit before merging this PR. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
582c24f to
f91dade
Compare
Problem
Two native-DB (PGlite/WASM) test races ate CI re-runs across unrelated PRs this cycle (e.g. #138 hit the mapping-store flake despite touching zero server-core files). They are a class, not one test:
@wystack/clientWsManager > does not reconnect on close code 4001— original report.@wystack/server-coremapping-store.test.ts > DrizzleMappingStore > get returns the record written by set— flaked on a PR touching zero server-core files; 53s file duration = native-DB timing contention.Both are test-setup races, fixed at the mechanism level — no
test.retry(), no bare timeout bumps to mask the race.Fixes
server-core —
vitest.config.ts(this repo)vitest 4.x runs all five server-core test files in parallel forks. Every file spins up PGlite (WASM Postgres), so they cold-start the WASM module simultaneously and contend for FDs/memory on a loaded runner. The first test in each file runs before any warm-up query, so it's the victim. Serialize the suite (
maxWorkers: 1/minWorkers: 1) so files cold-start one at a time. Tests are I/O-bound — small wall-clock cost.client / WsManager — wystack submodule bump
ws.test.tsleaked ~12 PGlite WASM instances (syncafterEachnever closed them), squeezing the 2500ms-sleep no-reconnect tests against the 5s bun timeout. The submodule commit adds a tracked-handle registry drained in an asyncafterEach, plus timeout headroom. Detail in youhaowei/wystack#46.Verification
De-flake proven by looped local runs (a single green run does not prove a flake fixed):
mapping-store.test.ts: 20/20 pass, 0 failws.test.ts(submodule): 20/20 pass, 0 failmaxWorkers:1): 70 passed, 1 todoSubmodule sequencing
The
libs/wystackpointer references the YW-266 branch commit so CI runs green against the fix now. Re-point to the merged wystack main commit before merging this PR (depends on youhaowei/wystack#46).Tracked internally as YW-266.
🤖 Generated with Claude Code
Greptile Summary
This PR stabilizes two recurring CI flakes caused by PGlite/WASM test-setup races — no retries or timeout masking. Both fixes target the mechanism: serializing the server-core vitest worker pool so five PGlite files no longer cold-start simultaneously, and bumping the wystack submodule to pick up an async
afterEachthat properly drains leaked WASM handles inws.test.ts.packages/server-core/vitest.config.ts: addsmaxWorkers: 1/minWorkers: 1with a detailed comment; forces sequential file execution and eliminates the FD/memory contention that caused the first test in each file to be the flake victim.libs/wystack: advances the submodule pointer to the wystack YW-266 branch commit (4bed431) which adds a tracked-handle registry drained via asyncafterEach; the PR description explicitly notes this must be re-pointed to the merged main SHA before merging.Confidence Score: 5/5
Safe to merge once the wystack submodule is re-pointed to the merged main commit as documented.
Both changes are targeted and low-risk: the vitest config serializes existing test files (no production code touched), and the submodule bump only affects test infrastructure. The pre-merge gate (re-pointing the submodule to wystack main after wystack#46 merges) is clearly documented in the PR description and acknowledged by the author.
No files require special attention beyond the documented pre-merge step of re-pointing libs/wystack to the merged wystack main SHA.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[CI: server-core test run] --> B{Before fix\nmaxWorkers default} B --> C[Fork 1: mapping-store.test.ts\nPGlite WASM cold-start] B --> D[Fork 2: other.test.ts\nPGlite WASM cold-start] B --> E[Fork 3: ...\nPGlite WASM cold-start] C & D & E --> F[Contend for FDs/memory\non loaded CI runner] F --> G[First test in each file\nflakes / times out] A2[CI: server-core test run] --> B2{After fix\nmaxWorkers: 1} B2 --> H[Fork 1: mapping-store.test.ts\nPGlite WASM cold-start] H --> I[Fork 1 done] I --> J[Fork 2: other.test.ts\nPGlite WASM cold-start] J --> K[... sequential ...] K --> L[All tests pass] style G fill:#f88,stroke:#c00 style L fill:#8f8,stroke:#0c0%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% flowchart TD A[CI: server-core test run] --> B{Before fix\nmaxWorkers default} B --> C[Fork 1: mapping-store.test.ts\nPGlite WASM cold-start] B --> D[Fork 2: other.test.ts\nPGlite WASM cold-start] B --> E[Fork 3: ...\nPGlite WASM cold-start] C & D & E --> F[Contend for FDs/memory\non loaded CI runner] F --> G[First test in each file\nflakes / times out] A2[CI: server-core test run] --> B2{After fix\nmaxWorkers: 1} B2 --> H[Fork 1: mapping-store.test.ts\nPGlite WASM cold-start] H --> I[Fork 1 done] I --> J[Fork 2: other.test.ts\nPGlite WASM cold-start] J --> K[... sequential ...] K --> L[All tests pass] style G fill:#f88,stroke:#c00 style L fill:#8f8,stroke:#0c0Reviews (3): Last reviewed commit: "chore: re-point libs/wystack to main (YW..." | Re-trigger Greptile