refactor(producer): extract captureStage (SDR disk path) #730
Move the SDR / DOM-only-HDR disk-capture body out of `executeRenderJob` into `services/render/stages/captureStage.ts`.

Covers both branches of the disk path: parallel capture via `executeDiskCaptureWithAdaptiveRetry` (`workerCount > 1`) and sequential per-process capture (`workerCount === 1`, reusing `probeSession` when available). The HDR layered branch (`useLayeredComposite === true`) and the streaming encode fusion path (`useStreamingEncode === true` with successful encoder spawn) stay inline in the sequencer; they will be extracted by the next two PRs in the stack.

Hard constraints preserved verbatim:
- `probeSession` is closed (and the sequencer's `let probeSession` nulled via the returned result) at the same points.
- `captureAttempts` is mutated in place: the parallel retry loop still pushes each attempt onto the array the sequencer owns.
- `workerCount` reassignment from adaptive retry survives via the returned result.
- `lastBrowserConsole` is set to the buffer of whichever session was active last (probe close path or sequential capture `finally`).
- `job.framesRendered` is updated at the same per-frame / per-progress points; `Capturing frame N/M [(K workers)]` `updateJobStatus` payloads fire at the same 30-frame and completion checkpoints.
- `perfStages.captureMs` is still computed by the sequencer from the outer `stage4Start` so its window covers both the in-sequencer setup (fileServer init, calibration, worker resolution, preset selection) AND the capture call.

Two small new exports on `renderOrchestrator.ts`:
- `executeDiskCaptureWithAdaptiveRetry`: was a private helper; the stage calls it directly.
- `updateJobStatus`: was a private helper; the stage uses it for the per-frame progress callbacks so the `completedAt` branch matches.

These re-introduce a small runtime cycle between the stage and the orchestrator (orchestrator imports `runCaptureStage`; stage imports helpers back). The cycle is safe (both modules finish loading before any stage function is invoked at runtime) and will be flattened in a follow-up PR that consolidates capture helpers into a shared module.

Removes the now-orphaned `captureFrame` import from the orchestrator.

Verified inside `Dockerfile.test`:
- `font-variant-numeric`: audio correlation 1.000
- `many-cuts`: 0 failed frames, audio correlation 0.994
- `variables-prod`: PSNR ~69 dB, audio correlation 0.975
- `sub-composition-video`: PSNR ~43-52 dB, audio correlation 0.947 (exercises video extraction + capture end-to-end)
- `gsap-letters-render-compat`: PSNR ~53-55 dB, audio correlation 1.000 (exercises the parallel capture path; 5/5 PASS overall)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
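The hard constraints above can be pictured with a small sketch. Everything here is a simplified stand-in: the `*Sketch` types, `fakeAdaptiveRetry`, and the synchronous session handling are assumptions made for illustration and are not the real `captureStage.ts` API. It only shows the shape of the contract: the caller-owned `captureAttempts` array is appended to in place, while `workerCount`, `probeSession`, and `lastBrowserConsole` travel back through the returned result.

```typescript
// Simplified, hypothetical shapes; the real stage input and result carry more fields.
interface AttemptSketch {
  workers: number;
  ok: boolean;
}

interface SessionSketch {
  consoleBuffer: string[];
  closed: boolean;
}

interface StageInputSketch {
  workerCount: number;
  probeSession: SessionSketch | null;
  captureAttempts: AttemptSketch[]; // owned by the sequencer, appended to in place
}

interface StageResultSketch {
  workerCount: number; // possibly reduced by adaptive retry
  probeSession: null; // always null so the caller can clear its own variable
  lastBrowserConsole: string[];
}

// Stand-in for adaptive retry: fail once, then succeed with half the workers.
function fakeAdaptiveRetry(workers: number): AttemptSketch[] {
  return [
    { workers, ok: false },
    { workers: Math.max(1, Math.floor(workers / 2)), ok: true },
  ];
}

function runCaptureStageSketch(input: StageInputSketch): StageResultSketch {
  let stageWorkers = input.workerCount;
  let lastBrowserConsole: string[] = [];

  if (stageWorkers > 1) {
    // Parallel path: record every attempt on the caller's array.
    const attempts = fakeAdaptiveRetry(stageWorkers);
    input.captureAttempts.push(...attempts);
    const last = attempts[attempts.length - 1];
    if (last) stageWorkers = last.workers;
    if (input.probeSession) {
      lastBrowserConsole = input.probeSession.consoleBuffer;
      input.probeSession.closed = true;
    }
  } else {
    // Sequential path: reuse the probe session when there is one.
    const session = input.probeSession ?? { consoleBuffer: [], closed: false };
    try {
      session.consoleBuffer.push("captured sequentially");
    } finally {
      lastBrowserConsole = session.consoleBuffer;
      session.closed = true;
    }
  }
  return { workerCount: stageWorkers, probeSession: null, lastBrowserConsole };
}

// Sequencer side: reassign its own `let` bindings from the result.
const captureAttempts: AttemptSketch[] = [];
const probeRef: SessionSketch = { consoleBuffer: ["probe log"], closed: false };
let probeSession: SessionSketch | null = probeRef;
let workerCount = 4;
const stageResult = runCaptureStageSketch({ workerCount, probeSession, captureAttempts });
workerCount = stageResult.workerCount;
probeSession = stageResult.probeSession;
```

In this sketch the sequencer ends with two recorded attempts, a halved `workerCount`, and a cleared `probeSession`, which is the same hand-back pattern the commit message describes.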
vanceingalls
left a comment
Verdict: approve.
Extraction of the SDR / DOM-only-HDR disk-capture path into captureStage. Both branches (workerCount > 1 parallel adaptive-retry, workerCount === 1 sequential) lifted byte-clean.
Verified against the pre-#730 orchestrator:
- `captureAttempts` mutation preserved: the stage still appends to the caller-owned array via `captureAttempts.push(...attempts)`. The JSDoc on `CaptureStageInput.captureAttempts` calls this out explicitly.
- `workerCount` adaptive-shrink preserved: the last attempt's `workers` count flows back to the sequencer through the result.
- `probeSession` close-and-null contract preserved: closed in the parallel branch when present, reused in the sequential branch via `prepareCaptureSessionForReuse`. Either way the result returns `probeSession: null` and the sequencer's `let probeSession` is updated.
- `lastBrowserConsole` set to the correct session's buffer in both branches (parallel: probe session's buffer if it existed; sequential: the working session's buffer in `finally`). This matches the audit-all-sites rule: every place that previously assigned `lastBrowserConsole` still does, with the same source.
- Progress callback semantics identical: 30-frame checkpoint + completion (parallel) or per-frame (sequential), same `Capturing frame N/M` strings, same `25 + frameProgress * 45` progress math.
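For readers who want the progress math spelled out, here is a minimal sketch. The function names and the exact checkpoint predicate (`frame % 30 === 0` or the final frame) are assumptions for illustration; only the `25 + frameProgress * 45` formula and the "30-frame checkpoint + completion" description come from the review above.

```typescript
// Maps capture progress onto the 25..70 slice of the overall job percentage.
function captureProgressPercent(framesRendered: number, totalFrames: number): number {
  const frameProgress = framesRendered / totalFrames;
  return 25 + frameProgress * 45;
}

// Assumed checkpoint rule for the parallel path: every 30th frame, plus completion.
function isParallelCheckpoint(framesRendered: number, totalFrames: number): boolean {
  return framesRendered % 30 === 0 || framesRendered === totalFrames;
}

// Collect the percentages that would be reported for a 100-frame parallel capture.
const sketchTotalFrames = 100;
const reportedPercents: number[] = [];
for (let frame = 1; frame <= sketchTotalFrames; frame++) {
  if (isParallelCheckpoint(frame, sketchTotalFrames)) {
    reportedPercents.push(captureProgressPercent(frame, sketchTotalFrames));
  }
}
```

Under these assumptions a 100-frame parallel capture reports at frames 30, 60, 90, and 100, and the stage's share of the job runs from 25% at the start to 70% at completion.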
The `totalFrames` narrowing (input declares `number`, sequencer narrows from `job.totalFrames: number | undefined` via `probeStage`) is the cleanest typing improvement in the stack so far. Pre-#730 the orchestrator just read `job.totalFrames` directly inside the loop, which TypeScript was tolerating via the early throw in `probeStage`.
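A minimal sketch of this kind of narrowing, assuming hypothetical `JobSketch` and `CaptureInputSketch` shapes and a hypothetical `narrowTotalFrames` helper (the real code narrows via `probeStage`, whose signature is not shown here):

```typescript
interface JobSketch {
  totalFrames: number | undefined; // unknown until probing has run
}

interface CaptureInputSketch {
  totalFrames: number; // the stage never has to handle undefined
}

// Throwing early lets TypeScript narrow `number | undefined` down to `number`.
function narrowTotalFrames(job: JobSketch): number {
  if (job.totalFrames === undefined) {
    throw new Error("totalFrames is not set; probing must run before capture");
  }
  return job.totalFrames;
}

// The sequencer does the narrowing once and hands the stage a plain number.
function buildCaptureInput(job: JobSketch): CaptureInputSketch {
  return { totalFrames: narrowTotalFrames(job) };
}
```

The benefit is that the `undefined` check lives in one place at the boundary, so code inside the stage can use `totalFrames` in arithmetic without repeated guards.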
Important (non-blocking, cross-stack):
- `captureStage.ts:55-64`: imports `executeDiskCaptureWithAdaptiveRetry` from `renderOrchestrator.ts`, which now imports the stage back. The header comment acknowledges this as a runtime cycle and points to a future consolidation. #737 only moves `updateJobStatus`; `executeDiskCaptureWithAdaptiveRetry` (plus `countFrameRanges`, `safeCleanup`, `sampleDirectoryBytes` from #720) stays in the orchestrator. Worth filing as a follow-up before the stack lands so it doesn't get forgotten: the cycle is safe at runtime today, but it's a structural smell that gets harder to undo as more stages depend on the orchestrator's helpers.
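Why such a cycle is tolerable at runtime can be illustrated with a single-file analogy. This is not how the real modules are wired: the two plain objects below merely stand in for the export tables of the orchestrator and the stage, and every name is hypothetical. The point it shows is that a cross-reference only breaks when it is dereferenced before the other side has finished initializing.

```typescript
interface OrchestratorExportsSketch {
  executeDiskCapture: (frames: number) => number;
  executeRenderJob: (frames: number) => number;
}

interface StageExportsSketch {
  runCaptureStage: (frames: number) => number;
}

// Empty "export tables", filled in as each simulated module initializes.
const orchestratorModule = {} as OrchestratorExportsSketch;
const stageModule = {} as StageExportsSketch;

// Stage initializes first. It references the orchestrator helper only inside
// the function body, so nothing is dereferenced yet.
stageModule.runCaptureStage = (frames) => orchestratorModule.executeDiskCapture(frames) + 1;

// Calling the stage before the orchestrator has initialized is the unsafe case.
let eagerCallFailed = false;
try {
  stageModule.runCaptureStage(10);
} catch {
  eagerCallFailed = true; // the helper was still undefined at this point
}

// Orchestrator finishes initializing; from here on the cycle is harmless.
orchestratorModule.executeDiskCapture = (frames) => frames * 2;
orchestratorModule.executeRenderJob = (frames) => stageModule.runCaptureStage(frames);

const cycleResult = orchestratorModule.executeRenderJob(10);
```

The structural concern raised in the review still stands: the safety depends on nobody invoking a stage function during module initialization, which is an invariant the import graph itself does not enforce.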
Nits:
- `captureStage.ts:128`: the parallel branch's `Capturing frame N/M (W workers)` payload uses the original `workerCount` for the `(W workers)` suffix, not the adaptive-reduced one. This is faithful to the pre-extraction code (it had the same behavior) but is arguably a latent bug: after the first retry, the message reports the wrong worker count. Out of scope for this PR; flagged for future cleanup.
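The nit can be reproduced in a few lines. The helper and the attempt list below are hypothetical; the sketch only contrasts a message built from the original worker count with one built from the count each attempt actually used.

```typescript
// Builds the status string in the "Capturing frame N/M (W workers)" shape.
function captureMessageSketch(frame: number, total: number, workers: number): string {
  return `Capturing frame ${frame}/${total} (${workers} workers)`;
}

// Assumed scenario: the first attempt used 4 workers, the retry shrank to 2.
const originalWorkerCount = 4;
const attemptWorkers = [4, 2];

// Current behavior as described: the suffix always reflects the original count.
const staleMessages = attemptWorkers.map(() =>
  captureMessageSketch(30, 300, originalWorkerCount),
);

// Possible future cleanup: use the worker count of the attempt in progress.
const accurateMessages = attemptWorkers.map((workers) =>
  captureMessageSketch(30, 300, workers),
);
```

In this scenario the second stale message still says `(4 workers)` while the retry is running with 2, which is the mismatch the nit describes.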
Praise: the inline comments inside the stage are denser than the equivalent inline comments inside executeRenderJob were — moving the code into a module justified the extra documentation, and the result is more readable than the original.
— Vai
miguel-heygen
left a comment
Clean mechanical extraction — no behavior changes, no introduced bugs. Verified imports, error handling, and cleanup invariants are preserved. LGTM. — Magi

What
Phase 1 PR 1.6 of the distributed-render refactor. Moves the SDR / DOM-only-HDR disk-capture body of `executeRenderJob` into `packages/producer/src/services/render/stages/captureStage.ts`. The sequencer now calls `runCaptureStage` at the same code point with identical inputs and outputs.

Covers both branches of the disk-capture path:
- `workerCount > 1`: parallel capture via `executeDiskCaptureWithAdaptiveRetry`, with adaptive retry that can shrink the worker count.
- `workerCount === 1`: sequential capture in the orchestrator process, reusing `probeSession` when non-null.

The HDR layered branch (`useLayeredComposite === true`) and the streaming encode fusion path (`useStreamingEncode === true` with a successful encoder spawn) stay inline in the sequencer for now; they will be extracted by PR 1.7 and PR 1.8 in the same stack.

Why
Continues the Phase 1 mechanical extraction. The capture stage is the largest single piece of the refactor and the one most likely to drift on a sloppy extraction, so this PR carries the most verification artifacts (5 fixtures in the Docker smoke run, spanning calibration / parallel capture / sub-composition video / GSAP / static text).
How
- `captureStage.ts` exports `runCaptureStage(input) → CaptureStageResult`. The function body is the existing disk-capture code lifted verbatim with identical control flow, identical `updateJobStatus` payload shapes, and the same closure references (`buildCaptureOptions`, `createRenderVideoFrameInjector`) passed in by the sequencer.
- The sequencer replaces the `} else {` disk-capture branch with a call to `runCaptureStage(...)`. Re-assigns `workerCount`, `probeSession`, and `lastBrowserConsole` from the returned result.
- Two small new exports on `renderOrchestrator.ts`:
  - `executeDiskCaptureWithAdaptiveRetry` (was private)
  - `updateJobStatus` (was private)
- Removes the now-orphaned `captureFrame` import from the orchestrator (oxlint flagged).

Preserved invariants
- `captureAttempts` mutated in place: the parallel path still appends each retry attempt onto the array the sequencer owns.
- `workerCount` is reduced by adaptive retry; the final value returns to the sequencer and is used for the streaming-encode gate logging in subsequent code.
- `probeSession` is closed at the same code points (parallel: after retries; sequential: in the session's `finally`).
- `lastBrowserConsole` is set to the buffer of whichever session was active last.
- `job.framesRendered` is updated at the same per-frame and per-progress points.
- `updateJobStatus(..., "rendering", "Capturing frame N/M [(K workers)]", ...)` fires at the same 30-frame and completion checkpoints (parallel) or every frame (sequential), with the same percentage math `25 + frameProgress * 45`.
- `perfStages.captureMs` is still computed by the sequencer from the outer `stage4Start`, so the timing window covers both the in-sequencer setup (file server init, calibration, worker resolution, preset selection) AND the capture call.

Known follow-up: small runtime import cycle
The stage imports `executeDiskCaptureWithAdaptiveRetry` and `updateJobStatus` (runtime) plus `CaptureAttemptSummary`, `ProgressCallback`, `RenderJob` (types) from `renderOrchestrator.ts`. The orchestrator imports `runCaptureStage` from the stage. This re-introduces a small runtime cycle that PR 1.3.5 broke for an earlier set of helpers.

The cycle is safe in practice: both modules finish module-init before any stage function is invoked at runtime, and the imports are only dereferenced inside `runCaptureStage`'s body. The same pattern will appear in PR 1.7 and PR 1.8 as more capture helpers are needed.

After PR 1.10 merges, a follow-up will consolidate the capture helpers (`executeDiskCaptureWithAdaptiveRetry`, `updateJobStatus`, `safeCleanup`, `sampleDirectoryBytes`, `countFrameRanges`, etc.) into `render/shared.ts` (or a sibling `render/orchestratorHelpers.ts`) so the stages import them without reaching back into the orchestrator. Doing it now would balloon this PR's diff with unrelated churn.

Test plan
- `bunx oxlint packages/producer/src/services/render/stages/captureStage.ts packages/producer/src/services/renderOrchestrator.ts`: clean.
- `bunx oxfmt --check`: clean.
- `bun run --filter @hyperframes/producer typecheck`: clean.
- `bun run --filter @hyperframes/producer build`: clean.
- `bun test packages/producer/src/services/`: 176 pass, same single pre-existing failure unrelated to this PR.
- `docker run hyperframes-producer:test --sequential font-variant-numeric many-cuts variables-prod sub-composition-video gsap-letters-render-compat`: 5/5 PASS with audio correlations 1.000 / 0.994 / 0.975 / 0.947 / 1.000 and zero failed visual checkpoints across all 100-point sweeps. `gsap-letters-render-compat` and `sub-composition-video` between them exercise the parallel-capture path with video extraction; the rest cover sequential capture.
- `regression` workflow.

🤖 Generated with Claude Code