Skip to content

refactor(producer): extract captureStage (SDR disk path)#730

Open
jrusso1020 wants to merge 1 commit into
05-12-refactor_producer_extract_audiostage_from_executerenderjobfrom
05-12-refactor_producer_extract_capturestage_sdr_disk_path_
Open

refactor(producer): extract captureStage (SDR disk path)#730
jrusso1020 wants to merge 1 commit into
05-12-refactor_producer_extract_audiostage_from_executerenderjobfrom
05-12-refactor_producer_extract_capturestage_sdr_disk_path_

Conversation

@jrusso1020
Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 commented May 12, 2026

Stacked on #725#726 → this. Rebase to `main` once both ancestors merge.

What

Phase 1 PR 1.6 of the distributed-render refactor. Moves the SDR / DOM-only-HDR disk-capture body of executeRenderJob into packages/producer/src/services/render/stages/captureStage.ts. The sequencer now calls runCaptureStage at the same code point with identical inputs and outputs.

Covers both branches of the disk-capture path:

  • workerCount > 1: parallel capture via executeDiskCaptureWithAdaptiveRetry, with adaptive retry that can shrink the worker count.
  • workerCount === 1: sequential capture in the orchestrator process, reusing probeSession when non-null.

The HDR layered branch (useLayeredComposite === true) and the streaming encode fusion path (useStreamingEncode === true with a successful encoder spawn) stay inline in the sequencer for now — they will be extracted by PR 1.7 and PR 1.8 in the same stack.

Why

Continues the Phase 1 mechanical extraction. The capture stage is the largest single piece of the refactor and the one most likely to drift on a sloppy extraction, so this PR carries the most verification artifacts (5 fixtures in the Docker smoke run, spanning calibration / parallel capture / sub-composition video / GSAP / static text).

How

  • New file captureStage.ts exports runCaptureStage(input) → CaptureStageResult. The function body is the existing disk-capture code lifted verbatim with identical control flow, identical updateJobStatus payload shapes, and the same closure references (buildCaptureOptions, createRenderVideoFrameInjector) passed in by the sequencer.
  • Sequencer replaces the inline } else { disk-capture branch with a call to runCaptureStage(...). Re-assigns workerCount, probeSession, and lastBrowserConsole from the returned result.
  • Two small new orchestrator exports needed by the stage:
    • executeDiskCaptureWithAdaptiveRetry (was private)
    • updateJobStatus (was private)
  • Removes the now-orphaned captureFrame import from the orchestrator (oxlint flagged).

Preserved invariants

  • captureAttempts mutated in place — the parallel path still appends each retry attempt onto the array the sequencer owns.
  • workerCount is reduced by adaptive retry; the final value returns to the sequencer and is used for the streaming-encode gate logging in subsequent code.
  • probeSession is closed at the same code points (parallel: after retries; sequential: in the session's finally).
  • lastBrowserConsole is set to the buffer of whichever session was active last.
  • job.framesRendered is updated at the same per-frame and per-progress points.
  • updateJobStatus(..., "rendering", "Capturing frame N/M [(K workers)]", ...) fires at the same 30-frame and completion checkpoints (parallel) or every frame (sequential), with the same percentage math 25 + frameProgress * 45.
  • perfStages.captureMs is still computed by the sequencer from the outer stage4Start, so the timing window covers both the in-sequencer setup (file server init, calibration, worker resolution, preset selection) AND the capture call.

Known follow-up: small runtime import cycle

The stage imports executeDiskCaptureWithAdaptiveRetry and updateJobStatus (runtime) plus CaptureAttemptSummary, ProgressCallback, RenderJob (types) from renderOrchestrator.ts. The orchestrator imports runCaptureStage from the stage. This re-introduces a small runtime cycle that PR 1.3.5 broke for an earlier set of helpers.

The cycle is safe in practice: both modules finish module-init before any stage function is invoked at runtime, and the imports are only dereferenced inside runCaptureStage's body. The same pattern will appear in PR 1.7 and PR 1.8 as more capture helpers are needed.

After PR 1.10 merges, a follow-up will consolidate the capture helpers (executeDiskCaptureWithAdaptiveRetry, updateJobStatus, safeCleanup, sampleDirectoryBytes, countFrameRanges, etc.) into render/shared.ts (or a sibling render/orchestratorHelpers.ts) so the stages import them without reaching back into the orchestrator. Doing it now would balloon this PR's diff with unrelated churn.

Test plan

  • bunx oxlint packages/producer/src/services/render/stages/captureStage.ts packages/producer/src/services/renderOrchestrator.ts — clean.
  • bunx oxfmt --check — clean.
  • bun run --filter @hyperframes/producer typecheck — clean.
  • bun run --filter @hyperframes/producer build — clean.
  • bun test packages/producer/src/services/ — 176 pass, same single pre-existing failure unrelated to this PR.
  • docker run hyperframes-producer:test --sequential font-variant-numeric many-cuts variables-prod sub-composition-video gsap-letters-render-compat5/5 PASS with audio correlations 1.000 / 0.994 / 0.975 / 0.947 / 1.000 and zero failed visual checkpoints across all 100-point sweeps. gsap-letters-render-compat and sub-composition-video between them exercise the parallel-capture path with video extraction; the rest cover sequential capture.
  • Full regression matrix on CI via the regression workflow.

🤖 Generated with Claude Code

Move the SDR / DOM-only-HDR disk-capture body out of `executeRenderJob`
into `services/render/stages/captureStage.ts`. Covers both branches of
the disk path: parallel capture via `executeDiskCaptureWithAdaptiveRetry`
(`workerCount > 1`) and sequential per-process capture (`workerCount === 1`,
reusing `probeSession` when available).

The HDR layered branch (`useLayeredComposite === true`) and the streaming
encode fusion path (`useStreamingEncode === true` with successful encoder
spawn) stay inline in the sequencer — they will be extracted by the next
two PRs in the stack.

Hard constraints preserved verbatim:
- `probeSession` is closed (and the sequencer's `let probeSession`
  nulled via the returned result) at the same points.
- `captureAttempts` is mutated in place — the parallel retry loop still
  pushes each attempt onto the array the sequencer owns.
- `workerCount` reassignment from adaptive retry survives via the
  returned result.
- `lastBrowserConsole` is set to the buffer of whichever session was
  active last (probe close path or sequential capture finally).
- `job.framesRendered` is updated at the same per-frame / per-progress
  points; `Capturing frame N/M [(K workers)]` `updateJobStatus` payloads
  fire at the same 30-frame and completion checkpoints.
- `perfStages.captureMs` is still computed by the sequencer from the
  outer `stage4Start` so its window covers both the in-sequencer setup
  (fileServer init, calibration, worker resolution, preset selection)
  AND the capture call.

Two small new exports on `renderOrchestrator.ts`:
- `executeDiskCaptureWithAdaptiveRetry` — was a private helper; the
  stage calls it directly.
- `updateJobStatus` — was a private helper; the stage uses it for the
  per-frame progress callbacks so the `completedAt` branch matches.

These re-introduce a small runtime cycle between the stage and the
orchestrator (orchestrator imports `runCaptureStage`; stage imports
helpers back). The cycle is safe (both modules finish loading before
any stage function is invoked at runtime) and will be flattened in a
follow-up PR that consolidates capture helpers into a shared module.
Removes the now-orphaned `captureFrame` import from the orchestrator.

Verified inside `Dockerfile.test`:
- `font-variant-numeric`: audio correlation 1.000
- `many-cuts`: 0 failed frames, audio correlation 0.994
- `variables-prod`: PSNR ~69 dB, audio correlation 0.975
- `sub-composition-video`: PSNR ~43-52 dB, audio correlation 0.947
  (exercises video extraction + capture end-to-end)
- `gsap-letters-render-compat`: PSNR ~53-55 dB, audio correlation 1.000
  (exercises the parallel capture path; 5/5 PASS overall)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verdict: approve.

Extraction of the SDR / DOM-only-HDR disk-capture path into captureStage. Both branches (workerCount > 1 parallel adaptive-retry, workerCount === 1 sequential) lifted byte-clean.

Verified against the pre-#730 orchestrator:

  • captureAttempts mutation preserved — the stage still appends to the caller-owned array via captureAttempts.push(...attempts). The JSDoc on CaptureStageInput.captureAttempts calls this out explicitly.
  • workerCount adaptive-shrink preserved — the last attempt's workers count flows back to the sequencer through the result.
  • probeSession close-and-null contract preserved — closed in the parallel branch when present, reused in the sequential branch via prepareCaptureSessionForReuse. Either way the result returns probeSession: null and the sequencer's let probeSession is updated.
  • lastBrowserConsole set to the correct session's buffer in both branches (parallel: probe session's buffer if it existed; sequential: the working session's buffer in finally). This matches the audit-all-sites rule — every place that previously assigned lastBrowserConsole still does, with the same source.
  • Progress callback semantics identical: 30-frame checkpoint + completion (parallel) or per-frame (sequential), same Capturing frame N/M strings, same 25 + frameProgress * 45 progress math.

The totalFrames narrowing (input declares number, sequencer narrows from job.totalFrames: number | undefined via probeStage) is the cleanest typing improvement in the stack so far. Pre-#730 the orchestrator just read job.totalFrames directly inside the loop, which TypeScript was tolerating via the early throw in probeStage.

Important (non-blocking, cross-stack):

  • captureStage.ts:55-64 — imports executeDiskCaptureWithAdaptiveRetry from renderOrchestrator.ts, which now imports the stage back. The header comment acknowledges this as a runtime cycle and points to a future consolidation. #737 only moves updateJobStatus; executeDiskCaptureWithAdaptiveRetry (plus countFrameRanges, safeCleanup, sampleDirectoryBytes from #720) stays in the orchestrator. Worth filing as a follow-up before the stack lands so it doesn't get forgotten — the cycle is safe at runtime today, but it's a structural smell that gets harder to undo as more stages depend on the orchestrator's helpers.

Nits:

  • captureStage.ts:128 — the parallel branch's Capturing frame N/M (W workers) payload uses the original workerCount for the (W workers) suffix, not the adaptive-reduced one. This is faithful to the pre-extraction code (it had the same behavior) but is arguably a latent bug — after the first retry, the message reports the wrong worker count. Out of scope for this PR; flagged for future cleanup.

Praise: the inline comments inside the stage are denser than the equivalent inline comments inside executeRenderJob were — moving the code into a module justified the extra documentation, and the result is more readable than the original.

— Vai

Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean mechanical extraction — no behavior changes, no introduced bugs. Verified imports, error handling, and cleanup invariants are preserved. LGTM. — Magi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants