
fix(legacy-dj-name-remediation): truncate DJ_HANDLE to 128 chars (BS#1393 hot-fix) #1399

Merged
jakebromberg merged 1 commit into main from bugfix/legacy-dj-name-handle-truncate
Jun 13, 2026


@jakebromberg


Context

First prod run of the legacy-dj-name-remediation job (PR #1394) aborted at batch 5/15:

```
[remediate] batch 1/15: 4763 shows updated, 0 marker rows reset, 0 marker rows re-resolved.
[remediate] batch 2/15: 4756 shows updated, 0 marker rows reset, 4 marker rows re-resolved.
[remediate] batch 3/15: 4244 shows updated, 0 marker rows reset, 5763 marker rows re-resolved.
[remediate] batch 4/15: 3515 shows updated, 0 marker rows reset, 10026 marker rows re-resolved.
[remediate] Fatal error: DrizzleQueryError
cause: PostgresError: value too long for type character varying(128)
```
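As a sanity check on how much work committed before the abort, the per-batch counts in the log above can be summed. This is a minimal sketch: the regular expression and the `sumBatches` helper are hypothetical and are not part of the remediation job.

```typescript
// Log lines copied from the aborted run quoted above.
const logLines: string[] = [
  "[remediate] batch 1/15: 4763 shows updated, 0 marker rows reset, 0 marker rows re-resolved.",
  "[remediate] batch 2/15: 4756 shows updated, 0 marker rows reset, 4 marker rows re-resolved.",
  "[remediate] batch 3/15: 4244 shows updated, 0 marker rows reset, 5763 marker rows re-resolved.",
  "[remediate] batch 4/15: 3515 shows updated, 0 marker rows reset, 10026 marker rows re-resolved.",
  "[remediate] Fatal error: DrizzleQueryError",
];

interface BatchTotals {
  showsUpdated: number;
  markersReset: number;
  markersReResolved: number;
}

// Hypothetical pattern for the per-batch summary line; non-matching lines are skipped.
const BATCH_LINE =
  /batch \d+\/\d+: (\d+) shows updated, (\d+) marker rows reset, (\d+) marker rows re-resolved/;

function sumBatches(lines: string[]): BatchTotals {
  const totals: BatchTotals = { showsUpdated: 0, markersReset: 0, markersReResolved: 0 };
  for (const line of lines) {
    const match = BATCH_LINE.exec(line);
    if (match === null) continue; // e.g. the "Fatal error" line
    totals.showsUpdated += Number(match[1]);
    totals.markersReset += Number(match[2]);
    totals.markersReResolved += Number(match[3]);
  }
  return totals;
}

const totals = sumBatches(logLines);
console.log(totals);
```

The four committed batches add up to 17,278 shows updated and 15,793 marker rows re-resolved.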

Production has DJ_HANDLE values longer than the `shows.legacy_dj_name varchar(128)` ceiling. The 300+ char Cynocephalus handle documented in tubafrenzy#573 is one example.

The flowsheet ETL truncates at the source (`jobs/flowsheet-etl/job.ts:200` for bulk-load and `job.ts:338` for incremental); the remediation script's `fetchHandleMappings` did not, so an oversized handle on a single show aborted the per-batch CTE for the whole batch.

Fix

Truncate to the same 128-char ceiling in `fetchHandleMappings`. The remediation now writes the same value the ETL would write for the same row, so re-run idempotency between the two writers is preserved.
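A minimal sketch of the cap, for illustration only. The PR does not show the body of `fetchHandleMappings` or how the ETL truncates, so the names `LEGACY_DJ_NAME_MAX_LENGTH`, `HandleMapping`, `truncateHandle`, and `capHandleMappings` are hypothetical, and the plain `slice` is an assumption about the truncation method.

```typescript
// Hypothetical constant; mirrors the shows.legacy_dj_name varchar(128) ceiling.
const LEGACY_DJ_NAME_MAX_LENGTH = 128;

// Hypothetical row shape for a show-to-handle mapping.
interface HandleMapping {
  showId: number;
  djHandle: string;
}

// Cap a handle at the column ceiling.
// Assumption: a plain slice by UTF-16 code unit. Whatever method is used must be
// the same one the ETL uses, otherwise the two writers would disagree on the value.
function truncateHandle(handle: string, max: number = LEGACY_DJ_NAME_MAX_LENGTH): string {
  return handle.length > max ? handle.slice(0, max) : handle;
}

// Apply the cap to every mapping before it reaches the per-batch UPDATE.
function capHandleMappings(rows: HandleMapping[]): HandleMapping[] {
  return rows.map((row) => ({ ...row, djHandle: truncateHandle(row.djHandle) }));
}

const capped = capHandleMappings([
  { showId: 1, djHandle: "DJ Short" },
  { showId: 2, djHandle: "x".repeat(300) }, // stand-in for an oversized handle
]);
console.log(capped.map((row) => row.djHandle.length));
```

Capping in the mapping step, rather than in the SQL, keeps a single oversized handle from failing the statement for every other show in its batch.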

Resume plan

Batches 1-4 of the prior run committed durably (17,278 shows updated and 15,793 marker rows re-resolved, per the log above). The CTE's `COALESCE(trim(legacy_dj_name), '') IS DISTINCT FROM` filter skips already-clean rows on re-run, so the resumed pass will effectively start at batch 5 and process the remaining ~55k shows, including those whose handles are now truncated.
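The skip behaviour can be modelled in a few lines. This is a sketch under stated assumptions: the right-hand side of the `IS DISTINCT FROM` comparison is not shown in the PR, so the model assumes it is the incoming handle, and `needsUpdate` is a hypothetical name rather than code from the job.

```typescript
// Models COALESCE(trim(legacy_dj_name), '') IS DISTINCT FROM <incoming>.
// Assumption: <incoming> is the non-null handle the remediation wants to write.
function needsUpdate(current: string | null, incoming: string): boolean {
  // Postgres trim() strips spaces only by default, so avoid String.prototype.trim,
  // which would also strip tabs and newlines.
  const normalized = current === null ? "" : current.replace(/^ +| +$/g, "");
  return normalized !== incoming;
}

const oversized = "x".repeat(300); // stand-in for an oversized DJ_HANDLE
const stored = oversized.slice(0, 128); // what a truncating writer would have stored

// A row that already holds the target value is skipped on re-run.
console.log(needsUpdate("DJ Example", "DJ Example")); // false

// Without truncation the incoming value never equals the stored one,
// so the row is rewritten (and overflows the column) on every run.
console.log(needsUpdate(stored, oversized)); // true

// With the same cap on both writers the row becomes a no-op.
console.log(needsUpdate(stored, oversized.slice(0, 128))); // false
```

This is also why the cap has to match the ETL's: a different cutoff on either side would make the comparison report a difference on every run.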

Test plan

  • `npm run typecheck` — clean
  • `npm run format:check` — clean
  • `npm run test:unit` — 17/17 pass on remediation suite (new test pins the 128-char cap)
  • Post-merge: Manual Build & Deploy `target=legacy-dj-name-remediation`, then re-run without `--dry-run` on EC2; expect resume from batch 5/15.

…1393 hot-fix)

The first prod remediation run aborted at batch 5/15 with PostgresError
`value too long for type character varying(128)` from `shows.legacy_dj_name`.
Production has DJ_HANDLE values longer than that column's ceiling — the
Cynocephalus handle (300+ chars) documented by the tubafrenzy fix in #573
is one example.

The flowsheet ETL truncates at the source (jobs/flowsheet-etl/job.ts:200 +
job.ts:338); the remediation script's `fetchHandleMappings` did not, so an
oversized handle on a single show aborted the per-batch CTE for the whole
batch. Batches 1-4 (the first ~17k shows + ~16k re-resolves) had already
committed durably, so the re-run will skip them via the CTE's
`IS DISTINCT FROM` filter and resume at batch 5.

Truncate to the same 128-char ceiling here so the remediation matches what
the ETL writes for the same row; new test pins the cap.
@jakebromberg jakebromberg merged commit 9378df9 into main Jun 13, 2026
5 checks passed
@jakebromberg jakebromberg deleted the bugfix/legacy-dj-name-handle-truncate branch June 13, 2026 00:40