You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
flowsheet.dj_name on show_start / show_end / dj_join / dj_leave rows is the value transformToV2 surfaces on the public /flowsheet (v2) wire. For DJs whose auth_user.dj_name is null, the marker dj_name resolver's COALESCE(auth_user.dj_name, shows.legacy_dj_name) falls through to legacy_dj_name. That column is populated by the flowsheet ETL from tubafrenzy's FLOWSHEET_RADIO_SHOW_PROD.DJ_NAME — the user's full real name, not the on-air alias — so the public v2 wire surfaces real names for any affected marker entry.
And the BS legacy mirror in apps/backend/middleware/legacy/http.mirror.ts (mapShowToTubafrenzy) forwards djName: dj.realName || dj.name into tubafrenzy's DJ_NAME field on every show start.
Leak path
BS legacy mirror writes djName = realName || name into tubafrenzy's FLOWSHEET_RADIO_SHOW_PROD.DJ_NAME on startShow.
Flowsheet ETL pulls rs.DJ_NAME from tubafrenzy and stores it in shows.legacy_dj_name. Three writers, all wrong:
jobs/flowsheet-etl/fetch-legacy.ts (incremental)
jobs/flowsheet-etl/job.tsimportShows reads tuple[2] (DJ_NAME) instead of tuple[4] (DJ_HANDLE) in the bulk-load path
transformToV2 surfaces entry.dj_name for show_start / show_end / dj_join / dj_leave on the public v2 wire.
auth_user.name-fallback (the leak path #1286 and #1371 explicitly closed elsewhere) is not the source here — the leak is the legacy_dj_name arm of the COALESCE itself, because the ETL pulls from the wrong source column.
Fix
Two pieces, both in this issue's PR:
Source fix: ETL pulls DJ_HANDLE instead of DJ_NAME in all three writers above. Regression tests pin (a) the SELECT query string and (b) the bulk-load tuple position. LegacyShowRow.djName renamed to djHandle for clarity.
Historical data cleanup: new one-shot job jobs/legacy-dj-name-remediation/ walks every show, rewrites shows.legacy_dj_name from tubafrenzy's DJ_HANDLE, and nulls + re-resolves marker-row flowsheet.dj_name whose value matches the old polluted legacy_dj_name. Idempotent, --dry-run aware, ships with its own Dockerfile (Dockerfile.legacy-dj-name-remediation).
After the fix:
Marker dj_name on the v2 wire reflects the on-air handle (from auth_user.dj_name first, DJ_HANDLE second).
DJs with neither auth_user.dj_name nor DJ_HANDLE set surface empty dj_name on the wire (asymmetric-fallback case the live path already tolerates).
#605 is an open issue from 2026-04-28 about shows.legacy_dj_name not always being populated even when tubafrenzy has a value. Its example table cites real names (Umaesh Kumaran, r.j. igou) — those were the DJ_NAME (real name) values that would have been pulled if the ETL had caught them, applied as a manual workaround.
This issue = correctness bug (the rows that were picked up came from the wrong source column).
The remediation in step 2 above will scrub the 30 rows #605's manual workaround populated with DJ_NAME (real names) back to DJ_HANDLE (handles), so they too stop leaking PII. #605's coverage concern remains its own issue.
Test plan
New regression tests in tests/unit/jobs/flowsheet-etl/fetch-legacy.test.ts and tests/unit/jobs/flowsheet-etl/job.djName.test.ts assert the ETL never reverts to DJ_NAME / tuple[2].
Full unit suite (3104 tests) passes.
Build + typecheck + lint + Prettier all clean.
Remediation runbook: deploy the new job target via Manual Build & Deploy, docker run --rm --env-file .env <image> --dry-run first, then without --dry-run. Output reports (showsScanned, showsUpdated, markerRowsReset) and (reresolved).
Summary
flowsheet.dj_nameonshow_start/show_end/dj_join/dj_leaverows is the valuetransformToV2surfaces on the public/flowsheet(v2) wire. For DJs whoseauth_user.dj_nameis null, the marker dj_name resolver'sCOALESCE(auth_user.dj_name, shows.legacy_dj_name)falls through tolegacy_dj_name. That column is populated by the flowsheet ETL from tubafrenzy'sFLOWSHEET_RADIO_SHOW_PROD.DJ_NAME— the user's full real name, not the on-air alias — so the public v2 wire surfaces real names for any affected marker entry.Per tubafrenzy's own schema docs (https://github.com/WXYC/tubafrenzy/blob/main/docs/database-schema.md):
DJ_NAMEDJ_HANDLEAnd the BS legacy mirror in
apps/backend/middleware/legacy/http.mirror.ts(mapShowToTubafrenzy) forwardsdjName: dj.realName || dj.nameinto tubafrenzy'sDJ_NAMEfield on every show start.Leak path
djName = realName || nameinto tubafrenzy'sFLOWSHEET_RADIO_SHOW_PROD.DJ_NAMEonstartShow.rs.DJ_NAMEfrom tubafrenzy and stores it inshows.legacy_dj_name. Three writers, all wrong:jobs/flowsheet-etl/fetch-legacy.ts(incremental)jobs/flowsheet-etl/job.tsimportShowsreadstuple[2](DJ_NAME) instead oftuple[4](DJ_HANDLE) in the bulk-load pathjobs/flowsheet-etl/backfill-legacy-ids.ts(legacy-id backfill)dj_nameresolution on the webhook insert path viaCOALESCE(auth_user.dj_name, shows.legacy_dj_name). Marker rows previously haddj_name = NULLon the wire; now they hold the pollutedlegacy_dj_namefor DJs whoseauth_user.dj_nameis null.transformToV2surfacesentry.dj_nameforshow_start/show_end/dj_join/dj_leaveon the public v2 wire.auth_user.name-fallback (the leak path #1286 and #1371 explicitly closed elsewhere) is not the source here — the leak is thelegacy_dj_namearm of the COALESCE itself, because the ETL pulls from the wrong source column.Fix
Two pieces, both in this issue's PR:
DJ_HANDLEinstead ofDJ_NAMEin all three writers above. Regression tests pin (a) the SELECT query string and (b) the bulk-load tuple position.LegacyShowRow.djNamerenamed todjHandlefor clarity.jobs/legacy-dj-name-remediation/walks every show, rewritesshows.legacy_dj_namefrom tubafrenzy'sDJ_HANDLE, and nulls + re-resolves marker-rowflowsheet.dj_namewhose value matches the old pollutedlegacy_dj_name. Idempotent,--dry-runaware, ships with its own Dockerfile (Dockerfile.legacy-dj-name-remediation).After the fix:
dj_nameon the v2 wire reflects the on-air handle (fromauth_user.dj_namefirst,DJ_HANDLEsecond).auth_user.dj_namenorDJ_HANDLEset surface emptydj_nameon the wire (asymmetric-fallback case the live path already tolerates).Relationship to #605
#605 is an open issue from 2026-04-28 about
shows.legacy_dj_namenot always being populated even when tubafrenzy has a value. Its example table cites real names (Umaesh Kumaran,r.j. igou) — those were theDJ_NAME(real name) values that would have been pulled if the ETL had caught them, applied as a manual workaround.This issue is independent of #605:
The remediation in step 2 above will scrub the 30 rows #605's manual workaround populated with
DJ_NAME(real names) back toDJ_HANDLE(handles), so they too stop leaking PII. #605's coverage concern remains its own issue.Test plan
tests/unit/jobs/flowsheet-etl/fetch-legacy.test.tsandtests/unit/jobs/flowsheet-etl/job.djName.test.tsassert the ETL never reverts toDJ_NAME/tuple[2].docker run --rm --env-file .env <image> --dry-runfirst, then without--dry-run. Output reports(showsScanned, showsUpdated, markerRowsReset)and(reresolved).