Skip to content

Legacy mirror writes real names to tubafrenzy DJ_NAME — defensive contract for write-only PII #1400

@jakebromberg

Description

@jakebromberg

Context

BS#1393 closed the PII leak that flowed FLOWSHEET_RADIO_SHOW_PROD.DJ_NAME (full real name) into shows.legacy_dj_name and onto the public v2 marker dj_name wire. The leak is sealed at the read side: the flowsheet ETL now reads DJ_HANDLE (on-air alias) in all writers, and the historical-data remediation re-scrubbed every polluted legacy_dj_name row.

But the upstream write is still active. apps/backend/middleware/legacy/http.mirror.ts:364 (mapShowToTubafrenzy) sends:

return {
  djName: dj.realName || dj.name,    // ← real name, lands in tubafrenzy's DJ_NAME
  djHandle: ...,                      // ← on-air handle, lands in DJ_HANDLE
  ...
};

On every startShow call, BS forwards the DJ's real name into tubafrenzy's DJ_NAME column. The tubafrenzy admin UI uses that column to render show identifiers for moderators (paging through historical shows, reconciliation views, etc.), so removing it would degrade an internal-only surface. The column is currently write-only PII from BS's perspective: BS writes it, BS never reads it, and BS#1393 made sure no BS code path reaches it.

That's fine while the contract holds. The risk is a future engineer reaching for DJ_NAME as a convenient upstream identifier (e.g., a new alumni reporting feature, a "shows by DJ name" surface, anything that wants a human-readable name) and silently re-opening the leak from the opposite direction.

Suggested defensive measures (any/all)

  1. Doc-only: A code comment on mapShowToTubafrenzy:364 explicitly calling out the write-only-PII contract, with a DO NOT READ DJ_NAME warning. Cheapest, weakest enforcement.

  2. ETL-side guard: In any future ETL or query that reads from tubafrenzy, add a test or lint rule that bans references to DJ_NAME. There's prior art in this repo for SQL-grep tests (tests/unit/jobs/flowsheet-etl/job.djName.test.ts already grep-checks tuple positions).

  3. Stop writing it: BS sends djName: null (or just djHandle), and tubafrenzy's admin UI falls back to DJ_HANDLE for moderator display. Trade-off: degrades the admin UI's name-disambiguation for shows where multiple DJs share a handle. Probably not worth the disruption per the existing memory ("tubafrenzy on life support since 2026-05-10").

  4. Schema rename downstream: Eventually rename shows.legacy_dj_name to shows.legacy_dj_handle so the column's name encodes its semantic. Doc-only PR is the precursor (filed alongside this issue).

Recommendation

(1) + (4) for now. Skip (3) until tubafrenzy is fully sunset. (2) is overkill for the current attack surface.

This is a defensive-followup ticket — no active leak. Not blocking any other work.

Metadata

Metadata

Assignees

No one assigned

    Labels

    tubafrenzyTouches WXYC/tubafrenzy (webhook, legacy mirror, flowsheet ETL, reconcile)

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions