Context
Discovered while triaging #1354 (closed as not-a-bug). The cleanDiscogsBio helper at shared/metadata/src/helpers/clean-discogs-bio.ts strips Discogs markup tokens ([a=Name], [l=Name], [r=…], [m=…], [url=…][/url]) from artist_bio before persisting. It is called from every artist_bio writer:
- apps/backend/services/metadata/metadata.service.ts:142
- apps/enrichment-worker/enrich.ts:156 and :191
- jobs/flowsheet-metadata-backfill/enrich.ts:182
- jobs/flowsheet-artwork-repair/repair.ts:153
- jobs/album-level-backfill/job.ts:254
- shared/metadata/src/normalize-lookup.ts:97
The premise of the helper is "consumers render raw text, so strip the markup or it shows as literal tokens." That premise doesn't hold for the two known consumers.
Consumer evidence
iOS (wxyc-ios-64, Shared/Metadata/Sources/Metadata/DiscogsMarkupParser.swift):
- Recognizes all four numeric-id prefixes ([a12345], [l…], [r…], [m…]), the named-equals form ([a=Name], [l=Name]), [url=…][/url], and [b]/[i]/[u].
- ArtistBioSection.swift:51-60 calls the async resolver path: [a12345] resolves to a tappable Discogs artist link via DiscogsAPIEntityResolver.shared.
- Sync fallback drops unresolved IDs at DiscogsMarkupParser.swift:354 (return nil // Skip unresolved). The literal [a12345] is never rendered, in either path.
- When the backend serves bioTokens (V1 proxy at apps/backend/controllers/proxy.controller.ts:221,504), iOS renders pre-parsed tokens directly with no parsing needed.
dj-site (src/components/experiences/modern/Rightbar/panels/album/AlbumCard.tsx:143):
- Prefers <DiscogsMarkup tokens={bioTokens} /> when available; falls back to the raw artistBio string. The fallback is the only consumer that benefits from the strip, and only as a cosmetic cleanup for a rare path.
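The dj-site preference order described above can be sketched roughly as follows. This is an illustration only: the token shape, the type names, and the function pickBioSourceSketch are hypothetical and not taken from AlbumCard.tsx, and treating an empty token array as "not available" is an assumption.

```typescript
// Hypothetical token shape; the real bioTokens schema is not shown in this issue.
type BioTokenSketch = { type: string; text?: string };

type BioSourceSketch =
  | { kind: "tokens"; tokens: BioTokenSketch[] }
  | { kind: "raw"; text: string }
  | { kind: "none" };

// Prefer pre-parsed tokens; fall back to the raw artistBio string.
// Assumption: an empty token array counts as "no tokens available".
function pickBioSourceSketch(
  bioTokens?: BioTokenSketch[] | null,
  artistBio?: string | null,
): BioSourceSketch {
  if (bioTokens && bioTokens.length > 0) {
    return { kind: "tokens", tokens: bioTokens };
  }
  if (artistBio) {
    return { kind: "raw", text: artistBio };
  }
  return { kind: "none" };
}
```

Only the "raw" branch ever sees the stripped text, which is why the strip is a cosmetic cleanup for that path alone.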
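Cost of the current strip
- [a=Stereolab] → Stereolab (plain text): iOS could have made it a tappable artist link via the async resolver.
- [a8390436] → either the literal (pre-#1358) or eliminated (had #1358, "fix(metadata): strip Discogs numeric-id entity tokens in cleanDiscogsBio", merged): iOS could have resolved it to a real artist name + tappable link.
- [url=https://…]text[/url] → text: the URL is dropped on the floor.
So on the iOS V2 flowsheet path (no bioTokens served, raw artist_bio only), every strip is destroying link information that the iOS parser is fully equipped to consume.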
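To make the loss concrete, here is a minimal sketch of a strip pass. It is not the actual cleanDiscogsBio implementation: the function name and both regexes are assumptions for illustration, and the handling of [r=…] and [m=…] (keeping the inner text) is a guess.

```typescript
// Hypothetical stand-in for cleanDiscogsBio; the real helper's rules may differ.
function stripDiscogsMarkupSketch(bio: string): string {
  return bio
    // [url=https://...]text[/url] -> text (the URL itself is discarded)
    .replace(/\[url=[^\]]*\]([\s\S]*?)\[\/url\]/g, "$1")
    // [a=Name], [l=Name], [r=...], [m=...] -> inner text (the entity type is discarded)
    .replace(/\[[almr]=([^\]]+)\]/g, "$1");
}

const stored = stripDiscogsMarkupSketch(
  "[a=Stereolab] on [l=Duophonic]. See [url=https://example.com]the site[/url].",
);
// stored === "Stereolab on Duophonic. See the site."
```

Once the stored string looks like this, a parser-equipped client has nothing left to resolve: the artist and label markers and the URL are gone before the value is persisted.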
V2 flowsheet doesn't serve bioTokens yet
apps/backend/controllers/proxy.controller.ts:221,504 populates bioTokens from artwork.profile_tokens on the iOS V1 proxy path. V2 flowsheet (/v2/flowsheet) serves artist_bio only — no bioTokens. V2 clients therefore pay the strip's cost with no offsetting benefit.
Remediation options
- Serve bioTokens everywhere artist_bio is served. Extend the V2 flowsheet DTO with the pre-parsed token array LML already produces (profile_tokens). Stop calling cleanDiscogsBio at write time; let clients render tokens directly. Needs a @wxyc/shared DTO change and a coordinated iOS + dj-site consumer migration. Long-term right answer.
- Stop stripping, accept text-fallback loss. Parser-equipped clients get full fidelity; raw-text fallback paths (dj-site AlbumCard else branch, any internal admin view) show literal markup. Minor cosmetic regression for fallbacks in exchange for restored link info on primary paths.
- Keep the strip, document the trade-off. Status quo. Add a doc-comment to cleanDiscogsBio noting it's a fallback-friendly text representation and that primary clients should consume bioTokens instead.
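As a rough illustration of the first option, the V2 read path could carry the tokens alongside the raw bio. Everything named here is hypothetical: the token shape, the row and DTO interfaces, and toV2EntrySketch are not the actual @wxyc/shared types, and making bioTokens optional is an assumption chosen so existing clients keep working.

```typescript
// Hypothetical token shape; the real profile_tokens schema is not shown in this issue.
type ProfileTokenSketch = { type: string; text?: string };

// Hypothetical persisted row: raw bio plus the pre-parsed tokens LML produces.
interface FlowsheetRowSketch {
  artist_bio: string | null;
  profile_tokens: ProfileTokenSketch[] | null;
}

// Hypothetical V2 DTO: bioTokens is optional so older clients can ignore it.
interface V2FlowsheetEntrySketch {
  artist_bio: string | null;
  bioTokens?: ProfileTokenSketch[];
}

// Pass the bio through untouched and attach tokens only when they exist.
function toV2EntrySketch(row: FlowsheetRowSketch): V2FlowsheetEntrySketch {
  const entry: V2FlowsheetEntrySketch = { artist_bio: row.artist_bio };
  if (row.profile_tokens && row.profile_tokens.length > 0) {
    entry.bioTokens = row.profile_tokens;
  }
  return entry;
}
```

Keeping artist_bio in the DTO leaves the raw-text fallback available while clients migrate to tokens.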
Scope of audit
- artist_bio: AlbumCard (bioTokens preferred, artistBio fallback)
- artistBio
- bioTokens.
- @wxyc/shared DTO with bioTokens on the V2 flowsheet shape; propagate from flowsheet.profile_tokens (or wherever LML's tokens are persisted) through to the read path.
Out of scope
- #1354 ([a8390436]): iOS parses the tokens; that issue isn't a bug.