
fix(library-search): replace alias LATERAL JOIN with UNION ALL (#1318)#1403

Merged
jakebromberg merged 3 commits into
mainfrom
fix/1318-alt1-union-all
Jun 14, 2026
@jakebromberg

Closes #1318

Summary

  • Replace the alias-aware LEFT JOIN LATERAL (correlated on library.artist_id) with a WITH alias_hits AS (...) CTE plus a UNION ALL split. The CTE runs the trigram bitmap scan over artist_search_alias exactly once via the GIN artist_search_alias_variant_trgm_idx; the LATERAL form had picked the PK btree and filtered variant % q row-by-row, with prod EXPLAIN ANALYZE showing loops=38627 to return a 24-row page.
  • Branch (a) of the UNION ALL is byte-identical to the alias-OFF path so LIMIT pushdown and the per-column ILIKE / GIN trigram plan are preserved. Branch (b) inner-joins alias_hits on artist_id and dedupes against (a) via NOT (a's WHERE), built from the same buildWhereClause(conditions, false) fragment so the two branches can't drift on what counts as a match.
  • All three call sites that consumed buildAliasLateralFragments are updated in one PR, since they share the helper and need a coordinated edit:
    • apps/backend/services/library-search.service.ts searchLibrary (catalog /library/query, with offset pagination + COUNT(*) wrapper)
    • apps/backend/services/library.service.ts searchLibraryByTrigramBoth (Both-mode trigram tier)
    • apps/backend/services/library.service.ts searchByArtist (request-line single-column trigram)
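
The query shape described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `buildAliasUnionSql` function, the `library_artist_view lav` source, and the column lists are hypothetical, and only `alias_max_sim` is projected here for brevity. The CTE name `alias_hits`, the `artist_search_alias` table, the `variant % q` predicate, and the NOT-of-branch-(a) dedupe come from the description above.

```typescript
// Sketch only: the view name, its "lav" alias, and the column lists are
// assumptions for illustration, not the project's actual query.
interface AliasUnionInput {
  selectList: string; // columns projected by both branches
  fromClause: string; // e.g. "library_artist_view lav" (hypothetical)
  whereA: string; // branch (a) predicate, built once and reused
  queryParam: string; // bind placeholder for the search term, e.g. "$1"
}

function buildAliasUnionSql(input: AliasUnionInput): string {
  const { selectList, fromClause, whereA, queryParam } = input;

  // The CTE scans the alias table once and keeps one row per artist.
  const cte = [
    "WITH alias_hits AS (",
    `  SELECT artist_id, MAX(similarity(variant, ${queryParam})) AS alias_max_sim`,
    "  FROM artist_search_alias",
    `  WHERE variant % ${queryParam}`,
    "  GROUP BY artist_id",
    ")",
  ].join("\n");

  // Branch (a): the plain search, with a typed NULL so the UNION columns line up.
  const branchA = `SELECT ${selectList}, NULL::real AS alias_max_sim FROM ${fromClause} WHERE ${whereA}`;

  // Branch (b): alias matches only, excluding rows branch (a) already returned.
  // The "lav" alias in the join condition is assumed to match fromClause.
  const branchB = [
    `SELECT ${selectList}, ah.alias_max_sim FROM ${fromClause}`,
    "JOIN alias_hits ah ON ah.artist_id = lav.artist_id",
    `WHERE NOT (${whereA})`,
  ].join(" ");

  return `${cte}\n(${branchA})\nUNION ALL\n(${branchB})`;
}
```

Because both branches interpolate the same `whereA` string, the positive and negated predicates cannot diverge, which is the property the summary relies on for deduplication.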

Trade-offs (per issue body)

  • The trigram OR-predicate is evaluated twice (once positively in branch (a), once negated in branch (b)). Each predicate is bitmap-indexable on its own, so the total cost is dominated by the substrate scan + LIMIT pushdown on the outer LAV, not the correlated per-row scan that the LATERAL paid.
  • The catalog /library/query COUNT(*) query now wraps the UNION ALL in a subquery (SELECT COUNT(*) FROM ((branchA) UNION ALL (branchB)) alias_search). API contract is unchanged; the consumer-side reshape lives entirely inside the service.
  • Output row shape is unchanged. toAlbumSearchResultRow / attachAliasHint still emit matched_via_alias from the same nullable (alias_max_sim, alias_matched_variant, alias_matched_source) triple — the alias branch projects them, the non-alias branch emits NULL::real, NULL::text, NULL::text placeholders so the UNION shape lines up.
  • Production flag remains OFF: CATALOG_SEARCH_ALIAS_ENABLED defaults to false in config/catalogSearchAlias.ts and this PR does not change that. The env flip in #1274 (PR 6, the deploy-only change that sets CATALOG_SEARCH_ALIAS_ENABLED=true on prod) can go ahead once this lands.
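
As a rough illustration of the COUNT(*) wrapper and the NULL placeholders mentioned in the trade-offs, the sketch below assembles both as strings. The names `aliasNullColumns` and `buildCountSql` are hypothetical; the wrapper shape, the `alias_search` subquery alias, and the three typed NULLs are taken from the text above.

```typescript
// Hypothetical names; only the SQL shapes come from the PR description.

// Typed NULL placeholders for the non-alias branch, so both UNION ALL
// branches expose the same three alias columns with matching types.
const aliasNullColumns =
  "NULL::real AS alias_max_sim, NULL::text AS alias_matched_variant, NULL::text AS alias_matched_source";

// Wrap the two branches in a subquery so COUNT(*) counts the combined rows.
// The CTE text (if any) is placed in front of the outer SELECT.
function buildCountSql(cte: string, branchA: string, branchB: string): string {
  return `${cte}\nSELECT COUNT(*) FROM ((${branchA}) UNION ALL (${branchB})) alias_search`;
}
```

The count query and the page query would be built from the same two branch strings, so the total and the paged rows describe the same result set.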

Test plan

  • npm run typecheck — all workspaces green
  • npm run lint — 0 errors (525 pre-existing warnings, untouched)
  • npm run format:check — all files Prettier-clean
  • npm run build — all workspaces build
  • npm run test:unit — 3151 tests / 228 suites all pass
  • New regression-pin tests in tests/unit/services/library-search-alias.test.ts assert the new SQL shape (CTE keyword alias_hits, UNION ALL, no LEFT JOIN LATERAL / alias_hit ON true) across all three call sites
  • Prod EXPLAIN ANALYZE confirmation that Bitmap Index Scan on artist_search_alias_variant_trgm_idx appears in the alias-on plan and the four-query latency mix (Loren Connors / oh sees / Bonnie / music) lands within 1.1x of the alias-off baseline — to be measured against the deployed branch as part of pre-flag-flip validation for BS#1274.
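
A regression pin of the kind described in the test plan could look something like the helper below. This is a sketch under assumptions: `checkAliasSqlShape` is a hypothetical name, and the real tests in tests/unit/services/library-search-alias.test.ts may assert the shape differently. It only checks the four markers named above.

```typescript
// Hypothetical helper: returns a list of shape violations for a generated
// SQL string, so a test can assert that the list is empty.
function checkAliasSqlShape(sql: string): string[] {
  const problems: string[] = [];
  if (!/\bWITH\s+alias_hits\s+AS\b/i.test(sql)) {
    problems.push("missing alias_hits CTE");
  }
  if (!/\bUNION\s+ALL\b/i.test(sql)) {
    problems.push("missing UNION ALL");
  }
  if (/\bLEFT\s+JOIN\s+LATERAL\b/i.test(sql)) {
    problems.push("LEFT JOIN LATERAL still present");
  }
  // Matches the old "alias_hit ON true" join, but not the new "alias_hits" CTE name.
  if (/\balias_hit\s+ON\s+true\b/i.test(sql)) {
    problems.push("alias_hit ON true still present");
  }
  return problems;
}
```

Running the same check against the SQL produced by each of the three call sites keeps them pinned to one shape.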

The alias-aware LATERAL JOIN steered the planner onto the artist_search_alias PK btree and applied `variant % q` as a post-filter, never touching the GIN trigram index. Prod EXPLAIN ANALYZE showed the LATERAL being invoked once per candidate library row (`loops=38627` to return a 24-row page), with the alias-on path running 3-6.5x slower than the alias-off path on selective queries.

Replace the LATERAL with a CTE `alias_hits` that runs the trigram bitmap scan once over the substrate and groups by artist_id, then split the outer query into a UNION ALL: branch (a) is byte-identical to the alias-OFF path (so LIMIT pushdown stays intact and the per-column GIN trigram / ILIKE plan is unchanged); branch (b) inner-joins alias_hits on artist_id and dedupes against (a) via NOT-of-(a)'s predicate. The branch-A WHERE is built once via `buildWhereClause(conditions, false)` and the NOT is computed from the same SQL fragment, so the two branches can't drift on what counts as a match.

All three call sites that used buildAliasLateralFragments are updated in one PR (the helper is replaced by buildAliasHitsCte + ALIAS_HITS_PROJECTION / _NULLS):

- library-search.service.ts searchLibrary (catalog /library/query endpoint, with offset pagination + COUNT(*) wrapper)
- library.service.ts searchLibraryByTrigramBoth (Both-mode trigram tier)
- library.service.ts searchByArtist (request-line single-column trigram)

The flag remains OFF in production (per BS#1274). Output row shape is unchanged — toAlbumSearchResultRow / attachAliasHint still emit matched_via_alias from the same nullable column triple.

Unit tests pin the new SQL shape (alias_hits CTE + UNION ALL keyword, no LEFT JOIN LATERAL / `alias_hit ON true`) and keep the existing row-shape assertions intact.

CI integration tests surfaced `PostgresError: invalid UNION/INTERSECT/EXCEPT ORDER BY clause` against the live database. Postgres does not allow an expression in an ORDER BY applied directly to a UNION result; only output column names (or positions) from the union's SELECT list are allowed. Both `searchLibraryByTrigramBoth` and `searchByArtist` order by `GREATEST(similarity(artist_name, $q), ..., COALESCE(alias_max_sim, 0))`, which is an expression, not a bare column reference.

Wrap each UNION ALL in a `SELECT * FROM (...) alias_search` subquery so the outer ORDER BY operates on the subquery's column projection — at that scope `similarity(artist_name, $q)` is a legal ORDER BY term.

Site 1 (`library-search.service.ts`) is unaffected: that path's ORDER BY uses only bare column names (`add_date`, `artist_name`, `id`, etc.) which are legal directly on a union.
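
The subquery wrap can be sketched as a small string helper. `wrapUnionForOrderBy` is a hypothetical name, and the ORDER BY expression in the usage note is a simplified stand-in for the real `GREATEST(...)` term, whose middle arguments are elided in the message above.

```typescript
// Hypothetical helper: wraps a UNION ALL in a derived table so that the
// outer ORDER BY may contain arbitrary expressions over its columns.
function wrapUnionForOrderBy(unionSql: string, orderByExpr: string): string {
  return [
    "SELECT * FROM (",
    unionSql,
    ") alias_search",
    `ORDER BY ${orderByExpr} DESC`,
  ].join("\n");
}
```

For example, calling it with `"(SELECT ...) UNION ALL (SELECT ...)"` and `"GREATEST(similarity(artist_name, $1), COALESCE(alias_max_sim, 0))"` yields a query whose ORDER BY is evaluated against the `alias_search` columns rather than directly against the union.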
…ATERAL docstrings

Cleanup pass after /code-review max on BS#1318. Three coordinated changes:

(1) Drop the aliasActive parameter chain through buildWhereClause -> buildConditionFragment -> buildAllFieldMatch. Both call sites passed false, and the aliasOr branch inside buildAllFieldMatch referenced alias_hit.max_sim, an alias the UNION ALL form never creates. The branch was dead code but a latent trap: had a future caller ever passed true, Postgres would have rejected the query because nothing named alias_hit exists in the FROM clause.

(2) Update stale doc comments in library.service.ts (attachAliasHint, TaggedLibraryViewEntry, AliasHitFields, LIBRARY_VIEW_JOINS_RAW) and library-search.service.ts (the hasAllFieldCondition gate rationale) that still narrate the removed LATERAL JOIN. The post-#1318 reality is the alias_hits CTE + UNION ALL; the historical narration in buildAliasHitsCte and searchByArtist intentionally keeps the LATERAL contrast for design-intent context.

(3) Update the test name and comment in library-search-alias.test.ts to match (the assertion logic was unaffected).

No behavior change. Touches 4 service-layer functions plus 4 comment blocks.
@jakebromberg jakebromberg merged commit c22a9f6 into main Jun 14, 2026
5 checks passed

Development

Successfully merging this pull request may close these issues.

alias-aware LATERAL JOIN: GIN trigram index unused; 3-6.5x p95 regression on selective queries
