Skip to content

feat(db): expose relevance scores from match_agent_memories_hybrid#444

Open
gianpaj wants to merge 1 commit into
mainfrom
claude/memory-hybrid-relevance-scores
Open

feat(db): expose relevance scores from match_agent_memories_hybrid#444
gianpaj wants to merge 1 commit into
mainfrom
claude/memory-hybrid-relevance-scores

Conversation

@gianpaj

@gianpaj gianpaj commented Jun 26, 2026

Copy link
Copy Markdown
Owner

What

Adds a migration that surfaces relevance scores from the existing Postgres RPC public.match_agent_memories_hybrid, plus a pgTAP test. This is additive and backward compatible: the RPC keeps memory_type and content as its first two columns and the same argument signature, ranking behaviour is unchanged (same RRF weights 0.7 vector / 0.3 text, k=60, same ORDER BY f.score DESC / LIMIT), and four trailing columns are added — cosine_distance, rrf_score, vector_rank, text_rank. It only exposes values the function already computed and discarded, so it unblocks retrieval-quality monitoring in sexycall (src/observability.py) without re-ranking anything. Callers that read memory_type / content (e.g. sexycall's format_memories) are unaffected.

Files

  • apps/web/supabase/migrations/20260626130000_match_agent_memories_hybrid_scores.sql
  • apps/web/supabase/tests/match_agent_memories_hybrid_scores.test.sql

Note on DROP vs CREATE OR REPLACE

The task asked for CREATE OR REPLACE, but adding columns to a RETURNS TABLE changes the function's result type, and Postgres rejects that via CREATE OR REPLACE FUNCTION (ERROR: cannot change return type of existing function). The migration therefore DROPs and recreates only the function, keyed on its exact argument signature — the agent_memories table and its rows are never touched, the arguments stay identical, and EXECUTE is re-granted to authenticated, service_role. Security posture is unchanged (language sql, security invoker, set search_path = public, extensions).

Dependency / stacking

⚠️ This is stacked on #437 (claude/supabase-memory), which introduces agent_memories and the original two-column function. The new migration's timestamp (20260626130000) sorts after #437's …110000_create_agent_memories.sql and …120000_prune_agent_memories.sql, so it applies in the correct order once both are merged. It will not apply on main until #437 lands.

Testing

CI (Vitest + Redis) has no Postgres, so the test is a standalone pgTAP script run via supabase test db. Validated locally on Postgres 16 + pgvector + pgTAP by applying #437's migration then this one: the migration applies cleanly and all 11 pgTAP assertions pass. Coverage:

  • (a) same rows in the same RRF order; content stays the second column
  • (b) cosine_distance non-null & ascending for embedded rows, NULL for the null-embedding row
  • (c) rrf_score non-null and monotonically non-increasing across the result set
  • (d) a text-only match has vector_rank IS NULL (with a text_rank) and a vector-only match has text_rank IS NULL (with a vector_rank)

🤖 Generated with Claude Code


Generated by Claude Code

Surface the cosine distance, fused RRF score, and per-branch vector/text
ranks the hybrid retrieval RPC already computes but discarded, so the
voice agent (sexycall observability) can log retrieval quality.

Additive and backward compatible: memory_type/content stay the first two
columns, the argument signature is unchanged, and ranking is identical
(same 0.7/0.3 RRF weights, k=60, same ORDER BY/LIMIT) — only four
trailing score columns are added. Adding columns to a RETURNS TABLE is a
return-type change that CREATE OR REPLACE cannot do, so the migration
drops and recreates only the function (the agent_memories table and its
data are untouched) and re-grants EXECUTE.

Includes a pgTAP test (supabase/tests) asserting order preservation,
cosine_distance null/ascending behaviour, monotonic rrf_score, and
text-only vs vector-only rank nullability.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XcFmiddpzKxYxkPbDnYfYv
@bolt-new-by-stackblitz

Copy link
Copy Markdown

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

@vercel

vercel Bot commented Jun 26, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
sexyvoice Ready Ready Preview, Comment Jun 26, 2026 6:37pm
sexyvoice-docs Ready Ready Preview, Comment Jun 26, 2026 6:37pm

Request Review

@claude

claude Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Claude finished @gianpaj's task in 1m 31s —— View job


Review: expose relevance scores from match_agent_memories_hybrid

  • Read changed files (migration + pgTAP test)
  • Verify backward compatibility
  • Check SQL correctness, security, and performance
  • Review the pgTAP test coverage
  • Post review feedback

Overall this is a clean, well-scoped, genuinely additive change. The migration only exposes values the function already computed, ranking is untouched (same RRF weights, k=60, same ORDER BY f.score DESC / LIMIT), and the first two columns keep their position. The DROP + recreate approach is the correct workaround for "cannot change return type of existing function", the re-GRANT is in place, and the pgTAP coverage is thoughtful. No blocking issues. A few notes below — none are blockers.

Observations

1. cosine_distance and vector_rank can disagree (worth documenting for the observability consumer). cosine_distance is recomputed in the final SELECT as m.embedding <=> query_embedding, which is non-NULL for any row with an embedding — independent of whether that row ranked in the vector branch. So a row matched only via the text branch but that has an embedding will report a non-NULL cosine_distance while vector_rank IS NULL. The implication vector_rank IS NULL ⇒ cosine_distance IS NULL does not hold (it only holds when the embedding itself is NULL, as with fav_color). This is fine — the distance is genuinely meaningful — but src/observability.py should not treat the two as equivalent signals when interpreting retrieval quality. The test happens not to exercise this case (its text-only row, fav_color, has a NULL embedding). A column comment noting "non-null whenever the row has an embedding, regardless of vector_rank" would prevent a misread downstream.

2. search_path = public, extensions vs the repo convention. CLAUDE.md / .cursor/rules call for set search_path = '' with fully-qualified names. This function uses public, extensions and leaves several calls unqualified (websearch_to_tsquery, ts_rank_cd, row_number, least, greatest, coalesce, and the <=> operator / vector type resolving via extensions). Since the stated goal is to preserve the existing function's posture byte-for-byte, keeping this as-is is the right call for this PR — flagging only so it's a conscious choice and not silently inherited tech debt. Not something to change here.

3. CREATE OR REPLACE after DROP is redundant. After the DROP FUNCTION IF EXISTS, the OR REPLACE can never fire — a plain CREATE FUNCTION would be equivalent and marginally clearer about intent (you're always creating fresh here). Purely cosmetic.

4. DROP without CASCADE is the safe choice — keep it. If anything ever comes to depend on this function, the migration fails loudly rather than silently dropping dependents. Good as-is; just noting it's intentional-looking and correct.

Test coverage

The 11 assertions are well-chosen and the RRF math checks out for the seed:

  • preferred_name: vector_rank 1, text_rank NULL0.7/61 ≈ 0.01148
  • hobby: vector_rank 2, text_rank NULL0.7/62 ≈ 0.01129
  • fav_color: vector_rank NULL, text_rank 10.3/61 ≈ 0.00492

…which yields exactly the asserted preferred_name, hobby, fav_color order. Two minor notes:

  • The row_number() over () as ord in the temp table relies on the function's output order being preserved into CREATE TABLE AS. In practice Postgres preserves the function's internal ORDER BY f.score DESC, but it's not contractually guaranteed for a SELECT without its own ORDER BY. It works and is fine for a test; just a known fragility.
  • Coverage gap (matching note Bump nodemailer from 6.9.8 to 6.9.9 #1): no row exercises "has an embedding and matched only via text" to pin down the cosine_distance non-NULL / vector_rank NULL combination. Worth adding if observability.py ends up relying on that distinction.

Stacking

As called out in the description, this won't apply on main until #437 lands (the timestamp 20260626130000 correctly sorts after …110000 / …120000). The DROP IF EXISTS is keyed on the exact arg signature (vector, text, uuid, text, int), so it only succeeds if #437's function has that identical signature — worth a final check that the two signatures match byte-for-byte before merge, since a mismatch would silently no-op the DROP and then CREATE a second overload.

No callers exist in apps/web (the consumer is the separate sexycall repo), so the column-additive change is safe on the web side.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a migration to drop and recreate the public.match_agent_memories_hybrid function, exposing trailing relevance score columns (cosine distance, RRF score, vector rank, and text rank) to improve observability of retrieval quality. It also adds a comprehensive pgTAP test suite to verify the correctness of these new columns and ensure ranking behavior remains unchanged. The review feedback highlights a critical improvement opportunity: qualifying the vector type as extensions.vector in the DROP FUNCTION, CREATE FUNCTION, and GRANT statements to prevent potential schema-resolution failures when extensions is not in the active search_path.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +29 to +31
DROP FUNCTION IF EXISTS public.match_agent_memories_hybrid(
vector, text, uuid, text, int
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In Supabase, the vector extension and its associated types (like vector) are typically installed in the extensions schema. If the migration runner or session executing this SQL does not have extensions in its active search_path, referencing vector unqualified will fail with a type "vector" does not exist error. Qualifying it as extensions.vector ensures the migration is robust and schema-safe.

DROP FUNCTION IF EXISTS public.match_agent_memories_hybrid(
  extensions.vector, text, uuid, text, int
);

Comment on lines +33 to +39
CREATE OR REPLACE FUNCTION public.match_agent_memories_hybrid(
query_embedding vector(1536),
query_text text,
p_user_id uuid,
p_character_id text DEFAULT '__shared__',
match_count int DEFAULT 5
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The SET search_path attribute on the function definition only applies during the execution of the function, not during its creation. During creation, PostgreSQL parses the argument types using the session's current search_path. If extensions is not in the session's search path, the CREATE FUNCTION statement will fail. Qualifying the argument type as extensions.vector(1536) prevents this failure.

CREATE OR REPLACE FUNCTION public.match_agent_memories_hybrid(
  query_embedding extensions.vector(1536),
  query_text      text,
  p_user_id       uuid,
  p_character_id  text DEFAULT '__shared__',
  match_count     int  DEFAULT 5
)

Comment on lines +105 to +107
GRANT EXECUTE ON FUNCTION public.match_agent_memories_hybrid(
vector, text, uuid, text, int
) TO authenticated, service_role;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Similar to the DROP FUNCTION statement, the GRANT statement is executed in the session context. If extensions is not in the session's active search_path, referencing vector unqualified will fail. Qualifying it as extensions.vector ensures the grant is applied successfully.

GRANT EXECUTE ON FUNCTION public.match_agent_memories_hybrid(
  extensions.vector, text, uuid, text, int
) TO authenticated, service_role;

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 740df31383

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

WITH vector_ranked AS (
SELECT m.id,
row_number() OVER (ORDER BY m.embedding <=> query_embedding ASC) AS rank
FROM public.agent_memories m

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Add the missing agent_memories prerequisite migration

In the reviewed parent commit (12474fe), git ls-tree ... | rg 'agent_memories|match_agent_memories|20260626110000' finds no prerequisite migration or table definition, and a repo-wide search only finds this new migration/test. Because this SQL function body references public.agent_memories, a fresh Supabase migration run from main will reach this migration before the table/function introduced by the referenced #437 exists, so database resets/deploys fail instead of applying cleanly. Please include or correctly stack the prerequisite migration before this one.

Useful? React with 👍 / 👎.

@argos-ci

argos-ci Bot commented Jun 26, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build Status Details Updated (UTC)
default (Inspect) ⚠️ Changes detected (Review) 1 changed Jun 26, 2026, 6:38 PM

@pullfrog pullfrog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ No new issues found.

Reviewed changes — recreates the public.match_agent_memories_hybrid RPC to expose four already-computed-but-discarded relevance columns, plus a standalone pgTAP test.

  • Add four trailing output columnscosine_distance, rrf_score, vector_rank, text_rank are appended after memory_type / content; the fused CTE now carries v.rank / t.rank forward and the final SELECT re-derives the cosine distance per surviving row.
  • DROP + recreate keyed on the exact arg signature — required because adding RETURNS TABLE columns changes the result type, which CREATE OR REPLACE FUNCTION rejects; EXECUTE is re-granted to authenticated, service_role and the security posture (language sql, security invoker, set search_path = public, extensions) is preserved.
  • Add pgTAP test (match_agent_memories_hybrid_scores.test.sql) — 11 assertions covering row/order preservation, cosine_distance null-vs-ascending behaviour, rrf_score monotonicity, and per-branch rank nullability; not wired into Vitest CI (no Postgres), run via supabase test db.

I compared the recreated body against #437's current function definition (claude/supabase-memory): the two CTEs, the RRF score formula (0.7/(60+v.rank) + 0.3/(60+t.rank)), ORDER BY f.score DESC, and LIMIT greatest(match_count, 1) are byte-for-byte identical — ranking is genuinely unchanged, so the additive/backward-compatible claim holds. The RPC is absent from types.d.ts and unreferenced by any web-app .ts, so the new columns have no typed-surface impact.

ℹ️ Recreated body duplicates #437's function while both PRs are in flight

This migration hardcodes a full copy of match_agent_memories_hybrid and depends on #437 (claude/supabase-memory), which is still open and was already rebased once (its timestamps moved from 20260623… to 20260626…). The "additive only" guarantee rests on this copy staying a strict superset of #437's body, but nothing in the web-app CI verifies that — the pgTAP test isn't wired into Vitest (no Postgres), so a divergence would only surface when someone runs supabase test db manually.

Technical details
# Recreated body duplicates #437's function while both PRs are in flight

## Affected sites
- `apps/web/supabase/migrations/20260626130000_match_agent_memories_hybrid_scores.sql` — full recreate of the RPC, must remain a strict superset of #437's `20260626110000_create_agent_memories.sql` definition.

## Required outcome
- Before merging, re-confirm #437's final merged function body still matches the two CTEs / RRF formula / ORDER BY / LIMIT recreated here (verified identical at head 740df31 vs the current `claude/supabase-memory`).
- Merge order must be #437 first, then #444 — the migration's CREATE references `public.agent_memories` and will fail if applied on `main` before #437 lands. The author already documents this.

## Open questions for the human
- Is there appetite to wire `supabase test db` into CI later so this class of drift is caught automatically? Out of scope for this PR; noted for awareness.

Pullfrog  | ⚠️ this action is pinned to a commit SHA, which freezes the cleanup step — switch to @v0 or keep the SHA fresh with Dependabot | View workflow run | Using Claude Opus𝕏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants