
fix(notifications): heal NULL DeviceRegistration.accountId push drops #335

Open
lourou wants to merge 1 commit into otr-dev from fix/push-notif-null-accountid

Conversation

@lourou

@lourou lourou commented Jun 25, 2026

Member

Push notifications: heal NULL DeviceRegistration.accountId drops

Problem: pushes were silently dropped for devices whose DeviceRegistration.accountId is NULL — the isAccountIdMismatch guard (added in #272, fails closed) drops them. ~121 active devices affected (prod DB), plus a legacy tail.
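To illustrate what "fails closed" means for this guard, here is a minimal sketch. The function name, argument shape, and return convention are assumptions for illustration only; the real isAccountIdMismatch in webhook.ts may differ.

```typescript
// Hypothetical shape: the real guard may take different arguments.
type MaybeAccountId = string | null | undefined;

// Returns true when the push must be dropped.
function isAccountIdMismatchSketch(
  clientAccountId: MaybeAccountId,
  deviceAccountId: MaybeAccountId,
) {
  // Fail closed: a missing accountId on either side is treated as untrusted.
  if (clientAccountId == null || deviceAccountId == null) return true;
  // Otherwise the push is allowed only when both sides agree.
  return clientAccountId !== deviceAccountId;
}
```

Under this model a device row with a NULL accountId is always a mismatch, which is why the affected devices received no pushes.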

Root cause: accountId was never reliably written to DeviceRegistration. /v2/device/register is AppCheck-only and doesn't set it, and the /auth/token SIWE backfill updateMany no-ops when the device row doesn't exist yet (register/auth are independent and unordered).
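The ordering problem can be reproduced with an in-memory model. Everything below (the Map, the two helper functions, the ids) is a hypothetical stand-in and not the real endpoints; it only shows why an update-only backfill leaves a NULL behind when auth runs before register.

```typescript
type DeviceRowSketch = { deviceId: string; accountId: string | null };

const deviceTable = new Map<string, DeviceRowSketch>();

// Stand-in for the /auth/token backfill: it updates only rows that already exist.
function backfillAccountIdSketch(deviceId: string, accountId: string) {
  const row = deviceTable.get(deviceId);
  if (!row) return { count: 0 }; // no row yet, so nothing is written
  row.accountId = accountId;
  return { count: 1 };
}

// Stand-in for /v2/device/register: it creates the row without an accountId.
function registerDeviceSketch(deviceId: string) {
  if (!deviceTable.has(deviceId)) {
    deviceTable.set(deviceId, { deviceId, accountId: null });
  }
}

// Auth happens to run first, then register: the backfill matched zero rows.
const authFirstResult = backfillAccountIdSketch("device-1", "account-A");
registerDeviceSketch("device-1");
```

After this sequence the row exists with accountId still null, and nothing in either call path revisits it.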

Fix: subscribe.ts stamps DeviceRegistration.accountId from the authenticated subscriber, in the same transaction as the ClientIdentifier upsert (current-subscriber-wins; fails closed; no request-schema change → backwards-compatible). Self-heals every active device on its next authenticated subscribe.
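A minimal sketch of the subscribe-side write, based only on what this thread describes. The type names, the input shape, the upsert where key, and the deviceId filter on updateMany are assumptions; the real persistSubscriptionIdentity in subscribe.ts may differ in all of them.

```typescript
// Assumed argument shapes, narrowed to what this sketch needs.
type UpsertArgsSketch = {
  where: { id: string };
  create: { id: string; deviceId: string; accountId?: string };
  update: { deviceId: string; accountId?: string };
};
type UpdateManyArgsSketch = {
  where: {
    deviceId: string;
    OR: [{ accountId: null }, { accountId: { not: string } }];
  };
  data: { accountId: string };
};

// Structural stand-in for the transaction client, so tests can pass a fake.
interface SubscriptionIdentityTxSketch {
  clientIdentifier: { upsert(args: UpsertArgsSketch): Promise<unknown> };
  deviceRegistration: { updateMany(args: UpdateManyArgsSketch): Promise<unknown> };
}

async function persistSubscriptionIdentitySketch(
  tx: SubscriptionIdentityTxSketch,
  input: { clientId: string; deviceId: string; accountId?: string },
) {
  const { clientId, deviceId, accountId } = input;
  // Only include accountId when the JWT carried one, so legacy callers never write it.
  const identity = accountId !== undefined ? { accountId } : {};

  await tx.clientIdentifier.upsert({
    where: { id: clientId },
    create: { id: clientId, deviceId, ...identity },
    update: { deviceId, ...identity },
  });

  // Legacy JWT without an accountId: leave the device row untouched.
  if (accountId === undefined) return;

  // Current-subscriber-wins: adopt when the stored value is NULL or differs.
  await tx.deviceRegistration.updateMany({
    where: { deviceId, OR: [{ accountId: null }, { accountId: { not: accountId } }] },
    data: { accountId },
  });
}
```

In the handler, a helper like this would be called from inside the prisma.$transaction callback so that both writes commit or roll back together.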

Backfill migration intentionally dropped: an earlier revision included a one-shot data migration to backfill existing NULLs; it was removed because inferring current ownership from historical ClientIdentifier rows is unsound under iOS token recycling (cross-account delivery risk flagged by macroscope + the Claude review). We rely on the leak-safe subscribe-side self-heal instead.

Guard unchanged (kept fail-closed). Tests added; tsc clean.



Note

Fix NULL DeviceRegistration.accountId by stamping account during push subscription

  • Adds persistSubscriptionIdentity in subscribe.ts to upsert ClientIdentifier and conditionally update DeviceRegistration.accountId where it is NULL or stale, within a single transaction.
  • The subscribe handler now wraps both writes in prisma.$transaction, ensuring the account adoption and identity upsert are atomic.
  • accountId is omitted from ClientIdentifier create/update when undefined, preventing unguarded NULL writes.
  • A new test suite in notifications-subscribe-identity.test.ts covers authenticated/unauthenticated paths and verifies transactional wrapping.

Macroscope summarized b126653.

Summary by CodeRabbit

  • Bug Fixes
    • Improved notification subscription handling to save subscription and device identity together in one transaction.
    • When signed in, device records are now linked to the current account more reliably, reducing mismatched subscription ownership.
    • Added stronger validation around transaction behavior to ensure both updates use the same database session.

@lourou lourou requested a review from a team as a code owner June 25, 2026 16:08
@claude

claude Bot commented Jun 25, 2026


Claude finished @lourou's task —— View job


Code Review Complete ✅

I've thoroughly reviewed this PR. The implementation is well-designed and thoughtful, with excellent documentation and test coverage. Below are my findings across the requested areas:


Code Quality & Best Practices ✨

Strengths:

  • Excellent documentation: The 22-line docstring in persistSubscriptionIdentity (lines 26-51) clearly explains the "why" behind the approach, safety properties, and edge cases
  • Clean abstraction: The SubscriptionIdentityTx interface (lines 15-24) enables unit testing without database dependencies - very smart
  • Defensive programming: The conditional spread ...(accountId !== undefined ? { accountId } : {}) ensures legacy JWTs don't clobber existing values
  • Atomic operations: Using updateMany with a conditional predicate (lines 80-86) avoids read-modify-write races
  • Comprehensive tests: 222 lines of test coverage including authenticated/unauthenticated paths, transaction wiring verification, and edge case handling

Minor suggestions:

  1. TypeScript style consistency: The function signature includes an explicit return type (line 53), but per CLAUDE.md guidelines, functions in src/**/*.ts should let TypeScript infer the return type. Consider removing : Promise<void> to match project conventions.

Potential Bugs & Issues 🐛

Critical finding from existing review comments:

  • Migration removed (good decision): The PR description mentions an earlier revision included a backfill migration that was correctly removed due to iOS token recycling risks. Macroscope's review comment flagged this as a cross-account delivery risk - the author made the right call to remove it and rely on the subscribe-side self-heal instead.

No new bugs identified in the current implementation:

  • The OR: [{ accountId: null }, { accountId: { not: accountId } }] predicate correctly handles both NULL and mismatched cases
  • Transaction boundaries are properly enforced
  • Compensation logic exists for remote call failures (lines 206-220, 242-257)
  • The fail-closed guard in webhook.ts:isAccountIdMismatch ensures safety
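The adoption predicate above can be modelled in memory to see which rows it touches. This is a sketch with hypothetical names, not the Prisma query itself. It assumes, as in SQL, that a "not equal" comparison does not match NULL, which is why the predicate needs the explicit NULL branch.

```typescript
type AdoptionRowSketch = { deviceId: string; accountId: string | null };

// Models: OR: [{ accountId: null }, { accountId: { not: accountId } }]
function needsAdoptionSketch(row: AdoptionRowSketch, accountId: string) {
  const isNull = row.accountId === null;
  // Mimic SQL semantics: "not equal" is only true for non-NULL values.
  const differs = row.accountId !== null && row.accountId !== accountId;
  return isNull || differs;
}

// Models a conditional updateMany scoped to one device; returns the affected count.
function adoptAccountSketch(rows: AdoptionRowSketch[], deviceId: string, accountId: string) {
  let count = 0;
  for (const row of rows) {
    if (row.deviceId === deviceId && needsAdoptionSketch(row, accountId)) {
      row.accountId = accountId;
      count += 1;
    }
  }
  return { count };
}
```

Running the adoption a second time with the same accountId matches nothing, which is the no-op behavior described for devices that are already correct.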

Performance Considerations ⚡

Efficient implementation:

  • Single conditional UPDATE: The updateMany approach (lines 80-86) is optimal - it's a single SQL statement that only touches rows needing changes, avoiding a SELECT-then-UPDATE pattern
  • No N+1 queries: All operations are atomic within the transaction
  • Index coverage: The schema shows @@index([accountId]) on both DeviceRegistration (line 49) and ClientIdentifier (line 64), ensuring the WHERE clause is indexed

Scale considerations:

  • The transaction wraps two simple operations (one upsert, one conditional update) - minimal lock contention
  • For devices already on the correct accountId, the updateMany becomes a no-op thanks to the { not: accountId } predicate

Security Concerns 🔒

Strong security posture:

  1. Fail-closed guard: The isAccountIdMismatch function in webhook.ts:60-88 treats NULL on either side as untrusted and drops the push - this is the correct security boundary
  2. JWT-sourced accountId: Line 232 shows accountId = res.locals.accountId comes from authenticated JWT, not request body - prevents spoofing
  3. Current-subscriber-wins adoption: The docstring (line 38-40) correctly notes this is safe because the guard fails closed - a wrong accountId can only DROP pushes, never deliver cross-account
  4. No clobbering of authenticated state: Legacy JWTs (accountId: undefined) skip both writes entirely (lines 79-87, 66-70), preserving any prior SIWE-authenticated values

The removed migration was the right security decision - inferring ownership from historical ClientIdentifier rows is unsound under iOS token recycling, as multiple reviewers flagged.


Test Coverage 🧪

Excellent coverage (222 lines):

  1. Authenticated subscribe (test lines 38-70): Validates both ClientIdentifier.accountId and DeviceRegistration.accountId are stamped correctly
  2. Conditional update predicate (test lines 72-94): Confirms the WHERE clause matches NULL and differing accounts, excludes matching ones
  3. Unauthenticated/legacy path (test lines 96-120): Ensures accountId: undefined doesn't clobber existing values and skips device write entirely
  4. Transaction wiring (test lines 122-139, 192-222): Proves both writes use the same transaction client (not global prisma)
  5. Handler-level integration (test lines 192-222): Verifies the subscribe handler actually wraps calls in prisma.$transaction

Coverage gaps (minor):

  • No test for the case where DeviceRegistration.accountId starts NULL and gets stamped - though the predicate test covers the WHERE clause logic
  • No test for the compensation/rollback paths (lines 242-257) when DB transaction fails - these are integration-level concerns

Additional Observations 📋

Backwards compatibility (per CLAUDE.md requirements):

  • ✅ No changes to request schema - fully backwards-compatible with shipped clients
  • ✅ The fix is self-healing on the next authenticated subscribe - no forced updates needed

Code follows repository conventions:

  • Uses existing prisma import from @/utils/prisma
  • Matches error handling patterns in webhook.ts
  • Logging structure consistent with existing handlers (lines 120-132, 259-266)

Documentation quality:

  • The PR description clearly explains problem, root cause, fix, and why the migration was dropped
  • Inline comments explain the "why" not just the "what"
  • Test file includes a clear header comment explaining the contract being tested (lines 8-11)

Summary

This is a high-quality fix that addresses a real production issue (~121 affected devices) with a thoughtful, security-conscious approach:

  • ✅ Self-heals NULL accountId devices on their next authenticated subscribe
  • ✅ Fail-closed security posture prevents cross-account leaks
  • ✅ Excellent test coverage and documentation
  • ✅ No backwards-incompatible changes
  • ✅ Smart decision to remove the unsafe backfill migration

Recommendation: The code is ready to merge pending any additional reviewer-specific concerns. The only nitpick is the explicit return type annotation, which is a minor style inconsistency with project conventions but doesn't affect correctness.


@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Contributor


Walkthrough

The subscribe handler now persists subscription identity inside one Prisma transaction. A new helper upserts ClientIdentifier, conditionally updates DeviceRegistration.accountId, and tests cover authenticated, unauthenticated, and handler transaction wiring paths.

Changes

Subscription identity transaction flow

| Layer / File(s) | Summary |
|---|---|
| Transaction helper and handler (src/api/v2/notifications/handlers/subscribe.ts) | SubscriptionIdentityTx and persistSubscriptionIdentity upsert ClientIdentifier and conditionally update DeviceRegistration.accountId inside the provided transaction client; the subscribe handler now calls the helper through prisma.$transaction. |
| Identity persistence tests (tests/notifications-subscribe-identity.test.ts) | Vitest coverage exercises authenticated and unauthenticated persistSubscriptionIdentity cases and the subscribe handler's single-transaction wiring. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant SubscribeHandler as "subscribe handler"
  participant PrismaTransaction as "prisma.$transaction"
  participant PersistSubscriptionIdentity as persistSubscriptionIdentity
  participant ClientIdentifierUpsert as "clientIdentifier.upsert"
  participant DeviceRegistrationUpdateMany as "deviceRegistration.updateMany"
  Client->>SubscribeHandler: submit subscription identity
  SubscribeHandler->>PrismaTransaction: call transaction with tx callback
  PrismaTransaction->>PersistSubscriptionIdentity: invoke callback with tx
  PersistSubscriptionIdentity->>ClientIdentifierUpsert: upsert deviceId and accountId
  opt accountId provided
    PersistSubscriptionIdentity->>DeviceRegistrationUpdateMany: updateMany accountId adoption predicate
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • xmtplabs/convos-backend#272: Also modifies /v2/notifications/subscribe to persist ClientIdentifier.accountId, matching the transactional identity persistence path here.

Poem

I hopped through one transaction, snug and neat,
Upserting identities on rabbit feet.
A twitch for clientIdentifier, a thump for the rest,
DeviceRegistration got its accountId quest.
Under the moon, my whiskers approved the greet.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly matches the main fix: handling NULL DeviceRegistration.accountId to prevent notification drops. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

@macroscopeapp

macroscopeapp Bot commented Jun 25, 2026


Approvability

Verdict: Needs human review

This PR changes runtime behavior in the notification subscribe flow by now writing DeviceRegistration.accountId when it's NULL or differs from the subscriber. This gates push delivery logic and affects which notifications get delivered vs dropped. The changed files are owned by xmtplabs/engineering, who should review this behavioral change.


@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 2

🧹 Nitpick comments (3)
src/api/v2/notifications/handlers/subscribe.ts (2)

63-67: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Omit accountId from create when it is absent.

The update branch already avoids writing accountId for legacy JWTs; the create branch still passes accountId: undefined, relying on Prisma to omit it. Use the same conditional spread here and update the test expectation accordingly.

Suggested fix
     create: {
       id: clientId,
       deviceId,
-      accountId,
+      ...(accountId !== undefined ? { accountId } : {}),
     },
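The difference between the two forms in the suggested fix can be checked in isolation. The builder below is a hypothetical helper written for illustration; it is not code from subscribe.ts.

```typescript
// Hypothetical builder mirroring the suggested create payload.
function buildCreatePayloadSketch(clientId: string, deviceId: string, accountId?: string) {
  return {
    id: clientId,
    deviceId,
    // The key is present only when the caller actually supplied an accountId.
    ...(accountId !== undefined ? { accountId } : {}),
  };
}

const legacyPayload = buildCreatePayloadSketch("client-1", "device-1");
const signedInPayload = buildCreatePayloadSketch("client-1", "device-1", "account-A");

// For contrast: passing the value directly keeps the key, set to undefined.
const directPayload = { id: "client-1", deviceId: "device-1", accountId: undefined };
```

With the conditional spread, a test can assert on the exact payload shape without depending on how the ORM treats undefined fields.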
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/api/v2/notifications/handlers/subscribe.ts` around lines 63 - 67, The
subscribe handler’s create payload is still always including accountId, which
can pass undefined for legacy JWTs; update the create branch in subscribe.ts to
use the same conditional spread approach as the update branch so accountId is
only present when defined, and adjust the related subscribe test expectation to
match the omitted field behavior.

53-58: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Let TypeScript infer this helper’s return type.

Remove : Promise<void> to match the project’s src/**/* TypeScript style. As per coding guidelines, "src/**/*.{ts,tsx}: Don't specify return type on functions. Prefer inferring the value in TypeScript".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/api/v2/notifications/handlers/subscribe.ts` around lines 53 - 58, The
helper persistSubscriptionIdentity currently hardcodes a Promise<void> return
annotation, which conflicts with the TypeScript style used under src/**. Remove
the explicit return type from the function signature and let TypeScript infer
it, keeping the change localized to persistSubscriptionIdentity and any similar
helper signatures in this area if present.

Source: Coding guidelines

tests/notifications-subscribe-identity.test.ts (1)

121-138: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy lift

Cover the handler’s $transaction wiring, not just the helper.

This proves the helper uses the injected tx, but it would still pass if subscribe stopped wrapping it in prisma.$transaction. Add a handler-level test that mocks prisma.$transaction and asserts the callback uses that transaction client.
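One way to check the wiring is a fake client whose $transaction records that it was called and hands a distinct tx object to the callback. The sketch below is framework-free and every name in it is hypothetical; the actual test in this PR uses Vitest mocks against the real subscribe handler.

```typescript
type FakeTxSketch = { writes: string[] };

function makeFakePrismaSketch() {
  const tx: FakeTxSketch = { writes: [] };
  let transactionCalls = 0;
  return {
    tx,
    transactionCalls: () => transactionCalls,
    // Mirrors the callback form of prisma.$transaction: run fn with the tx client.
    async $transaction<T>(fn: (client: FakeTxSketch) => Promise<T>) {
      transactionCalls += 1;
      return fn(tx);
    },
  };
}

// Hypothetical stand-in for the persistence step of the subscribe handler.
async function subscribeSketch(prisma: ReturnType<typeof makeFakePrismaSketch>) {
  await prisma.$transaction(async (client) => {
    client.writes.push("clientIdentifier.upsert");
    client.writes.push("deviceRegistration.updateMany");
  });
}
```

A test built this way fails if the handler stops calling $transaction or routes either write through a different client.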

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/notifications-subscribe-identity.test.ts` around lines 121 - 138, The
current test only verifies persistSubscriptionIdentity uses the injected tx, but
it does not prove subscribe still wraps the helper in prisma.$transaction. Add a
handler-level test around subscribe that mocks prisma.$transaction and asserts
the callback receives and uses the transaction client, so both writes are
verified through the same transaction path. Use the existing subscribe and
persistSubscriptionIdentity symbols to place the new assertion alongside the
current transaction-client coverage.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dev/scripts/backfillDeviceAccountId.ts`:
- Around line 94-101: The shutdown flow in main() currently ignores the Prisma
disconnect promise by using void in the finally block, which can hide cleanup
failures. Update the promise chain around main() so prisma.$disconnect() is
returned from the finally handler and any disconnect error is handled in the
following catch, keeping the existing logger and process exit behavior intact.

In
`@prisma/migrations/20260625120000_backfill_device_registration_account_id/migration.sql`:
- Around line 41-45: The backfill query in the migration can still select a
unanimous `ClientIdentifier.accountId` value that does not exist in `Account`,
which will later fail when updating `DeviceRegistration.accountId`. Update the
`SELECT` used by the backfill to keep the unanimity check in `ClientIdentifier`,
then add an `Account` join or `EXISTS` filter before the update so only account
IDs present in `Account` are applied. Use the migration’s backfill query block
and the `ClientIdentifier`/`Account` tables as the fix point.
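For the first inline comment, the suggested shutdown shape can be sketched as follows. The runner, its parameters, and the log array are hypothetical stand-ins for the script's main(), prisma.$disconnect(), and logger; the script itself was later removed from the branch.

```typescript
// Hypothetical runner: main and disconnect stand in for the script's own functions.
async function runScriptSketch(
  main: () => Promise<void>,
  disconnect: () => Promise<void>,
  log: string[],
) {
  let exitCode = 0;
  await main()
    // Returning the promise makes a failed disconnect reject the chain
    // instead of being silently discarded.
    .finally(() => disconnect())
    .catch((err: unknown) => {
      log.push(`failed: ${String(err)}`);
      exitCode = 1;
    });
  return exitCode;
}
```

Because the catch comes after the finally, it observes both a failure in main and a failure during disconnect.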


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a396357d-3e2c-4872-b794-9a9345de702c

📥 Commits

Reviewing files that changed from the base of the PR and between 203e50b and 73b0a00.

📒 Files selected for processing (4)
  • dev/scripts/backfillDeviceAccountId.ts
  • prisma/migrations/20260625120000_backfill_device_registration_account_id/migration.sql
  • src/api/v2/notifications/handlers/subscribe.ts
  • tests/notifications-subscribe-identity.test.ts

Comment thread dev/scripts/backfillDeviceAccountId.ts Outdated
@lourou lourou requested a review from fbac June 25, 2026 17:16
@lourou lourou force-pushed the fix/push-notif-null-accountid branch from 73b0a00 to 2751220 Compare June 25, 2026 17:20
@lourou lourou force-pushed the fix/push-notif-null-accountid branch from 2751220 to 8203c79 Compare June 26, 2026 13:15
A NULL DeviceRegistration.accountId was silently dropped by the
isAccountIdMismatch webhook guard (it fails closed unless the
ClientIdentifier and DeviceRegistration accountIds are equal and
non-null), so an affected device received no pushes at all.

On an authenticated subscribe, write the subscriber's accountId onto
the device transactionally alongside the ClientIdentifier upsert:
set DeviceRegistration.accountId when it is NULL or differs from the
subscriber ("current-subscriber-wins"). Only an accountId-bearing JWT
mutates anything, so legacy/non-SIWE builds never clobber a prior or
backfilled value. The guard still fails closed, so a stale device
accountId can only ever drop a push, never deliver cross-account; the
state self-heals on the next authenticated subscribe.

Logic is extracted into persistSubscriptionIdentity and unit-tested
(adopt-on-null, overwrite-on-differ, untouched-without-accountId,
same-transaction), plus a handler-level test asserting subscribe wraps
both writes in one prisma.$transaction.
@lourou lourou force-pushed the fix/push-notif-null-accountid branch from 8203c79 to b126653 Compare June 26, 2026 13:34
@lourou

lourou commented Jun 26, 2026

Member Author

@claude re the iOS token-recycling leak (Critical #2) and the Account FK guard (Critical #1): both are about the backfill migration, which we've removed from the branch. We took the "skip the migration, rely on the subscribe-time fix" option — so the token-recycling leak and the FK concern are both moot. Updated the PR description to match. Guard stays fail-closed.

@lourou

lourou commented Jun 26, 2026

Member Author

Addressed in b126653 — conditional-spread on the create branch, dropped the explicit return type, and added a handler-level $transaction test.

2 participants