Idempotency key metadata silently dropped for runs when >1000 idempotency keys are created in a single run (in-memory LRU catalog eviction) #4046

Description

@brentshulman-silkline

Summary

When a single run creates more than 1,000 idempotency keys (e.g. fanning out a large batchTrigger where each item uses idempotencyKeys.create(...)), the original key + scope metadata is silently dropped for all but the most‑recently‑created 1,000 keys. The resulting child runs end up with no idempotencyKeyOptions, so ctx.run.idempotencyKey returns the raw hash and the dashboard / run records / analytics (idempotency_key, idempotency_key_scope) show an empty idempotency key + scope — even though the idempotency hash itself is applied and dedup still works.

Root cause: the fixed-size in-memory LRUIdempotencyKeyCatalog (default maxSize = 1000).

Environment

  • @trigger.dev/sdk + @trigger.dev/core v4.4.6 (self-hosted)
  • Logic is present on current main.

Root cause

  1. createIdempotencyKey() is local — it SHA‑256 hashes the key and registers the original { key, scope } in a process-global LRU catalog:
    • packages/core/src/v3/idempotencyKeys.ts: idempotencyKeyCatalog.registerKeyOptions(idempotencyKey, { key: userKey, scope })
  2. The catalog evicts the oldest entries beyond a default cap of 1,000:
    • packages/core/src/v3/idempotency-key-catalog/lruIdempotencyKeyCatalog.ts: constructor(maxSize = 1_000); registerKeyOptions() deletes the oldest entry while this.cache.size > this.maxSize.
  3. When the SDK builds batch items, each item resolves its options via a catalog lookup:
    • packages/trigger-sdk/src/v3/shared.ts:
      const itemIdempotencyKey = await makeIdempotencyKey(item.options?.idempotencyKey);
      const idempotencyKeyOptions = itemIdempotencyKey
        ? getIdempotencyKeyOptions(itemIdempotencyKey)   // <- LRU lookup; undefined once evicted
        : undefined;
      // ...
      options: { idempotencyKey: finalIdempotencyKey?.toString(), idempotencyKeyOptions, ... }
    • The hash (idempotencyKey) is always sent, but idempotencyKeyOptions is undefined for any key that has been evicted from the LRU.

So when a run calls idempotencyKeys.create() N > 1000 times (common when fanning out a large per-item‑keyed batchTrigger), only the most-recently-created 1,000 keys still resolve options; the first N − 1000 lose their { key, scope }.
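The eviction behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the SDK source: SketchLruCatalog, getKeyOptions, and the `hash-${i}` strings are hypothetical stand-ins for the real catalog, its lookup, and the SHA-256 hashes.

```typescript
// Simplified stand-in for the catalog described above; not the actual SDK source.
type KeyOptions = { key: string; scope: "run" | "attempt" | "global" };

class SketchLruCatalog {
  private cache = new Map<string, KeyOptions>();
  private maxSize: number;

  constructor(maxSize = 1_000) {
    this.maxSize = maxSize;
  }

  registerKeyOptions(hash: string, options: KeyOptions): void {
    // Re-inserting moves the entry to the most-recent end of the Map.
    this.cache.delete(hash);
    this.cache.set(hash, options);
    // Evict the oldest entries while over capacity (a Map iterates in insertion order).
    while (this.cache.size > this.maxSize) {
      const oldest = this.cache.keys().next().value;
      if (oldest === undefined) break;
      this.cache.delete(oldest);
    }
  }

  getKeyOptions(hash: string): KeyOptions | undefined {
    return this.cache.get(hash);
  }
}

// Register 3,000 keys up front, as the reproduction does, then look them all up.
const catalog = new SketchLruCatalog();
const hashes = Array.from({ length: 3000 }, (_, i) => `hash-${i}`); // placeholder hashes
hashes.forEach((h, i) => catalog.registerKeyOptions(h, { key: `item-${i}`, scope: "global" }));

const resolved = hashes.filter((h) => catalog.getKeyOptions(h) !== undefined);
const firstResolvedIndex = hashes.findIndex((h) => catalog.getKeyOptions(h) !== undefined);
```

Because all keys are created before the batch is built, every lookup happens after the evictions, which is why only the tail of the list still resolves.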

Reproduction

Inside a task, create > 1000 keys and batch-trigger:

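// Assumes `idempotencyKeys` is imported from "@trigger.dev/sdk" and `childTask` is a task defined elsewhere.
const items = await Promise.all(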
  Array.from({ length: 3000 }, async (_, i) => ({
    payload: { i },
    options: {
      idempotencyKey: await idempotencyKeys.create([`item-${i}`], { scope: "global" }),
    },
  }))
);

await childTask.batchTrigger(items);

Observe the created child runs: the first ~2,000 show an empty idempotency key + scope (and ctx.run.idempotencyKey returns the hash), while the last ~1,000 show the original key + global scope. The boundary lands exactly at the LRU capacity (1,000), and it is the most recent 1,000 that retain metadata.

Expected vs. actual

  • Expected: key + scope reported for every run created with an idempotency key, regardless of how many keys were created in the run.
  • Actual: when a run creates N keys and N > 1,000, the first N − 1,000 runs report no key/scope.

Real-world case that led us here: a fan‑out of ~6,235 per-item global keys produced exactly 1,000 runs with key metadata and ~5,235 with none — reproducibly, across two self-hosted instances, with the keyed runs always being the most recent ~1,000.
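The counts reported here follow from simple arithmetic on the catalog capacity. A small sketch, assuming all keys are created before the batch is built (runsWithoutMetadata is a hypothetical helper, not part of the SDK):

```typescript
// Number of child runs expected to lose their { key, scope } metadata,
// given how many keys a single run created and the catalog capacity.
function runsWithoutMetadata(keysCreated: number, catalogCapacity = 1_000): number {
  return Math.max(0, keysCreated - catalogCapacity);
}

const reproLost = runsWithoutMetadata(3000); // reproduction above
const realWorldLost = runsWithoutMetadata(6235); // real-world fan-out
const underCapLost = runsWithoutMetadata(800); // below the cap
```

For the two cases in this issue the helper gives 2,000 and 5,235 affected runs, and 0 for any run that stays at or below the cap.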

Impact

The impact is on observability (in our testing, deduplication itself still worked because the hash is applied):

  • Dashboard idempotency-key filtering shows no key for the affected child runs once the parent run has created more than 1,000 keys.
  • ctx.run.idempotencyKey falls back to the hash instead of the original key.
  • Analytics idempotency_key / idempotency_key_scope come up empty.
  • idempotencyKeys.reset(...) from a raw key relies on the same catalog.

It is silent and size-dependent: works fine under 1,000 keys per run, silently degrades above it, with no warning.

Suggested fixes

  • Avoid round-tripping per-key metadata through a process-global fixed-size LRU at batch-build time — e.g. encode the scope (and/or original key) into the returned IdempotencyKey, or carry the options alongside the key, so getIdempotencyKeyOptions() doesn't depend on catalog residency.
  • Or make the catalog capacity configurable and/or scale it with batch size.
  • At minimum, document the 1,000-keys-per-run limit and emit a warning when it is exceeded.
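To illustrate the first suggestion, here is a rough sketch of carrying the options alongside the key so that batch building never consults a catalog. Everything in it is hypothetical: KeyWithOptions, createKeyWithOptions, toItemOptions, and standInHash are illustrative names, and standInHash only stands in for the real SHA-256 step.

```typescript
// Sketch only: these names are hypothetical, not the SDK's API.
type KeyScope = "run" | "attempt" | "global";
type KeyOptions = { key: string; scope: KeyScope };

// The hash travels together with the options it was derived from.
interface KeyWithOptions {
  hash: string;
  options: KeyOptions;
}

// Stand-in for the real SHA-256 hashing step.
const standInHash = (input: string): string => `hash(${input})`;

function createKeyWithOptions(userKey: string, scope: KeyScope): KeyWithOptions {
  return { hash: standInHash(`${scope}:${userKey}`), options: { key: userKey, scope } };
}

// Batch-build step: options are read from the key itself, so no catalog lookup is involved.
function toItemOptions(key: KeyWithOptions) {
  return { idempotencyKey: key.hash, idempotencyKeyOptions: key.options };
}

const builtItems = Array.from({ length: 3000 }, (_, i) => ({
  payload: { i },
  options: toItemOptions(createKeyWithOptions(`item-${i}`, "global")),
}));

// Count items whose options went missing; with this shape there should be none.
const missingOptions = builtItems.filter(
  (item) => item.options.idempotencyKeyOptions === undefined
).length;
```

With this shape the metadata survives regardless of how many keys a run creates, at the cost of changing what the key value looks like to callers; whether that trade-off is acceptable for the SDK's public types is a maintainer decision.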
