[Bug + Feature] Historical backfill exposes three temporal-correctness gaps

## Summary

While attempting to **backfill historical data** into Graphiti (ingesting old session log files so that the resulting facts are stamped with the time the underlying event actually happened, not the time ingestion ran), I uncovered three independent gaps. The bi-temporal model is the differentiator that makes backfilling worthwhile: agents can reason about "what was true when," and contradictions get resolved via temporal supersedence. When importing structured data, however, we know precisely when each event happened, so there is no need to infer dates from the episode body, provided the date can be passed in as a parameter.

I propose adding a `reference_time` parameter to the MCP interface, updating the prompts to tell the model to use it, and closing gaps in the existing prompts that cause the model to guess the dates of some extracted edges. I also suggest updating `delete_episode` so that it does not leave dangling references to deleted nodes in the graph.

## The use case

A "memory consolidation" script runs over old Claude session logs, extracts useful context from each one, and imports it into the graph by calling `add_memory` on the Graphiti MCP server. Each call passes the session's actual date as the temporal reference.
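As a rough illustration of what such a script would send per session, the sketch below builds the argument dictionary for one `add_memory` call. The helper name `build_add_memory_args` is hypothetical, the `name` and `episode_body` field names are assumptions about the tool's schema, and `reference_time` is the parameter proposed in this issue, not one the MCP tool accepts today.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_add_memory_args(log_path: Path, summary: str, session_time: datetime) -> dict:
    """Build the arguments for one add_memory call (hypothetical helper).

    The field names `name` and `episode_body` are assumptions; `reference_time`
    is the proposed parameter and is sent as an ISO-8601 string in UTC.
    """
    if session_time.tzinfo is None:
        # Refuse naive datetimes so the timezone is never guessed silently.
        raise ValueError("session_time must be timezone-aware")
    return {
        "name": log_path.stem,
        "episode_body": summary,
        "reference_time": session_time.astimezone(timezone.utc).isoformat(),
    }


example_args = build_add_memory_args(
    Path("logs/session-2024-03-01.jsonl"),
    "Bob called Roger",
    datetime(2024, 3, 1, 9, 30, tzinfo=timezone.utc),
)
print(example_args["reference_time"])  # 2024-03-01T09:30:00+00:00
```

Sending the timestamp as a string matches the constraint that `datetime` objects cannot cross the JSON boundary.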

For this to work end-to-end, three things need to behave correctly:

1. The MCP `add_memory` tool must accept and propagate a caller-supplied `reference_time`.
2. The MCP `delete_episode` tool must cascade to the edges and entities derived from the episode, so that a bad ingest can be cleanly undone and retried.
3. The fact-extraction LLM must honor `REFERENCE_TIME` when text contains no explicit date, instead of guessing.

Currently, Graphiti lacks all three.

---

## Gap 1: MCP `add_memory` throws away `reference_time`

`graphiti_core.Graphiti.add_episode` already accepts a `reference_time` parameter and uses it correctly. The MCP wrapper at `mcp_server/src/graphiti_mcp_server.py` does not expose it — `add_memory` has no such parameter, and the internal queue service hardcodes `reference_time=datetime.now(timezone.utc)`.

Result: every MCP-driven ingest is stamped with the ingestion time, instead of the time when the underlying event happened. This blocks the backfill use case.

**Proposed fix:**
- Add optional `reference_time: str | None = None` (ISO-8601) to the MCP `add_memory` tool.
- Parse the string to `datetime` inside the tool (MCP can't carry `datetime` across the JSON boundary).
- Plumb through `QueueService.add_episode`, defaulting to `datetime.now(timezone.utc)` when the caller omits it — preserves current behavior for every existing caller.
- Unit test that an explicit `reference_time` reaches the underlying `add_episode` call and that `None` falls back to a fresh `now()`.
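
The parsing and fallback steps above could look roughly like the sketch below. The function name `parse_reference_time` is hypothetical, and treating a timestamp without an offset as UTC is an assumption of this sketch, not something the issue specifies.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_reference_time(value: Optional[str]) -> datetime:
    """Return a timezone-aware UTC datetime for an optional ISO-8601 string.

    Hypothetical helper: falls back to the current time when the caller
    omits the value, which preserves today's behavior.
    """
    if value is None:
        return datetime.now(timezone.utc)
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption of this sketch: a timestamp without an offset means UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

A malformed string raises `ValueError` from `datetime.fromisoformat`; whether the tool should surface that as an error response or fall back to `now()` is a design choice left open here.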

## Gap 2: MCP `delete_episode` is shallow — leaves orphan edges and entities

The MCP `delete_episode` tool calls `EpisodicNode.delete(client.driver)`, which drops only the `Episodic` node. The `RELATES_TO` edges that were extracted from the episode remain in the graph, with stale `episodes` arrays pointing at the now-deleted UUID. `Entity` nodes that were mentioned **only** by the deleted episode are also left behind as orphans.

`graphiti_core.Graphiti.remove_episode` already exists and does the cascading delete correctly: it removes the episode, the edges where this episode is the first provenance entry, and any entity nodes mentioned only by the deleted episode.
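
To make the difference between the two behaviors concrete, here is a toy in-memory model of the rules described above. It is not Graphiti code: `ToyGraph` and all of its methods are hypothetical, and the real implementation operates on the graph database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToyGraph:
    """In-memory stand-in for the graph; not Graphiti code."""

    episodes: Set[str] = field(default_factory=set)
    # edge id -> ordered episode ids (index 0 is the episode that created the edge)
    edge_episodes: Dict[str, List[str]] = field(default_factory=dict)
    # entity id -> ids of the episodes that mention it
    entity_mentions: Dict[str, Set[str]] = field(default_factory=dict)

    def shallow_delete(self, episode: str) -> None:
        """Drop only the episode and its mention links."""
        self.episodes.discard(episode)
        for mentions in self.entity_mentions.values():
            mentions.discard(episode)

    def cascade_delete(self, episode: str) -> None:
        """Drop the episode, the edges it created, and entities only it mentioned."""
        only_here = {n for n, m in self.entity_mentions.items() if m == {episode}}
        self.edge_episodes = {
            e: eps for e, eps in self.edge_episodes.items() if eps[:1] != [episode]
        }
        for entity in only_here:
            del self.entity_mentions[entity]
        self.shallow_delete(episode)

    def stale_edges(self) -> List[str]:
        """Edges whose provenance list still names a deleted episode."""
        return sorted(
            e for e, eps in self.edge_episodes.items()
            if any(ep not in self.episodes for ep in eps)
        )

    def orphan_entities(self) -> List[str]:
        """Entities that no remaining episode mentions."""
        return sorted(n for n, m in self.entity_mentions.items() if not m)


def build_example() -> ToyGraph:
    return ToyGraph(
        episodes={"ep1", "ep2"},
        edge_episodes={"bob_called_roger": ["ep1"], "roger_works_at_acme": ["ep2"]},
        entity_mentions={"bob": {"ep1"}, "roger": {"ep1", "ep2"}, "acme": {"ep2"}},
    )


shallow = build_example()
shallow.shallow_delete("ep1")
print(shallow.stale_edges(), shallow.orphan_entities())  # ['bob_called_roger'] ['bob']

cascaded = build_example()
cascaded.cascade_delete("ep1")
print(cascaded.stale_edges(), cascaded.orphan_entities())  # [] []
```

The toy only removes edges whose first provenance entry is the deleted episode, mirroring the description above; how edges that list the episode at a later index should be treated is outside what this sketch models.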

This blocks the use case at "if I make a bad ingest, I can't cleanly undo it and retry."

**Proposed fix:**
- One-line change in `mcp_server/src/graphiti_mcp_server.py`: replace the `EpisodicNode.get_by_uuid` + `episodic_node.delete` pair with `await client.remove_episode(uuid)`.

## Gap 3: Fact extractor hallucinates today's date for ambiguous past-tense facts

When an episode is ingested with a historical `reference_time` (so `episode.valid_at` is correctly set in the past), graphiti-core correctly passes `latest_episode.valid_at` to the fact-extraction LLM as `REFERENCE_TIME` (`graphiti_core/utils/maintenance/edge_operations.py:195`). But the LLM frequently ignores it and stamps extracted edges with **today @ 00:00 UTC** as `valid_at`.

Root cause is **prompt-shape**, not plumbing. The current prompt at `graphiti_core/prompts/extract_edges.py` contains rules that, taken together, leave a loophole:

```
- If the fact is ongoing (present tense), set `valid_at` to the timestamp of the episode the fact originates from. If no per-episode timestamp is available, use REFERENCE_TIME.
- ...
- Leave both fields `null` if no explicit or resolvable time is stated.
```

For past-tense completed actions without an in-text date ("Bob called Roger"), the "ongoing" trigger doesn't apply, so the LLM falls through to the "leave null" escape. In practice, gpt-4.1 / gpt-5-class models reliably read this as "guess" and choose today's midnight rather than emit null. The same loophole exists in the `extract_timestamps` and `extract_timestamps_batch` prompts.

**Scale observed in the wild** (1384 edges from 264 historical-backfill episodes on a production graph):

| Field | Pattern | Count | % |
|---|---|---:|---:|
| `valid_at` | Today midnight UTC (LLM hallucination) | 771 | **56%** |
| `valid_at` | Legitimate in-text date | 610 | 44% |
| `valid_at` | null | 3 | 0.2% |
| `invalid_at` | null (correct) | 1016 | 73% |
| `invalid_at` | Today midnight UTC (LLM hallucination) | 100 | 7% |
| `invalid_at` | Today with microseconds (graphiti's auto-invalidation chain — legitimate) | 60 | 4% |
| `invalid_at` | Legitimate in-text date | 208 | 15% |

Detection signature is **clean midnight UTC on the same date as `created_at`** (hour=minute=second=nanosecond=0). Microsecond-precision today values are NOT hallucinations — they're graphiti's own supersede-chain timestamps.
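
For operators who want to audit exported edges outside the database, the signature can be expressed as a small predicate. This is a sketch: the function name `is_midnight_hallucination` is hypothetical, it assumes both timestamps are timezone-aware, and Python datetimes only carry microsecond precision, so the nanosecond check collapses to a microsecond check.

```python
from datetime import datetime, time, timezone
from typing import Optional


def is_midnight_hallucination(value: Optional[datetime], created_at: datetime) -> bool:
    """True when `value` is exactly 00:00:00 UTC on the same UTC date as `created_at`.

    Both arguments are assumed to be timezone-aware; a naive datetime would be
    interpreted as local time by astimezone().
    """
    if value is None:
        return False
    value_utc = value.astimezone(timezone.utc)
    created_utc = created_at.astimezone(timezone.utc)
    same_day = value_utc.date() == created_utc.date()
    # time(0, 0) has zero seconds and microseconds, so any sub-second
    # precision (the supersede-chain case) fails this comparison.
    clean_midnight = value_utc.time() == time(0, 0)
    return same_day and clean_midnight


created = datetime(2025, 6, 10, 14, 22, 5, 123456, tzinfo=timezone.utc)
print(is_midnight_hallucination(datetime(2025, 6, 10, tzinfo=timezone.utc), created))  # True
print(is_midnight_hallucination(datetime(2025, 6, 10, 14, 22, 5, 200000, tzinfo=timezone.utc), created))  # False
```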

This is the gap that breaks the use case at step 3: even with `reference_time` plumbed and `delete_episode` cascading, ambiguous past-tense facts still collapse to today, erasing the temporal axis.

**Proposed fix:** Tighten the DATETIME RULES in all three prompts (`edge`, `extract_timestamps`, `extract_timestamps_batch`):

1. Remove the "leave null" escape for `valid_at`. Make `REFERENCE_TIME`, which is always set by Graphiti, the mandatory fallback whenever the text has no resolvable date.
2. Keep `invalid_at` null-by-default — only set when the text explicitly states the fact ended or was superseded.
3. Add an explicit prohibition: *"NEVER use today's date, the current date, 'now', or any inferred 'current' time as `valid_at` or `invalid_at`. REFERENCE_TIME is the ONLY acceptable default — do not substitute your own notion of the present."*

Prompt-only change; no code logic or schema impact.

---

## Why these three are related but separable

All three are needed end-to-end for historical backfill, but each fix is **independently valuable** and **independently mergeable**:

- The `reference_time` MCP param is useful to any caller that wants to backfill — even if the prompt still hallucinates, at least the *episode* lands at the right time.
- The `delete_episode` cascade is a correctness fix that applies to any caller, not just backfill — it stops silent graph corruption from any ingest that needs to be retried.
- The prompt fix improves temporal accuracy for **all** ingests, not just historical ones. (Even today-aligned ingests benefit when REFERENCE_TIME genuinely *is* today and the LLM should anchor on it.)

Submitting as separate PRs keeps review focused and lets each merge on its own timeline.

## PR checklist

- [x] **PR 1 — `feat(mcp): add reference_time param to add_memory`** -- [#1490](https://github.com/getzep/graphiti/pull/1490)
- [x] **PR 2 — `fix(mcp): delete_episode cascade to edges and orphan entities`** -- [#1491](https://github.com/getzep/graphiti/pull/1491)
- [x] **PR 3 — `fix(prompts): force valid_at fallback to REFERENCE_TIME, ban "today"`** -- [#1492](https://github.com/getzep/graphiti/pull/1492)

## Notes for operators with existing bad data

Repairing edges already affected by Gap 3 is left to each operator and is not part of any of the three PRs. The signature is narrow enough to fix with two Cypher statements:

```cypher
// Fix valid_at hallucinations: clean-midnight on same date as created_at
MATCH (e:Episodic)
MATCH ()-[r:RELATES_TO]->()
WHERE r.episodes[0] = e.uuid
  AND date(r.valid_at) = date(r.created_at)
  AND r.valid_at.hour = 0 AND r.valid_at.minute = 0
  AND r.valid_at.second = 0 AND r.valid_at.nanosecond = 0
  AND r.valid_at <> e.valid_at
SET r.valid_at = e.valid_at;

// Null out invalid_at hallucinations: clean-midnight only (preserves graphiti's
// auto-invalidation chain, which uses microsecond precision)
MATCH (e:Episodic)
MATCH ()-[r:RELATES_TO]->()
WHERE r.episodes[0] = e.uuid
  AND date(r.invalid_at) = date(r.created_at)
  AND r.invalid_at.hour = 0 AND r.invalid_at.minute = 0
  AND r.invalid_at.second = 0 AND r.invalid_at.nanosecond = 0
SET r.invalid_at = NULL;
```

The clean-midnight precision filter in the first statement is important: a looser `date()`-only filter would overwrite legitimate same-day dates extracted from the text.

