Summary
While attempting to backfill historical data into Graphiti (ingesting old session log files so that the resulting facts are stamped with the time the underlying event actually happened, not the time ingestion ran), I uncovered three independent gaps. The bi-temporal model is the differentiator that makes backfilling worthwhile: agents can reason about "what was true when," and contradictions get resolved via temporal supersedence. When importing structured data, however, we know precisely when each event happened, so there is no need to infer dates from the episode body if the timestamp can be passed in as a parameter.
I propose adding reference_time to the MCP interface, updating the prompt to tell the model to use it, and closing gaps in the existing prompt that caused the model to guess at the dates of some edges. I also suggest updating delete_episode so that it does not leave dangling references to no-longer-existing nodes in the graph.
The use case
A "memory consolidation" script runs over old Claude session logs, extracts useful context from each one, and imports it into the graph by calling add_memory on the Graphiti MCP server. Each save passes the session's actual date as the temporal reference.
For this to work end-to-end, three things must behave correctly:
- The MCP add_memory tool must accept and propagate a caller-supplied reference_time.
- When an entry has to be deleted because of a mistake or correction, the MCP delete_episode tool must cascade to the edges and entities derived from the episode (so a bad ingest can be cleanly undone and retried).
- The fact-extraction LLM must honor REFERENCE_TIME when text contains no explicit date, instead of guessing.
Currently, Graphiti lacks all three.
Gap 1: MCP add_memory throws away reference_time
graphiti_core.Graphiti.add_episode already accepts a reference_time parameter and uses it correctly. The MCP wrapper at mcp_server/src/graphiti_mcp_server.py does not expose it — add_memory has no such parameter, and the internal queue service hardcodes reference_time=datetime.now(timezone.utc).
Result: every MCP-driven ingest is stamped with the ingestion time, instead of the time when the underlying event happened. This blocks the backfill use case.
Proposed fix:
- Add optional reference_time: str | None = None (ISO-8601) to the MCP add_memory tool.
- Parse the string to datetime inside the tool (MCP can't carry datetime across the JSON boundary).
- Plumb through QueueService.add_episode, defaulting to datetime.now(timezone.utc) when the caller omits it; this preserves current behavior for every existing caller.
- Unit test that an explicit reference_time reaches the underlying add_episode call and that None falls back to a fresh now().
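The parsing step in the list above could look roughly like the following. This is a minimal sketch, not the actual patch: resolve_reference_time is a hypothetical helper name, and treating timezone-naive input as UTC is an assumption that the real tool may not want to make.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_reference_time(reference_time: Optional[str]) -> datetime:
    """Hypothetical helper: turn the tool's optional ISO-8601 string into an aware UTC datetime."""
    if reference_time is None:
        # Caller omitted the argument: fall back to ingestion time, as today.
        return datetime.now(timezone.utc)

    text = reference_time.strip()
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit offset first.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"

    parsed = datetime.fromisoformat(text)  # raises ValueError on malformed input
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are meant as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

A malformed string raises ValueError here, which the tool could catch and return as an error response rather than silently falling back to now().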
Gap 2: MCP delete_episode is shallow — leaves orphan edges and entities
The MCP delete_episode tool calls EpisodicNode.delete(client.driver), which drops only the Episodic node. The RELATES_TO edges that were extracted from the episode remain in the graph, with stale episodes arrays pointing at the now-deleted UUID. Entity nodes that were mentioned only by the deleted episode are also left behind as orphans.
graphiti_core.Graphiti.remove_episode already exists and does the cascading delete correctly: it removes the episode, the edges where this episode is the first provenance entry, and any entity nodes mentioned only by the deleted episode.
This blocks the use case at "if I make a bad ingest, I can't cleanly undo it and retry."
Proposed fix:
- One-line change in mcp_server/src/graphiti_mcp_server.py: replace the EpisodicNode.get_by_uuid + episodic_node.delete pair with await client.remove_episode(uuid).
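As a rough illustration of the shape of that change, the sketch below delegates to remove_episode and exercises the call against a stand-in client. The function signature, the returned dict, and FakeClient are all hypothetical; only the await client.remove_episode(uuid) call comes from the proposal itself.

```python
import asyncio


async def delete_episode(client, uuid: str) -> dict:
    """Sketch of the tool body; `client` is assumed to be a graphiti_core.Graphiti instance."""
    # Delegate to the cascading delete instead of dropping only the Episodic node.
    await client.remove_episode(uuid)
    return {"message": f"Episode {uuid} deleted"}  # hypothetical response shape


class FakeClient:
    """Stand-in used only to exercise the sketch without a database."""

    def __init__(self) -> None:
        self.removed = []

    async def remove_episode(self, uuid: str) -> None:
        self.removed.append(uuid)


fake = FakeClient()
result = asyncio.run(delete_episode(fake, "episode-123"))
```

A unit test for the real change could follow the same pattern: mock the client and assert that remove_episode was awaited with the requested UUID.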
Gap 3: Fact extractor hallucinates today's date for ambiguous past-tense facts
When an episode is ingested with a historical reference_time (so episode.valid_at is correctly set in the past), graphiti-core correctly passes latest_episode.valid_at to the fact-extraction LLM as REFERENCE_TIME (graphiti_core/utils/maintenance/edge_operations.py:195). But the LLM frequently ignores it and stamps extracted edges with today @ 00:00 UTC as valid_at.
Root cause is prompt-shape, not plumbing. The current prompt at graphiti_core/prompts/extract_edges.py contains contradicting rules:
- If the fact is ongoing (present tense), set `valid_at` to the timestamp of the episode the fact originates from. If no per-episode timestamp is available, use REFERENCE_TIME.
- ...
- Leave both fields `null` if no explicit or resolvable time is stated.
For past-tense completed actions without an in-text date ("Bob called Roger"), the "ongoing" trigger doesn't apply and the LLM falls to the "leave null" escape. gpt-4.1 / gpt-5-class models reliably read this as "guess" and choose today's midnight rather than emit null. The same loophole exists in the extract_timestamps and extract_timestamps_batch prompts.
Scale observed in the wild (1384 edges from 264 historical-backfill episodes on a production graph):
| Field | Pattern | Count | % |
| --- | --- | --- | --- |
| valid_at | Today midnight UTC (LLM hallucination) | 771 | 56% |
| valid_at | Legitimate in-text date | 610 | 44% |
| valid_at | null | 3 | 0.2% |
| invalid_at | null (correct) | 1016 | 73% |
| invalid_at | Today midnight UTC (LLM hallucination) | 100 | 7% |
| invalid_at | Today with microseconds (graphiti's auto-invalidation chain, legitimate) | 60 | 4% |
| invalid_at | Legitimate in-text date | 208 | 15% |
Detection signature is clean midnight UTC on the same date as created_at (hour=minute=second=nanosecond=0). Microsecond-precision today values are NOT hallucinations — they're graphiti's own supersede-chain timestamps.
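The detection signature can be expressed as a small predicate for auditing exported edges. This is a sketch under stated assumptions: is_midnight_hallucination is a hypothetical name, both inputs are assumed to be timezone-aware, and Python datetimes only carry microsecond precision, so the nanosecond check becomes a microsecond check.

```python
from datetime import datetime, time, timezone
from typing import Optional


def is_midnight_hallucination(stamp: Optional[datetime], created_at: datetime) -> bool:
    """Return True when `stamp` is exactly 00:00:00.000000 UTC on the day the edge was created."""
    if stamp is None:
        return False
    # Assumption: both datetimes are timezone-aware; normalize them to UTC.
    stamp_utc = stamp.astimezone(timezone.utc)
    created_utc = created_at.astimezone(timezone.utc)
    same_day = stamp_utc.date() == created_utc.date()
    clean_midnight = stamp_utc.time() == time(0, 0, 0, 0)
    return same_day and clean_midnight
```

Values carrying sub-second precision, such as graphiti's own supersede-chain timestamps, fail the clean-midnight test and are left alone.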
This is the gap that breaks the use case at step 3: even with reference_time plumbed and delete_episode cascading, ambiguous past-tense facts still collapse to today, erasing the temporal axis.
Proposed fix: Tighten the DATETIME RULES in all three prompts (edge, extract_timestamps, extract_timestamps_batch):
- Remove the "leave null" escape for valid_at. Make REFERENCE_TIME, which is always set by Graphiti, the mandatory fallback whenever the text has no resolvable date.
- Keep invalid_at null-by-default; only set it when the text explicitly states the fact ended or was superseded.
- Add an explicit prohibition: "NEVER use today's date, the current date, 'now', or any inferred 'current' time as valid_at or invalid_at. REFERENCE_TIME is the ONLY acceptable default; do not substitute your own notion of the present."
Prompt-only change; no code logic or schema impact.
Why these three are related but separable
All three are needed end-to-end for historical backfill, but each fix is independently valuable and independently mergeable:
- The reference_time MCP param is useful to any caller that wants to backfill: even if the prompt still hallucinates, at least the episode lands at the right time.
- The delete_episode cascade is a correctness fix that applies to any caller, not just backfill: it stops silent graph corruption from any ingest that needs to be retried.
- The prompt fix improves temporal accuracy for all ingests, not just historical ones. (Even today-aligned ingests benefit when REFERENCE_TIME genuinely is today and the LLM should anchor on it.)
Submitting as separate PRs keeps review focused and lets each merge on its own timeline.
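PR checklist
- feat(mcp): add reference_time param to add_memory -- #1490
- fix(mcp): delete_episode cascade to edges and orphan entities -- #1491
- fix(prompts): force valid_at fallback to REFERENCE_TIME, ban "today" -- #1492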
Notes for operators with existing bad data
Repairing rows already affected by Gap 3 is per-operator (not part of any of the three PRs). The signature is narrow enough to fix with two Cypher statements:
// Fix valid_at hallucinations: clean-midnight on same date as created_at
MATCH (e:Episodic)
MATCH ()-[r:RELATES_TO]->()
WHERE r.episodes[0] = e.uuid
  AND date(r.valid_at) = date(r.created_at)
  AND r.valid_at.hour = 0 AND r.valid_at.minute = 0
  AND r.valid_at.second = 0 AND r.valid_at.nanosecond = 0
  AND r.valid_at <> e.valid_at
SET r.valid_at = e.valid_at;

// Null out invalid_at hallucinations: clean-midnight only (preserves graphiti's
// auto-invalidation chain, which uses microsecond precision)
MATCH (e:Episodic)
MATCH ()-[r:RELATES_TO]->()
WHERE r.episodes[0] = e.uuid
  AND date(r.invalid_at) = date(r.created_at)
  AND r.invalid_at.hour = 0 AND r.invalid_at.minute = 0
  AND r.invalid_at.second = 0 AND r.invalid_at.nanosecond = 0
SET r.invalid_at = NULL;
The clean-midnight precision filter in the first statement is important: a looser date()-only filter would overwrite legitimate same-day text extractions.