feat(memory_fs): LLM synthesis of MEMORY/SKILL from descriptions #436
Open
sairin1202 wants to merge 2 commits into
…ions

Add an opt-in synthesis mode where MEMORY.md and the skill/ tree are generated directly from the shared per-source descriptions by an LLM, instead of being rendered from already-extracted records.

- New `MemorySynthesizer` (+ prompts in `memu.prompts.memory_fs`) turns the description trunk into a consolidated memory doc and a JSON list of skills.
- Exporter gains `memory_body` / `skills` overrides; INDEX.md stays deterministic in both modes.
- Gated by `memory_files_config.synthesize` (default off) and `synthesis_llm_profile`; existing memorize/extract pipeline is untouched.

Tests use a fake LLM client (no network).

Co-authored-by: Cursor <cursoragent@cursor.com>
Mirror the "submit the changed part of the file system" model:

- First run (no MEMORY.md/skill tree) initializes the tree from all in-scope descriptions; subsequent runs incrementally merge only the changed sources' descriptions into the existing MEMORY.md body and skill bodies (upsert by slug).
- Add `MemorySynthesizer.update()` with `MEMORY_UPDATE_PROMPT` / `SKILL_UPDATE_PROMPT`.
- Add exporter read helpers (`artifacts_exist` / `read_memory_body` / `read_skills`) to feed the update path from disk.
- `MemoryService._build_memory_files()` centralizes the init-vs-update decision; `export_memory_files()` keeps doing a full (re)initialization.
- Gate an optional post-memorize hook behind `memory_files_config.update_on_memorize` that drives the builder with the just-created resources; best-effort so an export error never fails memorize.

INDEX.md stays deterministic (recomputed from the current source set).

Co-authored-by: Cursor <cursoragent@cursor.com>
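The init-vs-update decision and the "upsert by slug" step from the second commit can be sketched as follows. This is only an illustration under stated assumptions: `SkillTree`, `choose_mode`, and `upsert_skills` are hypothetical names, the tree is held in memory rather than on disk, and a changed skill's new body simply replaces the old one, whereas the PR describes merging bodies through an LLM.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SkillTree:
    """Hypothetical in-memory stand-in for the on-disk MEMORY.md + skill/ tree."""

    memory_body: Optional[str] = None
    skills: Dict[str, str] = field(default_factory=dict)  # slug -> SKILL.md body

    def artifacts_exist(self) -> bool:
        # Treat any existing memory body or skill as "the tree was initialized".
        return self.memory_body is not None or bool(self.skills)


def choose_mode(tree: SkillTree) -> str:
    """Return "init" on a first run and "update" once artifacts exist."""
    return "update" if tree.artifacts_exist() else "init"


def upsert_skills(existing: Dict[str, str], changed: Dict[str, str]) -> Dict[str, str]:
    """Upsert by slug: overwrite matching slugs, add new ones, keep the rest.

    Assumption: plain replacement stands in for the LLM merge of skill bodies.
    """
    merged = dict(existing)  # copy so the caller's mapping is not mutated
    merged.update(changed)
    return merged
```

With this shape, skills whose sources did not change are carried over untouched, which matches the "submit only the changed part" intent.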
Summary
Makes the MEMORY and SKILL bypasses synthesize directly from the shared multimodal description trunk via the LLM, instead of rendering already-extracted records. This realizes the "description → 3 sibling bypasses" model end-to-end.
`INDEX.md` stays a deterministic table of contents.

Opt-in and additive: the existing memorize/extract/category pipeline is untouched; synthesis only changes what `export_memory_files()` writes.

How it works
- `memu.memory_fs.MemorySynthesizer` (prompts in `memu.prompts.memory_fs`):
  - `MEMORY.md`: one LLM pass turns all per-source descriptions into a consolidated memory document (Profile / Preferences / Goals / Key Events).
  - `skill/<name>/SKILL.md`: one LLM pass extracts skills as a JSON array of `{name, body}`, each written as its own doc (kebab-case slug, collision-safe).
- `MemoryFileExporter.export(...)` gains optional `memory_body` / `skills` overrides; when present they replace the deterministic rendering. `INDEX.md` is always deterministic.
- Gated by `memory_files_config.synthesize` (default off) and `synthesis_llm_profile`. Diff/manifest, scoping, and per-service lock all carry over.

Test plan
- `tests/test_memory_fs_synthesis.py` (5): synthesizer parse/empty/helpers, exporter override path, full service wiring with a fake LLM (no network)
- `tests/test_memory_files.py` (7) still green
- 91 passed, 1 skipped; `ruff` + `mypy` clean on changed files

Notes / follow-ups
- `Resource.caption`. A richer persisted per-source description (full preprocessed text, not just the caption) would improve synthesis quality; deferred.
- `EntityCoordinator` to run off dirty descriptions only.

Made with Cursor
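For reviewers, the skill-extraction step under "How it works" (a JSON array of `{name, body}` written under kebab-case, collision-safe slugs) might look roughly like the sketch below. It is not taken from the PR: `to_slug` and `parse_skills` are hypothetical helpers, and the numeric-suffix scheme for collisions and the `"skill"` fallback slug are assumptions.

```python
import json
import re
from typing import Dict


def to_slug(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return slug or "skill"  # assumed fallback when nothing usable remains


def parse_skills(raw: str) -> Dict[str, str]:
    """Parse an LLM reply expected to be a JSON array of {name, body} objects.

    Malformed replies yield an empty mapping; malformed items are skipped.
    """
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    if not isinstance(items, list):
        return {}

    skills: Dict[str, str] = {}
    for item in items:
        if not isinstance(item, dict):
            continue
        name, body = item.get("name"), item.get("body")
        if not isinstance(name, str) or not isinstance(body, str) or not body.strip():
            continue
        base = to_slug(name)
        slug, suffix = base, 2
        # Assumed collision handling: append -2, -3, ... until the slug is free.
        while slug in skills:
            slug = f"{base}-{suffix}"
            suffix += 1
        skills[slug] = body
    return skills
```

Each key of the returned mapping would then correspond to one `skill/<name>/SKILL.md` file.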