synthesisengineering
diff --git a/‎.github/dependabot.yml‎
Lines changed: 62 additions & 0 deletions b/‎.github/dependabot.yml‎
Lines changed: 62 additions & 0 deletions
diff --git a/‎CHANGELOG.md‎
Lines changed: 273 additions & 0 deletions b/‎CHANGELOG.md‎
Lines changed: 273 additions & 0 deletions
diff --git a/‎README.md‎
Lines changed: 7 additions & 1 deletion b/‎README.md‎
Lines changed: 7 additions & 1 deletion
@@ -0,0 +1,62 @@
+# Dependabot configuration for Ragbot.
+#
+# Two ecosystems are scanned weekly: pip at the repo root for the Python
+# substrate, and npm at web/ for the Next.js frontend. The pip side has no
+# active ignores at present; the npm side carries narrow ignores for
+# transitive dev-only packages whose upstreams have not yet released a fix
+# carrying the patched version. Every ignore cites the manual triage doc
+# under docs/security/ so the rationale persists across reviewers.
+#
+# Triage policy: real-exposure alerts get pin-bumped on the next manual
+# pass; transitive-only alerts get overrides in web/package.json
+# (preferred) or ignores here (when an override is infeasible). The full
+# decision log lives at docs/security/dependabot-triage-2026-05.md.
+#
+# When new alerts arrive, do not add an ignore entry without first
+# verifying the override path is closed. Silent dismissal is forbidden.
+
+version: 2
+updates:
+  # --- Python substrate -----------------------------------------------------
+  - package-ecosystem: "pip"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+      day: "monday"
+      time: "09:00"
+      timezone: "America/New_York"
+    # Major-version bumps for non-security updates require human review.
+    # Security updates are exempt from this constraint and will be opened
+    # regardless of strategy.
+    versioning-strategy: "increase"
+    open-pull-requests-limit: 10
+    labels:
+      - "dependencies"
+      - "python"
+
+  # --- Web frontend ---------------------------------------------------------
+  - package-ecosystem: "npm"
+    directory: "/web"
+    schedule:
+      interval: "weekly"
+      day: "monday"
+      time: "09:00"
+      timezone: "America/New_York"
+    versioning-strategy: "increase"
+    open-pull-requests-limit: 10
+    labels:
+      - "dependencies"
+      - "javascript"
+    # Ignores are intentionally empty as of the 2026-05 triage. Every open
+    # alert at that time was fixed by either a pin bump (next 16.0.10 to
+    # 16.2.6) or a transitive override (flatted, minimatch, picomatch, ajv,
+    # brace-expansion, postcss — see web/package.json `overrides`). If a
+    # future scan surfaces an alert that cannot be resolved by an override,
+    # add an entry below citing docs/security/dependabot-triage-YYYY-MM.md.
+    #
+    # Example ignore template (do not enable without a triage entry):
+    # ignore:
+    #   - dependency-name: "example-pkg"
+    #     versions: ["1.x"]
+    #     # Rationale: transitive via foo-bar; upstream has no fix.
+    #     # See docs/security/dependabot-triage-YYYY-MM.md#example-pkg.
@@ -0,0 +1,273 @@
+# Changelog
+
+All notable changes to Ragbot are recorded here. The format follows
+[Keep a Changelog](https://keepachangelog.com/en/1.1.0/) loosely, and
+versioning follows [Semantic Versioning](https://semver.org/).
+
+For the prose narratives accompanying major releases, see
+[`docs/release-notes-v3.4.0.md`](docs/release-notes-v3.4.0.md) and
+the equivalents for prior versions when added.
+
+## v3.4.0 — 2026-05-14
+
+Ragbot becomes the conversational reference runtime of synthesis engineering.
+
+v3.4 is the next-major-features release for Ragbot. It moves the project
+from a polished 2024-paradigm chat-with-RAG product to a 2026-shaped
+conversational AI runtime: explicit agent loop, first-class MCP in both
+directions, an executable skills runtime, cross-workspace synthesis with
+visible confidentiality boundaries, durable memory beyond vector RAG,
+and the production-grade signals (observability, replay, eval harness,
+background tasks, scheduled routines) that make the architecture legible
+to engineering leadership. The synthesis-engineering positioning that the
+architecture had quietly named for a year is now visible in the README,
+the ragbot.ai homepage, and the in-product chrome.
+
+### Headliners
+
+**Agent loop runtime.** Ragbot is now an execution surface, not just a
+chat surface. A hand-rolled graph-state agent loop replaces the
+single-turn `prompt → retrieve → call LLM → return` path. The agent can
+decide between answering directly, dispatching retrieval, calling a
+tool, running a skill, or fanning out to sub-agents. Plan-and-Execute is
+the default compound-question pattern with explicit replanning on
+failure. Permission gates fail closed at the tool boundary. State
+checkpoints are durable and replayable. The chat-only no-tools mode
+remains available for users who want it.
+
+**First-class MCP — client and server.** As a client, Ragbot covers all
+six MCP primitives (tools, resources, prompts, Roots, Sampling,
+Elicitation) and supports MCP Tasks for long-running calls. As a server,
+Ragbot exposes workspace search, document retrieval, skill execution,
+and audit recent as MCP tools and resources so Claude Code, Cursor,
+ChatGPT desktop, and other MCP-aware clients can call into Ragbot's
+knowledge surface. OAuth 2.1 with Dynamic Client Registration is
+supported for remote servers; bearer-token auth and per-token
+`allowed_tools` glob filtering are supported on the server side.
+
+**Skills as runtime.** `SKILL.md` is now Ragbot's native extensibility
+format with progressive disclosure (names + descriptions in the system
+prompt, full body on selection, scripts and templates on tool call).
+Skills written for Claude Code, Codex CLI, Cursor, or Gemini CLI run on
+Ragbot without modification. Six starter skills ship in the box:
+`workspace-search-with-citations`, `draft-and-revise`,
+`fact-check-claims`, `summarize-document`, `agent-self-review`, and
+`cross-workspace-synthesis`. The `npx skills add
+synthesisengineering/synthesis-skills` install path turns the 32 public
+synthesis-skills into runnable capabilities, not just searchable
+content.
+
+**Synthesis ecosystem positioning and rebrand.** Ragbot is officially
+the reference runtime for the conversational interaction primitive
+inside synthesis engineering, with sibling reference implementations
+covering direct manipulation (synthesis-console), procedural execution
+(Ragenie), and the portable capability format (synthesis-skills). The
+README hero, the ragbot.ai homepage, the in-product footer chrome, and
+`llms.txt` all lead with the synthesis framing. Vermillion (#b8312f) is
+Ragbot's accent ink in the historical-inks palette of the synthesis
+family, joining Prussian blue (synthesis-engineering), iron-gall green
+(synthesis-coding), walnut brown (synthesis-writing), and lapis
+ultramarine (Ragenie).
+
+**Cross-workspace synthesis.** Multi-workspace chat is now first-class.
+Select 2+ workspaces in the UI; the agent sees a per-workspace context
+budget and per-workspace confidentiality tag. Per-workspace
+`routing.yaml` governs which models can be called for each workspace
+(local-only for `client-confidential`, frontier for `personal`, and so
+on). Every cross-workspace operation is logged to an append-only audit
+trail with timestamp, workspaces involved, tools called, and model
+used. The `cross-workspace-synthesize` starter skill walks the agent
+through per-workspace budget math, the four-level confidentiality
+strictness order with pairwise mix table, and the
+`[workspace:document_id]` citation format.
+
+### Production-grade signals
+
+- **Memory beyond RAG.** A three-layer memory stack: vector RAG over
+  pgvector (existing), entity-graph memory with provenance and temporal
+  validity (new `nodes` and `edges` tables), and session/working memory
+  (per-user persistent prefs and in-flight context). A consolidation
+  pass between sessions distills durable facts from the previous
+  session and writes them into the entity graph. Pluggable: Mem0 and
+  Letta integrations available as optional swap-ins behind the
+  abstraction.
+- **MCP server.** `synthesis_engine.mcp_server` exposes Ragbot via both
+  stdio (for desktop integrations) and HTTP/SSE (via
+  `StreamableHTTPSessionManager`) transports. Per-token `allowed_tools`
+  glob filtering and bearer-token auth configured via
+  `~/.synthesis/mcp-server.yaml`. Five exposed tools: `workspace_search`,
+  `workspace_search_multi` (confidentiality gate fires before retrieval),
+  `document_get`, `skill_run`, `agent_run_start`. Three exposed
+  resources: workspaces, skills, audit-recent.
+- **Replay CLI.** `ragbot agent replay <task_id>` deterministically
+  re-runs a session from any checkpoint, with `--show-trace` and
+  `--save-output` flags. `ragbot agent list-sessions` and `ragbot agent
+  checkpoints <task_id>` for inspection. A stable hash over
+  `{current_state, final_answer, step_results}` (timestamps excluded)
+  gates regression detection.
+- **Eval regressions.** `tests/evals/regressions/` captures canonical
+  bug shapes (sub-agent dispatch max-parallel, disabled sandbox
+  actionable error, permission deny blocks tool, cross-workspace
+  air-gapped isolation, replay determinism canary). `make eval` runs the
+  full eval suite with a scorecard renderer.
+- **Background tasks.** `synthesis_engine.tasks` provides a
+  `BackgroundTaskManager` with JSONL persistence at
+  `~/.synthesis/tasks/{id}.jsonl`, cooperative cancellation
+  (`TaskCancelled` raised at safe points; no force-kill), crash recovery
+  on startup, webhook delivery per-task, and three notifier adapters
+  (macOS via `osascript`, email via SMTP, Slack via MCP). A scheduler is
+  opt-in via `RAGBOT_SCHEDULER=1`, reading `~/.synthesis/schedules.yaml`.
+- **Keyboard shortcuts.** A coherent shortcut layer covers the 2026
+  expected interactions: ⌘K (model picker), ⌘J (workspace switch), ⌘/
+  (message history search), ⌘N (new chat), ⌘B (background current run),
+  ⌘. (cancel), ⌘? (help overlay with focus trap and Escape close).
+  Platform-aware key matching (Meta on macOS, Ctrl elsewhere); strict
+  exact-modifier matching so ⌘⌥K doesn't accidentally fire ⌘K.
+- **Observability.** OpenTelemetry traces by default with semantic
+  GenAI attributes on every model call, retrieval step, guardrail
+  check, and tool dispatch. `OTEL_EXPORTER_OTLP_ENDPOINT` ships traces
+  to Phoenix, Langfuse, Datadog, or Honeycomb. Prometheus exposition at
+  `/api/metrics` and cache-stats JSON at `/api/metrics/cache`. Prompt
+  caching with `cache_control` annotations on the static system-prompt
+  prefix.
+
+### Open-weights additions
+
+`engines.yaml` adds the four open-weights families that became serious
+local agent defaults in 2026:
+
+- **Llama 4** (Meta) — sizes documented in the new sizing matrix at
+  [`docs/open-weights-sizing.md`](docs/open-weights-sizing.md).
+- **Qwen3** (Alibaba) — the practical local agent default; 27B size is
+  the recommended balance of capability and footprint on Apple Silicon
+  with the MLX backend.
+- **DeepSeek-V3** — strong reasoning at competitive sizes.
+- **Mistral Large** — Mistral's open-weights flagship.
+
+Updated **Gemma 4** entries with notes on the Ollama 0.19 MLX backend
+(~2x decode speedup on Apple Silicon). The full sizing matrix maps
+model families to recommended hardware tiers (laptop, prosumer desktop,
+Mac Studio-class, workstation), VRAM/unified-memory requirements, and
+target inference profiles.
+
+### Breaking changes
+
+v3.4 is the next-major-features release. Breaking changes are
+intentional and visible. If you are upgrading from v3.3, the items
+below require migration steps.
+
+- **`synthesis_engine` is now a public substrate library.** The runtime
+  code under `src/synthesis_engine/` is the supported import surface for
+  building synthesis-engineering products on top of Ragbot's primitives.
+  `src/ragbot/` is now ragbot-runtime-specific code only. Imports under
+  `from ragbot.X` for substrate types have moved to `from
+  synthesis_engine.X`. The `RagbotError` exception base class has been
+  renamed to `SynthesisError` across the substrate; all five subclasses
+  are renamed correspondingly.
+- **`routing.yaml` is a new per-workspace convention.** Each workspace
+  may declare a `routing.yaml` at its root with
+  `allowed_models`/`denied_models` globs, a `confidentiality` tag
+  (`public` / `personal` / `client-confidential` / `air-gapped`), and a
+  `fallback_behavior` (`DENY` / `DOWNGRADE_TO_LOCAL` / `WARN`). The
+  cross-workspace agent runtime enforces the strictest applicable
+  policy. Workspaces with no `routing.yaml` default to `personal` with
+  no model restrictions, preserving prior behavior.
+- **`identity.yaml` is a new convention at `~/.synthesis/`.** Declares
+  `personal_workspaces` (the list of workspaces treated as universal
+  for skill scoping) and `personal_remote_patterns` (used by the
+  synthesis-git-hooks policy to classify a repo as personal vs strict).
+  Required for the new workspace-scoped skills discovery to know which
+  workspaces are personal vs scoped.
+- **Agent loop API signature.** `AgentLoop.run()` accepts new kwargs:
+  `workspaces` (list, not single value), `workspace_roots`,
+  `routing_enforced`, and `cross_workspace_budget_tokens`. The legacy
+  single-workspace shape continues to work — a single-element list
+  preserves prior behavior — but consumers calling the agent loop
+  directly should update to the multi-workspace shape.
+
+### Migration notes
+
+If you are upgrading from v3.3:
+
+1. **Update imports.** Replace `from ragbot.X` substrate imports with
+   `from synthesis_engine.X`. The same applies to `RagbotError` →
+   `SynthesisError`.
+2. **Create `~/.synthesis/identity.yaml`.** A minimal example:
+   ```yaml
+   personal_workspaces:
+     - acme-user
+   personal_remote_patterns:
+     - "github.com:acme-user/"
+   ```
+   The `synthesis-git-hooks` skill in `synthesis-skills` includes a
+   commented example config. Install via `npx skills add
+   synthesisengineering/synthesis-skills --skill synthesis-git-hooks`.
+3. **Add `routing.yaml` to confidential workspaces.** A minimal example
+   for a `client-confidential` workspace:
+   ```yaml
+   confidentiality: client-confidential
+   allowed_models:
+     - "ollama/*"
+     - "ollama_chat/*"
+   denied_models:
+     - "claude-*"
+     - "gpt-*"
+     - "gemini-*"
+   fallback_behavior: DENY
+   ```
+4. **Install the synthesis-git-hooks engine.** Replace any repo-local
+   `.githooks/pre-commit` with the universal engine at
+   `~/.synthesis/git-hooks/` (installed via the synthesis-git-hooks
+   skill above). Configure `~/.synthesis/git-hook-config.yaml` with your
+   client names, internal URLs, and personal-remote patterns.
+5. **Install the anti-shortcut catalog.** `~/.synthesis/anti-shortcut-catalog.yaml`
+   is consumed by the synthesis-anti-shortcuts skill and by the
+   pre-commit hooks to detect lazy-shortcut costume vocabulary in code
+   and prose. Install via `npx skills add
+   synthesisengineering/synthesis-skills --skill synthesis-anti-shortcuts`.
+
+### Acknowledgments
+
+Thanks to everyone who reviewed proposals, surfaced issues, and pushed
+back on lazy shortcuts during the v3.4 development cycle. The lessons
+distilled during development inform the
+[synthesis-anti-shortcuts](https://github.com/synthesisengineering/synthesis-skills/tree/main/synthesis-anti-shortcuts)
+skill, which any SKILL.md-compatible AI coding agent can install.
+
+---
+
+## v3.3.0 — 2026-05
+
+Local Gemma 4 via Ollama as first-class. Redesigned single-rich-dropdown
+model picker with Pinned/Recent, type-ahead search, `⌘K` shortcut, and
+capability badges. User preferences API persisting to
+`~/.synthesis/ragbot.yaml`. Bug fix for non-flagship GPT-5.x and Gemini
+returning empty content on long-context RAG calls. LiteLLM pinned
+`>=1.83.0` to exclude the March-2026 supply-chain incident range.
+
+## v3.2.0 — 2026-04
+
+Demo mode (`RAGBOT_DEMO=1`) with hard-isolated discovery and a bundled
+sample workspace and skill. `/health` and `/api/config` report
+`demo_mode`. Twenty new tests locking in the discovery isolation
+contract.
+
+## v3.1.0 — 2026-04
+
+LLM backend abstraction (`RAGBOT_LLM_BACKEND={litellm|direct}`). Web UI
+controls for reasoning effort and the cross-workspace skills toggle.
+`/api/chat` accepts `thinking_effort` and `additional_workspaces`.
+
+## v3.0.0 — 2026-04
+
+Pgvector by default with native FTS via tsvector + GIN. Agent Skills as
+first-class content with `ragbot skills {list,info,index}` CLI.
+Workspace-rooted layout discovered across `~/workspaces/*/ai-knowledge-*`
+and via `~/.synthesis/console.yaml`. Reasoning-effort wiring for Claude
+4.x adaptive thinking, GPT-5.5 reasoning, and Gemini 3.x thinking
+levels.
+
+## Earlier versions
+
+Pre-v3.0 history is recorded in commit messages and the README "What's
+New" sections. Older releases predate this CHANGELOG file.
@@ -487,10 +487,16 @@ inherits_from:
 Supported AI Models
 -------------------
 
-Ragbot supports models from Anthropic (Claude), OpenAI (GPT and reasoning models), Google (Gemini), and local models via Ollama (Gemma 4 family ships out of the box). The authoritative list — model IDs, context windows, thinking-mode support, tier badges, and defaults — lives in [engines.yaml](engines.yaml). v3.3's redesigned model picker reads from the same file at runtime, so what you see in the UI matches what is configured in the repo.
+Ragbot supports models from Anthropic (Claude), OpenAI (GPT and reasoning models), Google (Gemini), and local open-weights models via Ollama. The authoritative list — model IDs, context windows, thinking-mode support, tier badges, and defaults — lives in [engines.yaml](engines.yaml). v3.3's redesigned model picker reads from the same file at runtime, so what you see in the UI matches what is configured in the repo.
 
 Adding or updating models is an `engines.yaml` change, not a code change. See the [v3.3 release notes](#whats-new-in-v33) for the local-model integration details.
 
+### Open-weights model support
+
+Ragbot v3.4 ships expanded local-model coverage out of the box. The `ollama` engine in [engines.yaml](engines.yaml) includes Gemma 4 (E4B, 26B MoE, 31B Dense), Llama 4 (Scout, Maverick), Qwen3.6 (27B Dense, 35B-A3B MoE), DeepSeek V3.2, and Mistral (Small 4, Medium 3.5, Large 3). Each entry carries MLX-backend notes, license info, recommended quantization tags, and a real-world parameter count.
+
+**Choosing a model for your Mac.** Open-weights inference on Apple Silicon is bound by unified memory and bandwidth. A 16 GB Mac mini runs Gemma 4 E4B; a Mac Studio with 256 GB unified memory runs DeepSeek V3.2 and Mistral Large 3. The [sizing matrix at `docs/sizing-matrix.md`](docs/sizing-matrix.md) maps every model in `engines.yaml` to each Mac hardware profile (Mac mini 16/32/64 GB, MacBook Air 24/32 GB, MacBook Pro M4 Pro 48 / M5 Max 128 GB, Mac Studio M3 Ultra 192/256 GB), with per-model memory footprints at FP16 and Q4, expected MLX tokens-per-second, and `comfortable / tight / Q4-only / won't fit` verdicts. Start there if you're trying to pick a model for your Mac, or pick a Mac for a model.
+
 Installation, Configuration, and Personalization
 ------------------------------------------------
 Read the [installation guide](INSTALL.md) and the [configuration and personaliation guide](CONFIGURE.md).