Commit 4b593ba

Phase 5 — open-weights, Dependabot, release prep, integration cleanup
1 parent 9e6c49c commit 4b593ba

31 files changed

Lines changed: 5981 additions & 1156 deletions

.github/dependabot.yml

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
1+
# Dependabot configuration for Ragbot.
2+
#
3+
# Two ecosystems are scanned weekly: pip at the repo root for the Python
4+
# substrate, and npm at web/ for the Next.js frontend. The pip side has no
5+
# active ignores at present, and neither does the npm side: ignores are
6+
# reserved for transitive dev-only packages whose upstreams have not yet
7+
# released a fix. Every ignore must cite the manual triage doc
8+
# under docs/security/ so the rationale persists across reviewers.
9+
#
10+
# Triage policy: real-exposure alerts get pin-bumped on the next manual
11+
# pass; transitive-only alerts get overrides in web/package.json
12+
# (preferred) or ignores here (when an override is infeasible). The full
13+
# decision log lives at docs/security/dependabot-triage-2026-05.md.
14+
#
15+
# When new alerts arrive, do not add an ignore entry without first
16+
# verifying the override path is closed. Silent dismissal is forbidden.
17+
18+
version: 2
19+
updates:
20+
# --- Python substrate -----------------------------------------------------
21+
- package-ecosystem: "pip"
22+
directory: "/"
23+
schedule:
24+
interval: "weekly"
25+
day: "monday"
26+
time: "09:00"
27+
timezone: "America/New_York"
28+
# Major-version bumps for non-security updates require human review.
29+
# Security updates are exempt from this constraint and will be opened
30+
# regardless of strategy.
31+
versioning-strategy: "increase"
32+
open-pull-requests-limit: 10
33+
labels:
34+
- "dependencies"
35+
- "python"
36+
37+
# --- Web frontend ---------------------------------------------------------
38+
- package-ecosystem: "npm"
39+
directory: "/web"
40+
schedule:
41+
interval: "weekly"
42+
day: "monday"
43+
time: "09:00"
44+
timezone: "America/New_York"
45+
versioning-strategy: "increase"
46+
open-pull-requests-limit: 10
47+
labels:
48+
- "dependencies"
49+
- "javascript"
50+
# Ignores are intentionally empty as of the 2026-05 triage. Every open
51+
# alert at that time was fixed by either a pin bump (next 16.0.10 to
52+
# 16.2.6) or a transitive override (flatted, minimatch, picomatch, ajv,
53+
# brace-expansion, postcss — see web/package.json `overrides`). If a
54+
# future scan surfaces an alert that cannot be resolved by an override,
55+
# add an entry below citing docs/security/dependabot-triage-YYYY-MM.md.
56+
#
57+
# Example ignore template (do not enable without a triage entry):
58+
# ignore:
59+
# - dependency-name: "example-pkg"
60+
# versions: ["1.x"]
61+
# # Rationale: transitive via foo-bar; upstream has no fix.
62+
# # See docs/security/dependabot-triage-YYYY-MM.md#example-pkg.

CHANGELOG.md

Lines changed: 273 additions & 0 deletions
@@ -0,0 +1,273 @@
1+
# Changelog
2+
3+
All notable changes to Ragbot are recorded here. The format follows
4+
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/) loosely, and
5+
versioning follows [Semantic Versioning](https://semver.org/).
6+
7+
For the prose narratives accompanying major releases, see
8+
[`docs/release-notes-v3.4.0.md`](docs/release-notes-v3.4.0.md) and
9+
the equivalents for prior versions when added.
10+
11+
## v3.4.0 — 2026-05-14
12+
13+
Ragbot becomes the conversational reference runtime of synthesis engineering.
14+
15+
v3.4 is the next-major-features release for Ragbot. It moves the project
16+
from a polished 2024-paradigm chat-with-RAG product to a 2026-shaped
17+
conversational AI runtime: explicit agent loop, first-class MCP in both
18+
directions, an executable skills runtime, cross-workspace synthesis with
19+
visible confidentiality boundaries, durable memory beyond vector RAG,
20+
and the production-grade signals (observability, replay, eval harness,
21+
background tasks, scheduled routines) that make the architecture legible
22+
to engineering leadership. The synthesis-engineering positioning that the
23+
architecture had quietly named for a year is now visible in the README,
24+
the ragbot.ai homepage, and the in-product chrome.
25+
26+
### Headliners
27+
28+
**Agent loop runtime.** Ragbot is now an execution surface, not just a
29+
chat surface. A hand-rolled graph-state agent loop replaces the
30+
single-turn `prompt → retrieve → call LLM → return` path. The agent can
31+
decide between answering directly, dispatching retrieval, calling a
32+
tool, running a skill, or fanning out to sub-agents. Plan-and-Execute is
33+
the default compound-question pattern with explicit replanning on
34+
failure. Permission gates fail closed at the tool boundary. State
35+
checkpoints are durable and replayable. The chat-only no-tools mode
36+
remains available for users who want it.
37+
38+
**First-class MCP — client and server.** As a client, Ragbot covers all
39+
six MCP primitives (tools, resources, prompts, Roots, Sampling,
40+
Elicitation) and supports MCP Tasks for long-running calls. As a server,
41+
Ragbot exposes workspace search, document retrieval, skill execution,
42+
and recent audit entries as MCP tools and resources so Claude Code, Cursor,
43+
ChatGPT desktop, and other MCP-aware clients can call into Ragbot's
44+
knowledge surface. OAuth 2.1 with Dynamic Client Registration is
45+
supported for remote servers; bearer-token auth and per-token
46+
`allowed_tools` glob filtering are supported on the server side.
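
The per-token `allowed_tools` filtering can be pictured with a minimal sketch. It assumes shell-style globs matched with Python's `fnmatch` and a fail-closed default (an empty list exposes nothing); the helper names `tool_allowed` and `filter_tools` are hypothetical and are not taken from `synthesis_engine.mcp_server`.

```python
from fnmatch import fnmatchcase

# Tool names as listed for the MCP server in this release.
EXPOSED_TOOLS = [
    "workspace_search",
    "workspace_search_multi",
    "document_get",
    "skill_run",
    "agent_run_start",
]


def tool_allowed(tool_name: str, allowed_tools: list[str]) -> bool:
    """Return True if tool_name matches at least one allowed glob.

    Assumption: an empty allow-list matches nothing (fail closed).
    """
    return any(fnmatchcase(tool_name, pattern) for pattern in allowed_tools)


def filter_tools(exposed: list[str], allowed_tools: list[str]) -> list[str]:
    """Keep only the exposed tools that a token's globs permit."""
    return [name for name in exposed if tool_allowed(name, allowed_tools)]


# Hypothetical read-only token: search and fetch, but no execution tools.
read_only_globs = ["workspace_*", "document_get"]
visible = filter_tools(EXPOSED_TOOLS, read_only_globs)
print(visible)
```

With these globs the token sees the two search tools and `document_get`, while `skill_run` and `agent_run_start` stay hidden.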
47+
48+
**Skills as runtime.** `SKILL.md` is now Ragbot's native extensibility
49+
format with progressive disclosure (names + descriptions in the system
50+
prompt, full body on selection, scripts and templates on tool call).
51+
Skills written for Claude Code, Codex CLI, Cursor, or Gemini CLI run on
52+
Ragbot without modification. Six starter skills ship in the box:
53+
`workspace-search-with-citations`, `draft-and-revise`,
54+
`fact-check-claims`, `summarize-document`, `agent-self-review`, and
55+
`cross-workspace-synthesis`. The `npx skills add
56+
synthesisengineering/synthesis-skills` install path turns the 32 public
57+
synthesis-skills into runnable capabilities, not just searchable
58+
content.
59+
60+
**Synthesis ecosystem positioning and rebrand.** Ragbot is officially
61+
the reference runtime for the conversational interaction primitive
62+
inside synthesis engineering, with sibling reference implementations
63+
covering direct manipulation (synthesis-console), procedural execution
64+
(Ragenie), and the portable capability format (synthesis-skills). The
65+
README hero, the ragbot.ai homepage, the in-product footer chrome, and
66+
`llms.txt` all lead with the synthesis framing. Vermillion (#b8312f) is
67+
Ragbot's accent ink in the historical-inks palette of the synthesis
68+
family, joining Prussian blue (synthesis-engineering), iron-gall green
69+
(synthesis-coding), walnut brown (synthesis-writing), and lapis
70+
ultramarine (Ragenie).
71+
72+
**Cross-workspace synthesis.** Multi-workspace chat is now first-class.
73+
Select 2+ workspaces in the UI; the agent sees a per-workspace context
74+
budget and per-workspace confidentiality tag. Per-workspace
75+
`routing.yaml` governs which models can be called for each workspace
76+
(local-only for `client-confidential`, frontier for `personal`, and so
77+
on). Every cross-workspace operation is logged to an append-only audit
78+
trail with timestamp, workspaces involved, tools called, and model
79+
used. The `cross-workspace-synthesis` starter skill walks the agent
80+
through per-workspace budget math, the four-level confidentiality
81+
strictness order with pairwise mix table, and the
82+
`[workspace:document_id]` citation format.
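
As an illustration of the strictness order and the citation format, here is a minimal sketch. It assumes the four tags are ordered from least to most strict as `public`, `personal`, `client-confidential`, `air-gapped` (the order in which they are listed for `routing.yaml`); the function names are hypothetical, and the pairwise mix table itself is not reproduced.

```python
# Assumed order, least strict first. Unknown tags raise ValueError (fail closed).
STRICTNESS = ["public", "personal", "client-confidential", "air-gapped"]


def strictest(tags: list[str]) -> str:
    """Return the most restrictive confidentiality tag among the workspaces."""
    if not tags:
        raise ValueError("at least one workspace tag is required")
    return max(tags, key=STRICTNESS.index)


def format_citation(workspace: str, document_id: str) -> str:
    """Render a citation in the [workspace:document_id] format."""
    return f"[{workspace}:{document_id}]"


# Hypothetical workspace and document names, for illustration only.
effective = strictest(["personal", "client-confidential"])
citation = format_citation("acme-user", "notes-001")
print(effective, citation)
```

Mixing a `personal` and a `client-confidential` workspace yields `client-confidential` as the effective level under this assumption.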
83+
84+
### Production-grade signals
85+
86+
- **Memory beyond RAG.** A three-layer memory stack: vector RAG over
87+
pgvector (existing), entity-graph memory with provenance and temporal
88+
validity (new `nodes` and `edges` tables), and session/working memory
89+
(per-user persistent prefs and in-flight context). A consolidation
90+
pass between sessions distills durable facts from the previous
91+
session and writes them into the entity graph. Pluggable: Mem0 and
92+
Letta integrations available as optional swap-ins behind the
93+
abstraction.
94+
- **MCP server.** `synthesis_engine.mcp_server` exposes Ragbot via both
95+
stdio (for desktop integrations) and HTTP/SSE (via
96+
`StreamableHTTPSessionManager`) transports. Per-token `allowed_tools`
97+
glob filtering and bearer-token auth configured via
98+
`~/.synthesis/mcp-server.yaml`. Five exposed tools: `workspace_search`,
99+
`workspace_search_multi` (confidentiality gate fires before retrieval),
100+
`document_get`, `skill_run`, `agent_run_start`. Three exposed
101+
resources: workspaces, skills, audit-recent.
102+
- **Replay CLI.** `ragbot agent replay <task_id>` deterministically
103+
re-runs a session from any checkpoint, with `--show-trace` and
104+
`--save-output` flags. `ragbot agent list-sessions` and `ragbot agent
105+
checkpoints <task_id>` for inspection. A stable hash over
106+
`{current_state, final_answer, step_results}` (timestamps excluded)
107+
gates regression detection.
108+
- **Eval regressions.** `tests/evals/regressions/` captures canonical
109+
bug shapes (sub-agent dispatch max-parallel, disabled sandbox
110+
actionable error, permission deny blocks tool, cross-workspace
111+
air-gapped isolation, replay determinism canary). `make eval` runs the
112+
full eval suite with a scorecard renderer.
113+
- **Background tasks.** `synthesis_engine.tasks` provides a
114+
`BackgroundTaskManager` with JSONL persistence at
115+
`~/.synthesis/tasks/{id}.jsonl`, cooperative cancellation
116+
(`TaskCancelled` raised at safe points; no force-kill), crash recovery
117+
on startup, webhook delivery per-task, and three notifier adapters
118+
(macOS via `osascript`, email via SMTP, Slack via MCP). A scheduler is
119+
opt-in via `RAGBOT_SCHEDULER=1`, reading `~/.synthesis/schedules.yaml`.
120+
- **Keyboard shortcuts.** A coherent shortcut layer covers the 2026
121+
expected interactions: ⌘K (model picker), ⌘J (workspace switch), ⌘/
122+
(message history search), ⌘N (new chat), ⌘B (background current run),
123+
⌘. (cancel), ⌘? (help overlay with focus trap and Escape close).
124+
Platform-aware key matching (Meta on macOS, Ctrl elsewhere); strict
125+
exact-modifier matching so ⌘⌥K doesn't accidentally fire ⌘K.
126+
- **Observability.** OpenTelemetry traces by default with semantic
127+
GenAI attributes on every model call, retrieval step, guardrail
128+
check, and tool dispatch. `OTEL_EXPORTER_OTLP_ENDPOINT` ships traces
129+
to Phoenix, Langfuse, Datadog, or Honeycomb. Prometheus exposition at
130+
`/api/metrics` and cache-stats JSON at `/api/metrics/cache`. Prompt
131+
caching with `cache_control` annotations on the static system-prompt
132+
prefix.
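
The stable hash mentioned under the Replay CLI can be sketched as follows. This is an illustration under stated assumptions, not the shipped implementation: it assumes checkpoints are JSON-serializable dicts, that timestamps live under the hypothetical keys in `VOLATILE_KEYS`, and that SHA-256 over canonical JSON is an acceptable digest.

```python
import hashlib
import json

# The three fields the changelog names as hash inputs.
HASHED_KEYS = ("current_state", "final_answer", "step_results")
# Hypothetical names for timestamp-like fields that must not affect the hash.
VOLATILE_KEYS = {"timestamp", "started_at", "finished_at"}


def _strip_volatile(value):
    """Recursively drop timestamp-like keys from nested dicts and lists."""
    if isinstance(value, dict):
        return {k: _strip_volatile(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [_strip_volatile(v) for v in value]
    return value


def replay_hash(checkpoint: dict) -> str:
    """Hash only the deterministic parts of a checkpoint."""
    payload = {k: _strip_volatile(checkpoint.get(k)) for k in HASHED_KEYS}
    # sort_keys plus fixed separators gives a canonical byte string.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = {
    "current_state": "done",
    "final_answer": "42",
    "step_results": [{"step": 1, "output": "x", "timestamp": "2026-05-14T09:00:00Z"}],
    "timestamp": "2026-05-14T09:00:05Z",
}
replayed = {
    "timestamp": "2026-05-15T10:30:00Z",
    "step_results": [{"timestamp": "2026-05-15T10:29:58Z", "output": "x", "step": 1}],
    "final_answer": "42",
    "current_state": "done",
}
print(replay_hash(original) == replay_hash(replayed))
```

Two runs that differ only in timestamps and key order hash identically, so a changed hash points at a real behavioral difference.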
133+
134+
### Open-weights additions
135+
136+
`engines.yaml` adds the four open-weights families that became serious
137+
local agent defaults in 2026:
138+
139+
- **Llama 4** (Meta) — sizes documented in the new sizing matrix at
140+
[`docs/open-weights-sizing.md`](docs/open-weights-sizing.md).
141+
- **Qwen3** (Alibaba) — the practical local agent default; 27B size is
142+
the recommended balance of capability and footprint on Apple Silicon
143+
with the MLX backend.
144+
- **DeepSeek-V3** — strong reasoning at competitive sizes.
145+
- **Mistral Large** — Mistral's open-weights flagship.
146+
147+
Updated **Gemma 4** entries with notes on the Ollama 0.19 MLX backend
148+
(~2x decode speedup on Apple Silicon). The full sizing matrix maps
149+
model families to recommended hardware tiers (laptop, prosumer desktop,
150+
Mac Studio-class, workstation), VRAM/unified-memory requirements, and
151+
target inference profiles.
152+
153+
### Breaking changes
154+
155+
v3.4 is the next-major-features release. Breaking changes are
156+
intentional and visible. If you are upgrading from v3.3, the items
157+
below require migration steps.
158+
159+
- **`synthesis_engine` is now a public substrate library.** The runtime
160+
code under `src/synthesis_engine/` is the supported import surface for
161+
building synthesis-engineering products on top of Ragbot's primitives.
162+
`src/ragbot/` is now ragbot-runtime-specific code only. Imports under
163+
`from ragbot.X` for substrate types have moved to `from
164+
synthesis_engine.X`. The `RagbotError` exception base class has been
165+
renamed to `SynthesisError` across the substrate; all five subclasses
166+
are renamed correspondingly.
167+
- **`routing.yaml` is a new per-workspace convention.** Each workspace
168+
may declare a `routing.yaml` at its root with
169+
`allowed_models`/`denied_models` globs, a `confidentiality` tag
170+
(`public` / `personal` / `client-confidential` / `air-gapped`), and a
171+
`fallback_behavior` (`DENY` / `DOWNGRADE_TO_LOCAL` / `WARN`). The
172+
cross-workspace agent runtime enforces the strictest applicable
173+
policy. Workspaces with no `routing.yaml` default to `personal` with
174+
no model restrictions, preserving prior behavior.
175+
- **`identity.yaml` is a new convention at `~/.synthesis/`.** Declares
176+
`personal_workspaces` (the list of workspaces treated as universal
177+
for skill scoping) and `personal_remote_patterns` (used by the
178+
synthesis-git-hooks policy to classify a repo as personal vs strict).
179+
Required for the new workspace-scoped skills discovery to know which
180+
workspaces are personal vs scoped.
181+
- **Agent loop API signature.** `AgentLoop.run()` accepts new kwargs:
182+
`workspaces` (list, not single value), `workspace_roots`,
183+
`routing_enforced`, and `cross_workspace_budget_tokens`. The legacy
184+
single-workspace shape continues to work — a single-element list
185+
preserves prior behavior — but consumers calling the agent loop
186+
directly should update to the multi-workspace shape.
187+
188+
### Migration notes
189+
190+
If you are upgrading from v3.3:
191+
192+
1. **Update imports.** Replace `from ragbot.X` substrate imports with
193+
`from synthesis_engine.X`. Likewise, rename `RagbotError` to
194+
`SynthesisError`, along with its five subclasses.
195+
2. **Create `~/.synthesis/identity.yaml`.** A minimal example:
196+
```yaml
197+
personal_workspaces:
198+
- acme-user
199+
personal_remote_patterns:
200+
- "github.com:acme-user/"
201+
```
202+
The `synthesis-git-hooks` skill in `synthesis-skills` includes a
203+
commented example config. Install via `npx skills add
204+
synthesisengineering/synthesis-skills --skill synthesis-git-hooks`.
205+
3. **Add `routing.yaml` to confidential workspaces.** A minimal example
206+
for a `client-confidential` workspace:
207+
```yaml
208+
confidentiality: client-confidential
209+
allowed_models:
210+
- "ollama/*"
211+
- "ollama_chat/*"
212+
denied_models:
213+
- "claude-*"
214+
- "gpt-*"
215+
- "gemini-*"
216+
fallback_behavior: DENY
217+
```
218+
4. **Install the synthesis-git-hooks engine.** Replace any repo-local
219+
`.githooks/pre-commit` with the universal engine at
220+
`~/.synthesis/git-hooks/` (installed via the synthesis-git-hooks
221+
skill above). Configure `~/.synthesis/git-hook-config.yaml` with your
222+
client names, internal URLs, and personal-remote patterns.
223+
5. **Install the anti-shortcut catalog.** `~/.synthesis/anti-shortcut-catalog.yaml`
224+
is consumed by the synthesis-anti-shortcuts skill and by the
225+
pre-commit hooks to detect lazy-shortcut costume vocabulary in code
226+
and prose. Install via `npx skills add
227+
synthesisengineering/synthesis-skills --skill synthesis-anti-shortcuts`.
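
To make the `routing.yaml` example in step 3 concrete, the sketch below checks a model ID against a policy. It rests on assumptions: globs are shell-style and matched with `fnmatch`, a deny match wins over an allow match, and an empty allow-list means no restriction (mirroring the documented default for workspaces without a `routing.yaml`). `RoutingPolicy` and `model_permitted` are hypothetical names, and the handling of `fallback_behavior` is left out.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class RoutingPolicy:
    """In-memory stand-in for a parsed routing.yaml (hypothetical type)."""
    confidentiality: str = "personal"
    allowed_models: list[str] = field(default_factory=list)
    denied_models: list[str] = field(default_factory=list)
    fallback_behavior: str = "DENY"  # stored only; not interpreted in this sketch


def model_permitted(policy: RoutingPolicy, model_id: str) -> bool:
    """Assumed semantics: deny globs win, then allow globs are consulted."""
    if any(fnmatchcase(model_id, p) for p in policy.denied_models):
        return False
    if not policy.allowed_models:
        return True  # no allow-list means no model restriction
    return any(fnmatchcase(model_id, p) for p in policy.allowed_models)


# Values copied from the client-confidential example above.
confidential = RoutingPolicy(
    confidentiality="client-confidential",
    allowed_models=["ollama/*", "ollama_chat/*"],
    denied_models=["claude-*", "gpt-*", "gemini-*"],
    fallback_behavior="DENY",
)

# Model IDs below are hypothetical and used only to exercise the globs.
print(model_permitted(confidential, "ollama/local-model"))
print(model_permitted(confidential, "claude-example"))
```

Under these assumptions a local `ollama/` model passes, a `claude-` model is refused, and any model matching neither list is also refused because the allow-list is non-empty.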
228+
229+
### Acknowledgments
230+
231+
Thanks to everyone who reviewed proposals, surfaced issues, and pushed
232+
back on lazy shortcuts during the v3.4 development cycle. The lessons
233+
distilled during development inform the
234+
[synthesis-anti-shortcuts](https://github.com/synthesisengineering/synthesis-skills/tree/main/synthesis-anti-shortcuts)
235+
skill, which any SKILL.md-compatible AI coding agent can install.
236+
237+
---
238+
239+
## v3.3.0 — 2026-05
240+
241+
Local Gemma 4 via Ollama as first-class. Redesigned single-rich-dropdown
242+
model picker with Pinned/Recent, type-ahead search, `⌘K` shortcut, and
243+
capability badges. User preferences API persisting to
244+
`~/.synthesis/ragbot.yaml`. Bug fix for non-flagship GPT-5.x and Gemini
245+
returning empty content on long-context RAG calls. LiteLLM pinned
246+
`>=1.83.0` to exclude the March-2026 supply-chain incident range.
247+
248+
## v3.2.0 — 2026-04
249+
250+
Demo mode (`RAGBOT_DEMO=1`) with hard-isolated discovery and a bundled
251+
sample workspace and skill. `/health` and `/api/config` report
252+
`demo_mode`. Twenty new tests locking in the discovery isolation
253+
contract.
254+
255+
## v3.1.0 — 2026-04
256+
257+
LLM backend abstraction (`RAGBOT_LLM_BACKEND={litellm|direct}`). Web UI
258+
controls for reasoning effort and the cross-workspace skills toggle.
259+
`/api/chat` accepts `thinking_effort` and `additional_workspaces`.
260+
261+
## v3.0.0 — 2026-04
262+
263+
Pgvector by default with native FTS via tsvector + GIN. Agent Skills as
264+
first-class content with `ragbot skills {list,info,index}` CLI.
265+
Workspace-rooted layout discovered across `~/workspaces/*/ai-knowledge-*`
266+
and via `~/.synthesis/console.yaml`. Reasoning-effort wiring for Claude
267+
4.x adaptive thinking, GPT-5.5 reasoning, and Gemini 3.x thinking
268+
levels.
269+
270+
## Earlier versions
271+
272+
Pre-v3.0 history is recorded in commit messages and the README "What's
273+
New" sections. Older releases predate this CHANGELOG file.

README.md

Lines changed: 7 additions & 1 deletion
@@ -487,10 +487,16 @@ inherits_from:
487487
Supported AI Models
488488
-------------------
489489

490-
Ragbot supports models from Anthropic (Claude), OpenAI (GPT and reasoning models), Google (Gemini), and local models via Ollama (Gemma 4 family ships out of the box). The authoritative list — model IDs, context windows, thinking-mode support, tier badges, and defaults — lives in [engines.yaml](engines.yaml). v3.3's redesigned model picker reads from the same file at runtime, so what you see in the UI matches what is configured in the repo.
490+
Ragbot supports models from Anthropic (Claude), OpenAI (GPT and reasoning models), Google (Gemini), and local open-weights models via Ollama. The authoritative list — model IDs, context windows, thinking-mode support, tier badges, and defaults — lives in [engines.yaml](engines.yaml). v3.3's redesigned model picker reads from the same file at runtime, so what you see in the UI matches what is configured in the repo.
491491

492492
Adding or updating models is an `engines.yaml` change, not a code change. See the [v3.3 release notes](#whats-new-in-v33) for the local-model integration details.
493493

494+
### Open-weights model support
495+
496+
Ragbot v3.4 ships expanded local-model coverage out of the box. The `ollama` engine in [engines.yaml](engines.yaml) includes Gemma 4 (E4B, 26B MoE, 31B Dense), Llama 4 (Scout, Maverick), Qwen3.6 (27B Dense, 35B-A3B MoE), DeepSeek V3.2, and Mistral (Small 4, Medium 3.5, Large 3). Each entry carries MLX-backend notes, license info, recommended quantization tags, and a real-world parameter count.
497+
498+
**Choosing a model for your Mac.** Open-weights inference on Apple Silicon is bound by unified memory and bandwidth. A 16 GB Mac mini runs Gemma 4 E4B; a Mac Studio with 256 GB unified memory runs DeepSeek V3.2 and Mistral Large 3. The [sizing matrix at `docs/sizing-matrix.md`](docs/sizing-matrix.md) maps every model in `engines.yaml` to each Mac hardware profile (Mac mini 16/32/64 GB, MacBook Air 24/32 GB, MacBook Pro M4 Pro 48 / M5 Max 128 GB, Mac Studio M3 Ultra 192/256 GB), with per-model memory footprints at FP16 and Q4, expected MLX tokens-per-second, and `comfortable / tight / Q4-only / won't fit` verdicts. Start there if you're trying to pick a model for your Mac, or pick a Mac for a model.
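
For a rough sense of the FP16 and Q4 footprints the matrix refers to, the arithmetic can be sketched as below. The figures are weights only and rest on assumptions: 2 bytes per parameter at FP16, a nominal 0.5 bytes per parameter at Q4 (real Q4 formats carry extra scale data), 1 GB taken as 10^9 bytes, and a hypothetical 25% of unified memory reserved for the OS, KV cache, and runtime. The sizing matrix remains the authoritative source.

```python
# Nominal bytes per parameter; real quantized formats add some overhead.
BYTES_PER_PARAM = {"fp16": 2.0, "q4": 0.5}


def weight_footprint_gb(params_billions: float, precision: str) -> float:
    """Approximate weight memory in GB (10^9 bytes), excluding KV cache."""
    return params_billions * BYTES_PER_PARAM[precision]


def fits(params_billions: float, precision: str, memory_gb: float,
         reserve_fraction: float = 0.25) -> bool:
    """Check weights against memory left after a hypothetical reserve."""
    usable = memory_gb * (1.0 - reserve_fraction)
    return weight_footprint_gb(params_billions, precision) <= usable


# A 27B dense model, as an example size.
fp16_gb = weight_footprint_gb(27, "fp16")  # 54.0
q4_gb = weight_footprint_gb(27, "q4")      # 13.5
print(fp16_gb, q4_gb, fits(27, "q4", 16), fits(27, "q4", 32))
```

By this estimate a 27B model at Q4 does not fit comfortably in 16 GB but does in 32 GB, which is the kind of verdict the matrix records per hardware profile.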
499+
494500
Installation, Configuration, and Personalization
495501
------------------------------------------------
496502
Read the [installation guide](INSTALL.md) and the [configuration and personalization guide](CONFIGURE.md).
