Personal assistant governance: prompt library + tenant policy + runtime enforcement #417
Merged
Adds a separate prompt_library table for editable prompt templates that can later be shared across personal chat assistants. Kept distinct from the existing prompts table because that one is an immutable history log (each update inserts a new row); mixing both lifecycles would corrupt audit + insights flows.
Includes:
- alembic migration with tenant FK and unique (tenant_id, name) constraint
- prompt_library package (domain entity + repo protocol + sqlalchemy impl + tenant-scoped admin service + presentation router with full CRUD)
- admin-only routes under /api/v1/admin/prompt-library/
- unit tests for entity invariants and service auth/duplicate handling
- integration test scaffolding for HTTP round-trip
created_by_user_id is ON DELETE RESTRICT so we never lose authorship when a user leaves. Subsequent phases will add the cross-service guard that prevents deleting a prompt referenced by a personal chat policy.
Adds the PersonalChatPolicy entity that backs admin configuration for default (personal-chat) assistants. Stores three independent restriction flags so admins can enable / disable model, MCP, and prompt enforcement separately; the explicit per-dimension flags also let us distinguish "no restriction" from "deny-all" (empty whitelist).
Includes:
- alembic migration with personal_chat_policies + two m2m tables (completion models + mcp servers)
- DB-level CHECK constraint: prompt_enforcement_enabled implies a prompt is selected (mirrors the service invariant, belt-and-suspenders)
- partial unique index ensuring at most one model is flagged is_default
- domain entity with invariants on set_models_restriction (>=1 model when enabled, unique IDs, single default), set_mcp_restriction (empty list allowed = deny-all), and set_prompt_enforcement
- application service with get-or-create semantics (auto-created policy has all *_restriction_enabled=False so user-facing behaviour is unchanged until admin acts)
- tenant-scope validation: completion models, MCP servers, and prompt library entries are all verified against the caller's tenant before being persisted
- GET / PUT endpoints under /api/v1/admin/personal-chat-policy/
- cross-service guard in PromptLibraryService.delete: if a policy references the prompt, raise a friendly 409 with context instead of letting the FK ON DELETE RESTRICT surface as a 500
- unit tests for entity invariants
No assistants are affected yet; runtime enforcement lands in the following commits.
Adds the central decision logic that translates a tenant policy into the effective configuration for a given assistant, and surfaces it in the public API so the frontend can render lock state.
Resolver design:
- policy_resolver.resolve() is a pure function in the domain layer: no awaits, no DB access. Same inputs always yield the same output, making it trivially testable.
- Returns an EffectiveConfig dataclass with separate flags for model, MCP, and prompt enforcement plus the filtered lists.
- Short-circuits to "all disabled" for non-default assistants and when no policy exists, so any caller can invoke it safely.
- Filters policy whitelist entries against the tenant's actual model / MCP catalogue, so stale policy associations don't crash the chat.
EffectiveConfigService (application layer) composes the resolver inputs from live repos/services. Kept separate from the resolver so the pure function stays pure.
Wire-up:
- AssistantPublic.effective_config exposes EffectiveConfigPublic to the frontend (locked_model, available_models, available_mcp_servers, prompt_locked). Note that the admin prompt text is never leaked, only the locked boolean.
- assistant_router.get_assistant resolves and attaches effective_config when the assistant is_default.
- space_assembler accepts default_assistant_effective_config and space_router populates it from the same service. This is the path the chat UI reads (space.default_assistant.effective_config), so the lock state surfaces without an extra round trip.
Subsequent commits add the matching write-path guards and ask-time runtime enforcement.
… ask
Wires policy enforcement into the write paths (PUT assistant + the separate single-MCP-add endpoint) and at ask-time. UI filtering alone is not enough: stale entity state or direct API callers could otherwise bypass the policy.
Backend changes:
- Assistant.ask gains optional completion_model_override, mcp_servers_override and prompt_override parameters. When set, they take precedence over the values stored on the entity. When None, fall back to the entity's own values, so non-default assistants are entirely unaffected.
- AssistantService accepts effective_config_service (optional) and, on ask, resolves the effective config for default assistants. Stale model assignment falls back to the policy default and then the first allowed model; refuses only when the whitelist is empty.
- assistant_router.update_assistant rejects (400) attempts to set a completion_model_id or mcp_server_ids outside the policy whitelist on a default assistant.
- assistant_router.add_mcp_to_assistant gets the same MCP guard so the single-server endpoint isn't a back door past the bulk one.
- Container: reorder mcp_server_settings_service / personal_chat_policy_service / effective_config_service to appear before assistant_service so the DI chain resolves top-to-bottom without forward refs.
Net effect: when policy enforcement is active, the chat receives exactly what the policy allows, even if assistant.completion_model or assistant.mcp_servers in the DB has gone stale relative to the latest policy state, and even if a caller skips the UI entirely.
…end)
Adds the admin-facing configuration surface plus the user-facing lock
indicator that the backend phases enable.
SDK:
- intric.promptLibrary.{list,get,create,update,delete}
- intric.personalChatPolicy.{get,update}
- Regenerated schema.d.ts to pick up the new paths and the
effective_config field on AssistantPublic.
Admin UI (/admin/personal-chat):
- Tabbed layout — Konfiguration | Promptbibliotek
- /prompts: list / create / edit / delete prompt-library entries.
Delete shows a 409 explanation if the prompt is referenced by the
policy.
- /configuration: three cards (models / MCP / prompt) each gated by a
master switch. Save is blocked until the dimension is internally
valid (>=1 model, prompt chosen, etc). Confirmation dialog fires for
the load-bearing transitions: collapsing to a single model, switching
to deny-all MCP, enabling prompt enforcement.
- New "Personlig chatt" entry in the admin sidebar.
Chat lock display:
- DefaultAssistantModelSwitcher reads
$currentSpace.default_assistant.effective_config. When models_enforced
and a single locked_model is set, hides the dropdown entirely and
shows the locked model name with a "(låst)" ("locked") hint and a "låst av
administratör" ("locked by administrator") tooltip. When the policy allows multiple models, the
dropdown shows only the whitelisted ones (intersected with the
space's available models so we never offer something the tenant
doesn't have).
Align governance code with the user-facing product name (m.personal_assistant() / "Personlig assistent" / README "Personal Assistant Interface") so admins and end users see the same name for the same thing.
Backend: rename personal_chat_policy module, PersonalChatPolicy* classes, DB tables (personal_chat_policies → personal_assistant_policies plus three join tables), alembic migrations (in-place; branch is local only), tests.
Frontend: /admin/personal-chat → /admin/personal-assistant route, personal-chat-policy.js endpoint, personalChatPolicy SDK key, all Swedish "Personlig chatt" / "personliga chatten" labels → "Personlig assistent" / "personliga assistenten". schema.d.ts regenerated.
Also addresses two pre-existing issues surfaced during verification:
* test_policy_resolver _mk_model fixture was missing the provider_id attribute the resolver expects after the providers join-table was added.
* configuration/+page.svelte used raw Map/Set inside $derived (must be SvelteMap/SvelteSet for fine-grained reactivity), and +layout.svelte passed a non-RouteId string to <a href> through resolve(); tabs now use literal RouteId strings with `as const` so the type-narrowed resolve() check at the call site holds.
…hat-governance-plan
# Conflicts:
# frontend/apps/web/src/lib/features/api-keys/ExtendExpirationDialog.svelte
# frontend/apps/web/src/routes/(app)/spaces/[spaceId]/assistants/[assistantId]/edit/+page.svelte
# frontend/packages/intric-js/src/types/schema.d.ts
…ce_policy and enforce policy on personal assistants Renames the personal_assistant_policy package, table, migrations, services and tests to governance_policy. Wires the effective-config resolution into AssistantService so updates to the default personal assistant are validated against the governance policy: disallowed completion models and MCP servers are rejected, and resolution is skipped when the policy does not apply (non-default assistant, non-personal space, or no service configured).
Adds the configuration admin surface for the renamed governance policy. The page's interactive state lives in policyDraft.svelte.ts (selections, dirty tracking, validation, summaries, confirm-before-apply and save), leaving +page.svelte as thin wiring that binds the sections to the draft. The UI is split into a shared PolicySection shell plus Model / Mcp / Prompt sections and the save bar / confirm dialog, built on shadcn primitives. Renames the intric-js endpoint to governance-policy and regenerates the API types.
…cepted The prompt_library integration token fixtures created JWTs without the test-settings JWT patch, so the auth dependency rejected them with 401 instead of exercising the routes. Mirror the governance_policy fixtures by depending on patch_auth_service_jwt.
…hat-governance-plan
After merging develop, the prompt_library/governance migrations branched from 202605061100 while develop had advanced past it, producing two alembic heads and breaking DB setup for every integration test. Re-point the chain base to develop's current head (20260501_backfill_model_costs) so alembic resolves a single head.
Develop now enforces intric/no-hardcoded-text on all web .svelte files. Move the Swedish strings introduced by the governance UI (prompt library form, locked-model switcher, assistant edit policy hints) into en/sv message catalogs and call them via m.*, reusing existing keys where they already exist (cancel, name, prompt, governance_saving).
…k and preflight Add a single pure resolver (select_effective_completion_model) and route both ask-time enforcement (assistant_service.ask) and read-time preflight (conversation_service) through it, so the projected and the actual model can no longer diverge. Preflight is now governance-aware. Also carries the per-message MCP opt-out (disabled_mcp_server_ids) through the conversation request and enforces the MCP policy whitelist at ask time. Adds unit tests for the resolver and updates preflight tests.
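The fallback order that both ask-time enforcement and preflight share can be sketched as one pure function. This is a minimal illustration, assuming the order described in the earlier write-path commit (stored model if still allowed, then the policy default, then the first allowed model, refusing only on an empty whitelist). The signature and the `NoAllowedModelError` type are hypothetical, not the branch's actual code.

```python
from typing import Optional, Sequence


class NoAllowedModelError(RuntimeError):
    """Hypothetical error: the policy is enforced but its whitelist is empty."""


def select_effective_completion_model(
    assigned_model_id: Optional[str],
    allowed_model_ids: Sequence[str],
    policy_default_id: Optional[str],
    models_enforced: bool,
) -> Optional[str]:
    """Pick the model a turn will actually use (assumed signature)."""
    # Policy does not apply: keep whatever the assistant has stored.
    if not models_enforced:
        return assigned_model_id
    # Enforced with an empty whitelist is the only case that refuses.
    if not allowed_model_ids:
        raise NoAllowedModelError("model whitelist is empty")
    # The stored model is still allowed: use it unchanged.
    if assigned_model_id in allowed_model_ids:
        return assigned_model_id
    # Stale assignment: prefer the policy default, then the first allowed model.
    if policy_default_id in allowed_model_ids:
        return policy_default_id
    return allowed_model_ids[0]
```

Because the function has no I/O, calling it from both the ask path and the preflight path with the same inputs yields the same model by construction.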
…tive_config Add _assistant_response (assistant_router) and _space_response (space_router) helpers so GET/update assistant and the space endpoints can't return a personal default assistant without its governance effective_config — previously the update endpoint omitted it and the chat UI silently dropped model/MCP filtering. Also extract the ~300-line audit change-tracking out of update_assistant into _build_assistant_update_changes.
…er-message MCP toggle Replace DefaultAssistantModelSwitcher with a shadcn-based composer toolbar: inline model selector (ChatModelSelect, filtered by governance effective_config), an MCP server toggle popover (ChatMcpServers) that sends disabled_mcp_server_ids per message, and refreshed input controls. The preflight estimate now also re-runs on a model switch.
Expose a single active-model source on ChatService (activeModelName plus the existing contextLimit) that prefers a single assistant's own global model over the latest message's model (the latter only applies to group chats); ContextUsageBar reads it instead of recomputing. Compare chat partners by id so switching the personal-assistant model no longer wipes the open conversation, and add one sync point in the chat page so the partner always tracks the canonical default assistant.
…tion UI Add localized strings for the redesigned chat composer and context surfaces, and update the MCP server selection, personal-assistant policy draft, and assistant edit UI accordingly.
Satisfies the route-metadata pre-push guardrail, which requires a description on non-GET route decorators. The endpoint previously lacked one.
…hat-governance-plan
…hat-governance-plan
# Conflicts:
# backend/src/intric/assistants/api/assistant_router.py
# backend/src/intric/assistants/assistant_service.py
# frontend/apps/web/src/routes/(app)/spaces/[spaceId]/chat/+page.svelte
# frontend/packages/intric-js/src/types/schema.d.ts
…hat-governance-plan
🧹 Dead-code & unused-dependency report
Advisory; never gates the PR. Whole-repo scan, so some findings may be false positives (dynamic dispatch, framework hooks, runtime-resolved imports). Triage before removing.
✅ No dead code or unused dependencies detected.
📊 Patch coverage
Share of this PR's new/changed lines exercised by tests. Report-only; never gates the PR.
Uncovered lines: Frontend (100 files)
Uncovered lines: Backend (19 files)
Replace the stacked MCP tool-call cards in the assistant message with a collapsible chain-of-thought trace: tool calls render as interleaved timeline steps that auto-expand while the assistant works and collapse once the answer streams. Pending tool approvals stay as prominent cards (approval logic unchanged) so a blocking decision is never hidden. Adds ReasoningTrace + ReasoningToolStep; MessageAnswer splits tool calls into traced vs pending. No backend change — uses the tool-call data the stream already provides.
Throwaway /dev routes (excluded from i18n lint via eslint ignore): the original design prototype at /dev/chat-demo and a live preview of the production ReasoningTrace at /dev/reasoning-preview.
Forward the model's reasoning/thinking text (Anthropic extended thinking surfaced by LiteLLM as reasoning_content) to the frontend over SSE as a new "reasoning" event, and render it live in ReasoningTrace: the header shows "Thinking…" while the model reasons and "Thought for Ns" once done, with the thinking text in the collapsible body. Backend: new ResponseType.REASONING + Completion.reasoning_content; capture reasoning deltas in the streaming loop and pass them through every chunk filter (_handle_tool_call, the assistant response stream) to emit SSEReasoning. Frontend: onReasoning accumulates a runtime reasoning field on the message; Message.svelte hides the standalone thinking badge once reasoning streams. Reasoning streams live only — not persisted, so it is absent after a conversation reload, by design.
… deny-set
- Rename the MCP dimension from restrict- to allow-semantics: the toggle
now reads "Allow MCP servers in the personal assistant" and an enabled
grant requires at least one server (deny-all = toggle off). Help and
summary texts no longer claim a non-existent fallback to normal access.
- Replace checkboxes with switches across the governance page (providers,
models, MCP servers) to match the rest of the settings UI.
- Add per-server "on by default" flag: allowed servers can start switched
off in users' chat; the chat input seeds its MCP toggles from
effective_config.default_disabled_mcp_server_ids per conversation.
- Add per-server tool disclosure with switches and bulk on/off. Disabled
tools are stored as a deny-set (governance_policy_disabled_mcp_tools)
so newly synced tools stay allowed; the resolver narrows the tool list
on shallow copies, which enforces the policy in both the chat UI
serialization and the MCP proxy tool registry.
- Migration adds is_default_enabled to governance_policy_mcp_servers and
the disabled-tools join table; API models switch server_ids to
servers[{mcp_server_id, is_default_enabled}] + disabled_tool_ids.
…ss all read paths
get_assistant() carved out the personal default assistant (gated by
PERSONAL_CHAT via can_read_default_assistant, not ASSISTANTS), but
get_assistant_with_effective_config() and get_effective_completion_model()
checked only can_read_assistants(). Both back GET /assistants/{id}/, the update
response, and the preflight model resolution, so:
- a PERSONAL_CHAT-only user got 403 on their own default assistant (and a
desynced UI after a model-picker update that actually persisted), and
- preflight resolved a model for any assistant id with no authorization,
leaking model_name/context_window across spaces and tenants.
Extract _authorize_read_assistant() with the carve-out and apply it on all
three read paths.
update_space (PATCH /spaces/{id}/) returned assembler.from_space_to_model(space)
directly, so a patched personal space serialized its default assistant with
effective_config=None. The frontend overwrites currentSpace with this response,
which silently drops governance filtering (all models/MCP servers shown) until
the next full space GET. Route it through the shared _space_response() helper.
_policy_changes compared completion_models as order-sensitive lists while every other dimension was sorted, so saving the same model set in a different order logged a spurious GOVERNANCE_POLICY_UPDATED event with identical old/new. Normalize model entries by id like the other dimensions.
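A minimal sketch of the order-insensitive comparison, assuming each model entry is an `(id, is_default)` pair. The helper names and the entry shape are hypothetical and only illustrate the normalization idea.

```python
from typing import Iterable, List, Tuple

# Hypothetical entry shape: (completion model id, is_default flag).
ModelEntry = Tuple[str, bool]


def normalize_model_entries(entries: Iterable[ModelEntry]) -> List[ModelEntry]:
    """Sort entries by id so list order never counts as a change."""
    return sorted(entries, key=lambda entry: entry[0])


def models_changed(old: Iterable[ModelEntry], new: Iterable[ModelEntry]) -> bool:
    """True only when the set of models or a default flag actually differs."""
    return normalize_model_entries(old) != normalize_model_entries(new)
```

Saving `[("b", False), ("a", True)]` over `[("a", True), ("b", False)]` then reports no change, while flipping a default flag still does.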
resolve_for fetched the full completion-model and MCP-server catalogs on every resolution whenever a policy row existed, even with all restriction flags off — and a row is auto-created the first time an admin opens the config page. Since the resolver only reads those catalogs behind the enabled flags, gate each fetch on its flag and run the independent fetches concurrently. Removes two full-table scans per ask/preflight/space-read/assistant-read across the tenant.
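The gating can be sketched with `asyncio.gather`: a fetch is only started when its flag is on, and the enabled fetches run concurrently. `PolicyFlags`, `load_catalogs` and the fetcher callables are hypothetical stand-ins for the real repos, and the sketch assumes the two fetches are safe to run at the same time.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Tuple


@dataclass(frozen=True)
class PolicyFlags:
    # Hypothetical subset of the policy row: only the flags that gate a fetch.
    models_restriction_enabled: bool = False
    mcp_restriction_enabled: bool = False


Fetch = Callable[[], Awaitable[List[str]]]


async def _skip() -> List[str]:
    """Placeholder result for a dimension whose flag is off."""
    return []


async def load_catalogs(
    flags: PolicyFlags, fetch_models: Fetch, fetch_mcp_servers: Fetch
) -> Tuple[List[str], List[str]]:
    # A fetcher is only called (and its query only issued) when its flag is on.
    models, servers = await asyncio.gather(
        fetch_models() if flags.models_restriction_enabled else _skip(),
        fetch_mcp_servers() if flags.mcp_restriction_enabled else _skip(),
    )
    return models, servers
```

With all flags off, neither fetcher is invoked, which matches the auto-created "empty" policy case described above.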
create_entry/update_entry pre-checked the name with exists_by_name but had no handling for the uq_prompt_library_tenant_name violation, so two concurrent creates of the same name surfaced the second as a 500. Catch IntegrityError on that constraint and raise the same 400 the pre-check does; re-raise unrelated integrity errors unchanged.
…ssistant heads The branch had two alembic heads after merging develop (202606091000 and a1d4c7e90f23). Add an empty merge revision instead of editing an already-committed migration's down_revision in place, which would leave alembic_version inconsistent on any DB that ran the prior single-parent version.
…ing, guard model select
- partnerRuntimeSignature now hashes the MCP and prompt dimensions of effective_config (mcp_enforced, available/default-disabled server ids, prompt_locked), so an MCP/prompt-only policy edit replaces the partner instead of leaving the composer showing stale servers.
- Reasoning deltas stream through the same rAF buffer as answer text instead of mutating reactive state per token (avoids re-render storms on thinking-heavy models).
- ConversationInput gates the model selector on the SpacesManager context, so mounting the default assistant without it (e.g. a dashboard deep link) no longer throws in ChatModelSelect.
…den orphaned grants Wire the policy draft to the per-server MCP shape (servers + per-server chat default + tool deny-set) and extract disabledToolIdsForSelectedServers as a pure, tested helper. Update the integration assertions to the new mcp_restriction shape (servers/disabled_tool_ids; an enabled grant with no servers is now a 400). Harden against orphaned grants: a server that was allowed in the policy but later disabled for the tenant is filtered out of the selectable set, so seed, dirty baseline and the save payload all intersect with currently-selectable servers. Previously such a grant was invisible in the UI yet still sent on every save, bricking saves with a 400 and inflating the summary count.
Record the branch review findings (auth, effective_config contract, MCP policy, perf, migration hygiene, cleanup) with verification status and a resolution log.
…ector
- Add required `description` to prompt-library POST/PUT/DELETE and governance PUT mutation routes so the route-metadata guardrail passes; regenerate schema.d.ts to match the new OpenAPI operation descriptions.
- Replace raw `bg-black/10` with the semantic `bg-overlay-default` token in sheet-overlay and drop inert `dark:` variants in radio-group-item and the prompt-enforcement alert to satisfy intric/no-raw-color.
- Re-add `name="ask"` to the chat submit/stop buttons that the redesign dropped, fixing the chat E2E selector timeout.
…trace
Tool-call argument JSON was accumulated silently server-side, so a turn with many parallel MCP calls left the stream (and UI) frozen for tens of seconds before all steps appeared at once. The adapter now emits a TOOL_CALL event with result_status=pending as soon as a call's name and id arrive in the delta, so steps tick in one by one while arguments are still being generated.
- Merge pending entries by tool_call_id in persistence and in the approval flow (backend + frontend) instead of blind appends, and fill in arguments once a later chunk carries them
- Map pending to a new "Preparing" step status; show approved calls as running for the whole streaming turn since tool calls can interleave with answer text across MCP rounds
- Fold duplicated server/tool name resolution into one helper
- Section the reasoning trace into Reasoning/Tools with self-contained tool step cards and an additive tool-count badge
Reasoning/thinking text was streamed live over SSE and discarded when the turn finished, so reloading a conversation lost the trace while tool calls remained visible. Store it in a nullable reasoning column on questions, accumulated in the streaming loop alongside the answer (never mixed into it), and expose it on the Message model so the existing ReasoningTrace renders it from history. Aborted turns keep their partial reasoning via the background-save path. Historical traces fall back to the plain 'Reasoning' label since no live timing exists on reload.
…hat-governance-plan
# Conflicts:
# backend/src/intric/assistants/api/assistant_assembler.py
# backend/src/intric/assistants/api/assistant_models.py
# backend/src/intric/assistants/api/assistant_router.py
# backend/src/intric/assistants/assistant_service.py
# backend/src/intric/database/tables/__init__.py
# backend/src/intric/main/container/container.py
# backend/tests/integration/audit/test_audit_config_service.py
# backend/tests/unit/test_audit_category_mappings.py
# backend/tests/unit/test_audit_config_service.py
# frontend/apps/web/messages/en.json
# frontend/apps/web/messages/sv.json
# frontend/apps/web/src/routes/(app)/admin/AdminMenu.svelte
# frontend/apps/web/src/routes/(app)/spaces/[spaceId]/assistants/[assistantId]/edit/+page.svelte
# frontend/packages/intric-js/src/types/schema.d.ts
…solve duplicate revision ids The develop merge brought in help-assistants migrations that independently reused revision ids 202605211100 and 202605211200, making down_revision pointers ambiguous and breaking 'alembic upgrade head'. Re-point the governance chain linearly on top of the help-assistants chain with fresh ids (prompt_library 202605211500, governance_policy 202605211600), update governance_policy_providers' down_revision accordingly, and drop the now-redundant develop+help-assistants merge migration. Single head: 1d60c8c457d3.
…ing head The dev DB already had the governance chain and the reasoning column applied (stamped at 1d60c8c457d3) but never ran the help-assistants migrations, which arrived via the develop merge. Anchoring that chain below the applied head made alembic treat it as done, so the help-assistants schema (users.is_system_user, org_space_assistant_roles, help_assistant_* tables) was never created and the app crashed on login. Re-point governance back onto backfill_model_costs and chain the help-assistants migrations on top of the reasoning head (1d60c8c457d3) so 'alembic upgrade head' applies them forward. Single head: 202605211400.
…selector (#488)
Introduce an ai-elements component layer following Vercel AI Elements naming and compound-component patterns, built on the existing shadcn primitives:
- prompt-input family (Root/Body/Footer/Tools/Button/Submit) replaces the hand-rolled form markup in ConversationInput; submit/stop button derives from a shared status context
- model-selector family (Trigger/Content/Input/List/Group/Item/Logo/Name) built on Popover + Command replaces the flat Select in ChatModelSelect: searchable command palette grouped by model vendor with provider logos and a check on the selected model
- vendor grouping falls back org -> provider_name -> provider_type; locked-model and single-model states unchanged
The develop merge added required org_space_assistant_role_repo and help_assistant_assignment_history_repo constructor args plus an assert_not_helper_assistant guard in ask(). Pass the new repos and default them to 'not a helper' so the runtime tests stop failing.
This was referenced Jun 18, 2026
Closed
Summary
Tenant-level governance for the personal assistant (the default assistant in a user-owned personal space). A central admin can control which completion models, which MCP servers (and which of their tools), and which default system prompt the personal assistant may use — without configuring every user's default assistant individually.
The policy is modelled as data in tables that flows upward through pure, testable layers all the way to the UI, with a single pure resolver as the source of truth for "what is allowed". The same resolver drives both UI hints (read path) and runtime enforcement (write/ask path), so they can never drift apart.
Scope of this PR
The governance feature is the headline, but other work has accumulated on this branch alongside it, and the diff (≈204 files) also lands the following. The items are grouped here so reviewers know what to expect; the bulk of the description below still focuses on governance.
- `governance_policy/`, `prompt_library/`, `admin/personal-assistant/`
- `current_version` pointer per entry; `prompt_library_versions` table, `PromptLibraryVersion`
- `questions.reasoning` column, `ReasoningTrace` / `ReasoningToolStep`, session/conversation persistence
- `model-selector` + `prompt-input` components, redesigned personal-chat composer; `lib/components/ai-elements/{model-selector,prompt-input}/`, `ChatComposer`
- `sidebar` (26 files) + `sheet` (11 files) used by the admin/navigation layout; `lib/components/ui/{sidebar,sheet}/`
What it does
The policy is per tenant and today targets a single scope: `PERSONAL_DEFAULT_ASSISTANT`. It has three independent dimensions, each opt-in:
- `models_restriction_enabled`
- `mcp_restriction_enabled`
- `prompt_enforcement_enabled`
Key idea: all three `*_enabled` flags default to `False`. An auto-created "empty" policy produces no visible change: the feature is opt-in per dimension. This deliberately distinguishes "no restriction" from "deny all" (enabled + empty list).
For end users
Design principle: policy is data, not code
- `scope` is stored as a string, not a DB enum, with a composite unique key `(tenant_id, scope)`. New scopes (shared assistants, workspace assistants) can be added as data later, with no schema migration.
- `policy_resolver.resolve()` is side-effect-free: no DB calls, no awaits. It takes `(assistant, scope, policy, tenant context)` and returns an immutable `EffectiveConfig`. The same function powers read-path hints and ask-time enforcement.
- `domain/` owns business rules, `application/` owns I/O orchestration, `infrastructure/` owns SQL, `presentation/` owns the HTTP contract. Dependencies point inward; the domain knows nothing about FastAPI or SQLAlchemy.
Layer architecture
An admin update travels top-down (UI → API → service → repo → tables). A run travels bottom-up (tables → repo → resolver → enforcement in the chat flow). The resolver sits in the middle and is shared.
Data model
One main table + four M:N join tables, all tenant-isolated. Migrations: `202605211600_create_governance_policy.py`, `202605221000_governance_policy_providers.py`, `202606091000_governance_policy_mcp_defaults_and_tools.py`.
erDiagram
    tenants ||--o| governance_policies : owns
    users ||--o{ governance_policies : updated_by
    prompt_library ||--o{ governance_policies : default_prompt
    governance_policies ||--o{ governance_policy_completion_models : allows
    completion_models ||--o{ governance_policy_completion_models : allowed_model
    governance_policies ||--o{ governance_policy_providers : allows
    model_providers ||--o{ governance_policy_providers : allowed_provider
    governance_policies ||--o{ governance_policy_mcp_servers : allows
    mcp_servers ||--o{ governance_policy_mcp_servers : allowed_server
    governance_policies ||--o{ governance_policy_disabled_mcp_tools : denies
    mcp_server_tools ||--o{ governance_policy_disabled_mcp_tools : denied_tool
    governance_policies {
        uuid id PK
        uuid tenant_id FK
        string scope "PolicyScope value"
        boolean models_restriction_enabled
        boolean mcp_restriction_enabled
        boolean prompt_enforcement_enabled
        uuid default_prompt_library_id FK "RESTRICT"
        uuid updated_by_user_id FK "SET NULL"
    }
    governance_policy_completion_models {
        uuid policy_id PK_FK
        uuid completion_model_id PK_FK
        boolean is_default
    }
    governance_policy_providers {
        uuid policy_id PK_FK
        uuid model_provider_id PK_FK
    }
    governance_policy_mcp_servers {
        uuid policy_id PK_FK
        uuid mcp_server_id PK_FK
        boolean is_default_enabled "seed only; user can toggle per chat"
    }
    governance_policy_disabled_mcp_tools {
        uuid policy_id PK_FK
        uuid mcp_tool_id PK_FK
    }
Constraints
- `UNIQUE (tenant_id, scope)`: one policy per scope per tenant.
- `CHECK: NOT prompt_enforcement_enabled OR default_prompt_library_id IS NOT NULL`: cannot enforce a prompt without pointing at one.
- `uniq_policy_default_model` on `(policy_id) WHERE is_default`: at most one default model per policy.
Provider whitelist = subscription, not snapshot. The effective allowed model set is the union of explicitly listed models (`_completion_models`) and "all org-enabled models under any provider in `_providers`". Whitelisting a provider therefore automatically grants its future models, with no re-curation required.
MCP tools = allow-by-default, deny-set. A whitelisted server's tools are allowed unless explicitly listed in `governance_policy_disabled_mcp_tools`. Tools synced onto a server later are therefore allowed automatically; only the admin's explicit "off" choices persist.
Domain model
Plain Python dataclass, no DB/HTTP dependencies. Business rules live in the setters.
`backend/src/intric/governance_policy/domain/governance_policy.py`
Setters validate invariants in the domain, regardless of caller: no duplicates, at most one default model, enabled model restriction requires at least one model or provider, enabled prompt enforcement requires a chosen prompt. Tenant/space membership (that the IDs actually belong to the tenant) is validated in the service layer, since it requires I/O.
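A minimal sketch of setters that enforce the invariants listed above. The class name, the `(model id, is_default)` entry shape and the `PolicyInvariantError` type are hypothetical; the real entity lives in the file named above and may differ in shape.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


class PolicyInvariantError(ValueError):
    """Hypothetical error type for a violated domain invariant."""


@dataclass
class GovernancePolicySketch:
    models_restriction_enabled: bool = False
    models: List[Tuple[str, bool]] = field(default_factory=list)  # (id, is_default)
    provider_ids: List[str] = field(default_factory=list)
    prompt_enforcement_enabled: bool = False
    default_prompt_library_id: Optional[str] = None

    def set_models_restriction(
        self,
        enabled: bool,
        models: Sequence[Tuple[str, bool]],
        provider_ids: Sequence[str] = (),
    ) -> None:
        model_ids = [model_id for model_id, _ in models]
        # No duplicates in either list.
        if len(set(model_ids)) != len(model_ids):
            raise PolicyInvariantError("duplicate model id")
        if len(set(provider_ids)) != len(provider_ids):
            raise PolicyInvariantError("duplicate provider id")
        # At most one model may carry the default flag.
        if sum(1 for _, is_default in models if is_default) > 1:
            raise PolicyInvariantError("more than one default model")
        # An enabled restriction needs at least one model or provider.
        if enabled and not models and not provider_ids:
            raise PolicyInvariantError("enabled restriction needs a model or provider")
        self.models_restriction_enabled = enabled
        self.models = list(models)
        self.provider_ids = list(provider_ids)

    def set_prompt_enforcement(self, enabled: bool, prompt_library_id: Optional[str]) -> None:
        # Enforcement without a chosen prompt is rejected (mirrors the DB CHECK).
        if enabled and prompt_library_id is None:
            raise PolicyInvariantError("enabled enforcement needs a prompt")
        self.prompt_enforcement_enabled = enabled
        self.default_prompt_library_id = prompt_library_id
```

Validation happens before any field is assigned, so a rejected call leaves the entity unchanged. Checks that need I/O, such as whether an id belongs to the tenant, are deliberately absent here.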
The pure resolver — the heart of the design
Side-effect-free: (assistant, scope, policy, tenant context) → EffectiveConfig. No DB calls, no awaits, trivial to unit-test.

backend/src/intric/governance_policy/domain/policy_resolver.py

- _EMPTY (all *_enforced = False) → "behave as before". No special cases at the call site.
- EffectiveConfigService does the I/O. The application layer fetches policy + the tenant's models/MCP/prompt text and calls the pure resolver. The resolver stays pure; orchestration lives outside.

Service & API contract
- GovernancePolicyService (Permission.ADMIN): get_policy() auto-creates an empty policy if none exists; update_policy(models_restriction=, mcp_restriction=, prompt_enforcement=) validates tenant membership, delegates to the domain setters, and saves via repo.save().
- EffectiveConfigService: resolve(...).

An Assembler translates domain ↔ DTO. In EffectiveConfigPublic (which rides along with the assistant to the UI) the prompt is exposed only as the boolean prompt_locked, never the text itself.

The prompt library has its own tenant-scoped admin CRUD at /api/v1/admin/prompt-library/, kept separate from the immutable prompts history table; deleting a prompt referenced by a policy returns a friendly 409 instead of a raw FK violation.

Prompt-library versioning
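A minimal in-memory sketch of the append-only versioning covered in this section: each change snapshots the entry into an immutable version record and bumps current_version. The class and method names are hypothetical, and whether creation itself writes the first version row is an assumption of this sketch, not something the PR states.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class PromptVersion:
    """Immutable snapshot, standing in for a prompt_library_versions row."""
    version: int
    name: str
    description: str
    text: str
    created_by: str
    created_at: datetime


@dataclass
class PromptLibraryEntry:
    """Live, editable entry, standing in for a prompt_library row."""
    name: str
    description: str
    text: str
    current_version: int = 0
    versions: list = field(default_factory=list)

    @classmethod
    def create(cls, *, name, description, text, user):
        entry = cls(name=name, description=description, text=text)
        entry._snapshot(user)  # assumption: creation records version 1
        return entry

    def edit(self, *, user, name=None, description=None, text=None):
        # Only the live entry is mutated; earlier snapshots are never touched.
        self.name = self.name if name is None else name
        self.description = self.description if description is None else description
        self.text = self.text if text is None else text
        self._snapshot(user)

    def _snapshot(self, user):
        self.current_version += 1
        self.versions.append(
            PromptVersion(
                version=self.current_version,
                name=self.name,
                description=self.description,
                text=self.text,
                created_by=user,
                created_at=datetime.now(timezone.utc),
            )
        )
```

In a database-backed version, the insert into the versions table and the bump of current_version would presumably happen in the same transaction so the two cannot drift apart.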
Each prompt-library entry keeps an append-only history: every edit writes a row to prompt_library_versions (immutable name/description/text/created_by snapshot) and bumps prompt_library.current_version. This gives an audit trail of what a governed prompt said over time, separate from the live editable entry. Migrations: 202605211500_create_prompt_library.py, 202605221100_prompt_library_versions.py.

Frontend
Route /admin/personal-assistant/configuration (+ /admin/personal-assistant/prompts). Thin page, "fat" draft class.

- +page.ts: Promise.all over governancePolicy.get(), models, MCP settings, promptLibrary.list(), model providers; event.depends("admin:governance-policy").
- PolicyDraft (policyDraft.svelte.ts) owns all interactive state with Svelte 5 runes: $state (per-dimension toggles + selections), $derived (dirty, canSave, effectiveModelIds = explicit ∪ provider, defaultModelId, readable summaries), and save() (builds confirmations for destructive changes → update() → invalidate() → reseeds baseline).
- ModelRestrictionSection, McpRestrictionSection, PromptEnforcementSection, plus a sticky PolicySaveBar and PolicyConfirmDialog.
- effectiveModelIds is computed locally for instant UX, but that is only a mirror: truth and enforcement live in the backend resolver. The UI can never "allow" anything on its own.
- UI strings: m.governance_* (en + sv).

Enforcement: defense in depth (3 layers)
The resolver is shared across all three; UI filtering alone is not enough.
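A rough sketch of how a write-time check and an ask-time override could both consume the same resolved configuration, including the grandfathering and fail-safe rules from this section. The function names, the trimmed-down EffectiveConfig, the PolicyViolation error and the choice to substitute the policy's default model at run time are all assumptions for illustration; the real logic lives in _ensure_governance_policy_allows_update() and assistant_service.ask().

```python
from dataclasses import dataclass
from typing import Optional


class PolicyViolation(Exception):
    """Hypothetical error for a rejected save or a refused ask."""


@dataclass(frozen=True)
class EffectiveConfig:
    """Trimmed-down resolver output; field names are assumptions."""
    models_enforced: bool = False
    allowed_model_ids: frozenset = frozenset()
    default_model_id: Optional[str] = None
    mcp_enforced: bool = False
    allowed_mcp_ids: frozenset = frozenset()


def ensure_update_allowed(cfg, *, model_id, old_mcp_ids, new_mcp_ids):
    """Write-time layer: block saves that introduce disallowed choices."""
    if cfg.models_enforced and model_id not in cfg.allowed_model_ids:
        raise PolicyViolation(f"model {model_id} is not allowed by policy")
    if cfg.mcp_enforced:
        # Grandfathering: only servers added by this save must pass the policy.
        added = set(new_mcp_ids) - set(old_mcp_ids)
        blocked = added - cfg.allowed_mcp_ids
        if blocked:
            raise PolicyViolation(f"MCP servers not allowed: {sorted(blocked)}")


def runtime_model(cfg, assistant_model_id):
    """Ask-time layer: pick the model actually used for this request."""
    if not cfg.models_enforced:
        return assistant_model_id
    if not cfg.allowed_model_ids:
        # Fail-safe: refuse rather than fall back to a disallowed model.
        raise PolicyViolation("no models are allowed; ask an admin to update the policy")
    if assistant_model_id in cfg.allowed_model_ids:
        return assistant_model_id
    if cfg.default_model_id in cfg.allowed_model_ids:
        return cfg.default_model_id  # assumption: override with the policy default
    raise PolicyViolation("assistant model is not allowed and no default is set")
```

The write-time check can be bypassed by state saved before the policy changed, which is why the ask-time function re-evaluates the stored model on every request instead of trusting it.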
1. EffectiveConfigPublic rides along with the assistant so the UI can grey out disallowed models, show the locked state, and show a "prompt enforced" badge. The prompt is exposed only as prompt_locked: bool; the text never leaks.
2. _ensure_governance_policy_allows_update() blocks saving an assistant with a disallowed model/MCP. Grandfathering: only newly added MCP servers must pass the policy; existing ones can be re-saved.
3. In assistant_service.ask(), model / MCP / prompt are overridden at run time. This is the only layer that guarantees compliance against stale state and direct API callers.

Fail-safe: an empty model whitelist + enabled restriction makes the assistant refuse to answer (clear error, ask admin to act) rather than silently falling back to a disallowed model. _handle_response() and the returned AssistantResponse use the effective model, not stale assistant metadata.

Decisions worth discussing
These are deliberate trade-offs and open questions (the architecture doc frames them for review):
- prompt_locked: bool. Is a boolean enough, or should admins/users see that a prompt applies (name) without content?

File overview
Validation
- tests/unittests/governance_policy, tests/unittests/prompt_library, tests/unittests/assistants/test_governance_policy_runtime.py.
- governance_policy + prompt_library route suites pass (admin/non-admin, CRUD, validation, ask-time runtime).
- bun run check → 0 errors, 0 warnings; prettier --check + eslint clean (incl. intric/no-hardcoded-text); i18n en/sv key parity.
- Migration chain (202606111200): re-pointed onto develop's head after merge.

Branch hygiene fixed in this PR
After merging the latest develop:

- patch_auth_service_jwt, matching the governance fixtures.
- intric/no-hardcoded-text on all .svelte; remaining Swedish strings in the governance UI moved into messages/{en,sv}.json and called via m.*.

Out of scope
- data_retention_days governance (needs a separate path in DataRetentionService).
- {{user.name}}).