Releases: databrickslabs/ontobricks
v0.5.2
OntoBricks — Release Notes V0.5.2
Release date: 2026-06-22
Type: Patch — two targeted fixes
Test status: 2283 passing, 15 skipped (unit tier).
Summary
v0.5.2 is a stability patch with two independent fixes:
- Deterministic R2RML serialization — mapping exports are now stable across runs, making diff-based version control and automated pipelines reliable.
- MCP server cold-start resilience — the MCP server no longer fails on the first tool call when Databricks Apps is waking from idle; it retries transparently.
No schema changes. No configuration changes. No migration scripts required.
Fixes
1. Deterministic R2RML serialization
Symptom: Two successive exports of the same domain mapping produced R2RML files with attributes in a different order, making version diffs noisy and breaking any pipeline that checksummed the output.
Root cause: R2RMLGenerator._add_entity_mapping() iterated over attribute_mappings (a plain dict) without sorting, so insertion order determined output order — non-deterministic across Python versions and dict mutations.
Fix: attribute_mappings.items() is now wrapped in sorted(), producing stable alphabetical order.
File changed: src/back/core/w3c/r2rml/R2RMLGenerator.py
Tests updated: tests/units/mapping/test_r2rml_generator.py — assertions updated to match sorted order.
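The effect of the fix can be illustrated with a minimal sketch. The function name `serialize_attributes` and the output format are hypothetical and are not taken from R2RMLGenerator; only the use of sorted() over attribute_mappings.items() comes from the note above.

```python
def serialize_attributes(attribute_mappings: dict) -> list:
    """Emit one line per attribute in a stable, alphabetical order."""
    lines = []
    # sorted() orders the (name, column) pairs by attribute name, so the
    # output no longer depends on the order in which keys were inserted.
    for name, column in sorted(attribute_mappings.items()):
        lines.append(f"{name} -> {column}")
    return lines


# Two dicts with the same content but different insertion order
# now serialize identically.
first = serialize_attributes({"name": "NAME", "id": "ID"})
second = serialize_attributes({"id": "ID", "name": "NAME"})
assert first == second
```

Because the ordering is derived from the keys alone, a checksum of the exported file should stay the same as long as the mapping content is unchanged.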
2. MCP server retry on 502/503 (Databricks Apps cold-start)
Symptom: The first MCP tool call after an idle period failed immediately with a 502 or 503 error. The MCP client received a hard failure instead of waiting for the app to wake up.
Root cause: _get and _post in the MCP server called raise_for_status() immediately on any non-2xx response, including transient 502/503 responses emitted by the Databricks Apps proxy during cold-start.
Fix: Both _get and _post now retry up to 3 times on 502/503, with progressive back-off (5 s → 10 s → 20 s). Each retry is logged at WARNING level. If all three retries are exhausted the final response still raises via raise_for_status().
File changed: src/mcp-server/server/app.py
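A rough sketch of the retry behaviour described above, assuming a response object that exposes status_code and raise_for_status(). The helper name `call_with_retry` and its `send` / `sleep` parameters are hypothetical; the actual _get and _post code may be structured differently.

```python
import logging
import time
from typing import Callable, Protocol

logger = logging.getLogger(__name__)

RETRY_STATUSES = {502, 503}
BACKOFF_SECONDS = (5, 10, 20)  # delays quoted in the release note


class Response(Protocol):
    status_code: int

    def raise_for_status(self) -> None: ...


def call_with_retry(
    send: Callable[[], Response],
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Send a request, retrying up to three times on 502/503."""
    response = send()
    for attempt, delay in enumerate(BACKOFF_SECONDS, start=1):
        if response.status_code not in RETRY_STATUSES:
            break
        logger.warning(
            "Got HTTP %s, retry %d/%d in %d s",
            response.status_code, attempt, len(BACKOFF_SECONDS), delay,
        )
        sleep(delay)
        response = send()
    # Any remaining error, including a 502/503 after the last retry,
    # still surfaces to the caller here.
    response.raise_for_status()
    return response
```

Passing `sleep` as a parameter is only a convenience for testing the back-off schedule without waiting; in the worst case the sketch blocks for 35 seconds before raising.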
Upgrade Notes
New deploys (v0.5.2 from scratch)
No action required. Both fixes are code-only with no schema or configuration changes.
Upgrading from v0.5.1
No migration needed. Deploy the new app bundle — both fixes take effect immediately on restart.
Upgrading from v0.5.0
Apply the v0.5.1 upgrade steps first (no schema migration required), then deploy v0.5.2.
Changes
| Area | File | Change |
|---|---|---|
| Mapping | src/back/core/w3c/r2rml/R2RMLGenerator.py | sorted() on attribute_mappings.items() for deterministic output |
| Tests | tests/units/mapping/test_r2rml_generator.py | Assertions aligned with sorted attribute order |
| MCP server | src/mcp-server/server/app.py | Retry loop (3 attempts, 5/10/20 s back-off) on 502/503 in _get and _post |
| Version | pyproject.toml | Bumped to 0.5.2 |
What is NOT changed
- Registry schema — no DDL.
- All v0.5.1 features — fully intact.
- R2RML semantic content — only attribute ordering is affected; all triples and mappings are identical.
- MCP tool contracts — no API surface change; retry is fully transparent to callers.
v0.5.1
OntoBricks — Release Notes V0.5.1
Release date: 2026-06-19
Type: Patch — single bug fix
Test status: 2484 passing, 15 skipped on the unit tier; 6 new regression tests added.
Summary
v0.5.1 is a targeted patch that fixes a blocker introduced with the v0.5.0 review workflow: the "Submit for Review" action on the Validation page was permanently blocked even after a successful Digital Twin build, when that build was triggered interactively (via the Build button in the UI or the external REST endpoint POST /dtwin/build).
No schema changes. No configuration changes. No migration scripts required for new deploys.
Bug Fixed
Submit for Review blocked despite a built Digital Twin
Symptom: On the Validation page (/domain/validate), the banner "This version has never been built. Run a Digital Twin build first." appeared and the Submit for Review button was disabled — even though the Consistency-checks panel showed a green "Digital Twin built" tick.
Root cause: The two indicators on the same page read from different sources:
| Indicator | Source |
|---|---|
| Consistency-checks "Digital Twin built" ✅ | Live triple-store state (view exists + has triples) |
| Submit-for-Review gate | domain_versions.last_build column in the registry DB |
The interactive build path (_BuildPipeline._complete_task) never wrote domain.last_build to the domain_versions table. Only the scheduled build path (scheduler._persist_domain_metadata) performed that write. The Submit gate reads the registry column, found it empty, and blocked.
Fix: _BuildPipeline._complete_task() — the single success exit shared by both the UI build (build_kind="session") and the REST build (build_kind="api") — now calls a new _persist_last_build_to_registry() method immediately after recording the build run. This method:
- Resolves the domain folder and version (same logic already used for build-run tracing).
- Stamps domain.last_build with the current UTC timestamp when it is empty (API path).
- Calls RegistryService.from_context(domain, settings)._store.write_version(folder, version, domain_data) to flush the value to domain_versions.last_build.
- Is best-effort: any exception is logged as a warning and never propagates — a registry hiccup cannot break a build that otherwise succeeded.
File changed: src/back/objects/digitaltwin/_build_pipeline.py
Tests added: tests/back/core/digitaltwin/test_build_pipeline_units.py::TestPersistLastBuildToRegistry (6 cases)
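The best-effort pattern described above can be sketched as follows. This is an illustration only: `persist_last_build`, the dict-shaped `domain`, and the injected `write_version` callable are hypothetical stand-ins, not the real _persist_last_build_to_registry() or RegistryService signatures.

```python
import logging
from datetime import datetime, timezone
from typing import Callable

logger = logging.getLogger(__name__)


def persist_last_build(domain: dict, write_version: Callable[[dict], None]) -> None:
    """Stamp last_build if empty and flush it; never raise to the caller."""
    try:
        if not domain.get("last_build"):
            # Only fill the value when nothing was recorded yet.
            domain["last_build"] = datetime.now(timezone.utc).isoformat()
        write_version(domain)
    except Exception as exc:  # deliberately broad: the build already succeeded
        logger.warning("Could not persist last_build to the registry: %s", exc)
```

The broad except is the point of the design: a failed registry write is reported in the log but must not turn a successful build into a failed one.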
Upgrade Notes
New deploys (v0.5.1 from scratch)
No action required. The fix is code-only; the domain_versions schema and last_build column are unchanged from v0.5.0.
Upgrading from v0.5.0
No schema migration needed. However, any domain version that was built interactively while running v0.5.0 will have an empty domain_versions.last_build and will still be blocked at Submit for Review after the upgrade.
Two options to unblock existing affected versions:
Option A — Re-run the build (recommended, zero SQL)
On the Validation page for the affected domain + version, click the Build button again. The build re-populates the triple-store (idempotent — full rebuild) and now also writes last_build to the registry. Once the build completes, the Submit for Review button becomes available immediately.
Option B — Direct SQL patch (no rebuild needed)
If you want to unblock Submit for Review without re-running the build (e.g. the triple-store is already healthy and you do not want to re-build), connect to the registry Lakebase database and run:
-- Replace 'your_schema' with your registry schema (e.g. ontobricks_app_demo).
-- Replace 'supplychain' and '1' with your actual domain folder and version.
UPDATE your_schema.domain_versions dv
SET last_build = NOW()::text,
updated_at = NOW()
FROM your_schema.domains d
WHERE dv.domain_id = d.id
AND d.folder = 'supplychain' -- domain folder (sanitised name)
AND dv.version = '1' -- version string
AND dv.last_build = ''; -- only patch truly empty rows

To patch all versions that have a healthy build artifact but a missing last_build in one shot:
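-- Intended for versions whose last_build is empty but whose status is not DRAFT
-- (i.e. it was previously submitted or published via a scheduled build workaround).
-- Note: the statements below filter only on the empty last_build, not on status,
-- so check the status column in the SELECT output.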
-- Review the SELECT before running the UPDATE.
SELECT d.folder, dv.version, dv.status, dv.last_build
FROM your_schema.domain_versions dv
JOIN your_schema.domains d ON d.id = dv.domain_id
WHERE dv.last_build = '';
-- Once satisfied, run:
UPDATE your_schema.domain_versions dv
SET last_build = NOW()::text,
updated_at = NOW()
FROM your_schema.domains d
WHERE dv.domain_id = d.id
AND dv.last_build = '';

After the SQL update, reload the Validation page — no app restart is needed.
Changes
| Area | File | Change |
|---|---|---|
| Core fix | src/back/objects/digitaltwin/_build_pipeline.py | Added _persist_last_build_to_registry(); called from _complete_task() |
| Tests | tests/back/core/digitaltwin/test_build_pipeline_units.py | Added TestPersistLastBuildToRegistry — 6 regression tests |
What is NOT changed
- domain_versions schema — no DDL.
- Scheduled build path — scheduler._persist_domain_metadata is untouched and still the authoritative write for scheduled runs.
- Consistency-checks panel — continues to read live triple-store state (unchanged behaviour).
- All other v0.5.0 features — fully intact.
v0.5.0
OntoBricks — Release Notes V0.5.0
Release window: May–June, 2026
Test status: all changes shipped with the suite green (≥ 2532 passing on the unit/integration tiers; the full multi-tier run — unit + integration + property + live + e2e — reached ≥ 2660 passing).
Highlights
- Domain version lifecycle (DRAFT → IN-REVIEW → PUBLISHED): a proper, server-enforced state machine replaces the old single "Active" (mcp_enabled) toggle. Status gates editability (only DRAFT is editable, any DRAFT version, not just the latest), gates API/MCP data access (only PUBLISHED is served, numeric-latest wins), and is surfaced as a colour-coded badge everywhere a domain + version appears.
- Ontology Validation & Review workflow: a new Domain → Validation workspace and a cross-domain My Tasks worklist (Registry + Home) let business users run soft consistency checks, sign off through a per-domain quorum-gated review, and keep a full audit trail. Every lifecycle decision is persisted with actor, transition, and comment; admins can override the quorum and drive any transition.
- Graph Chat streaming (SSE): the agent loop now streams tool calls and the final reply token-by-token instead of blocking until the whole turn finishes — no more static "Thinking…" placeholder.
- Binary-document ingestion for ontology generation: PDFs, Office docs, and images uploaded for OWL / business-rules generation are now converted to markdown on the fly via Databricks ai_parse_document, exposed as a reusable core DocumentExtractor.
- Ontology Pitfalls as a first-class agent tool: the OWL generator now owns the validate → fix → re-validate loop through a check_owl_pitfalls tool and iterates to a 100 %-clean ontology by default; the precision score was corrected so minor warnings actually move the needle.
- Business Views overhaul: a guided New Assistant (build a view from seed entities + their ontology neighbours with a 1–3 hop control), collapse / expand entities, right-click "Hide from view", delete-the-last-view support, and an icon-only toolbar.
- Build-run tracing & Build Analytics: an append-only build_runs registry table records one immutable row per build (UI / API / scheduler), surfaced through a Registry "Build Analytics" panel and a domain-wide Audit trail.
- Graph / registry Lakebase separation: the graph DB can now live in a different Lakebase project from the registry, via a new BranchLakebaseAuth, an in-app Create Graph DB provisioner (with auto-granted Postgres superuser for managers), and a Settings → Permissions tab.
- CNS test foundations & quality engineering: a comprehensive test strategy landed — coverage gates, test factories, in-process MCP integration tests, Hypothesis property tests, an LLM-agent eval harness + CI gate, ruff + mypy baselines, pre-commit hooks, and a changelog-presence gate. The suite grew from ~1900 to ≥ 2660 cases across multiple tiers.
- Deploy simplified to a single knob: the Lakebase deploy-config surface collapsed from 13 variables to its irreducible core; a second instance now needs only DEFAULT_APP_NAME changed. deploy.sh gained colourised step logging, an ERR trap, preflight + resource-existence checks, and a --dry-run mode.
Audit Trail (Cross-Cutting Summary)
Consolidated view of the audit-trail work shipped in 0.5.0. Full detail lives in the Ontology Validation & Review Workflow and Build-Run Tracing & Analytics sections below.
- Review audit log — new append-only domain_review_events table (actor, action, from → to, comment, meta, timestamp) written by ReviewService on every submit / signoff / publish / reopen, plus a chat-style "all comments" history viewer reachable from the worklists.
- Audit trail viewer (Domain → Audit trail) — a single domain-wide feed interleaving review events (with comments) and build runs, with All / Status / Builds filter pills and a version dropdown (defaults to the current version).
- Build-run tracing (Runs) — append-only build_runs table (one immutable row per UI / API / scheduler build; "active" = most recent successful run), surfaced as a per-domain Runs tab and a Registry → Automation Build Analytics panel.
- Lifecycle attribution — direct status-dropdown changes (/domain/set-version-status) also write an audit row tagged meta.source="lifecycle"; local-dev sign-offs are attributed via a cached SCIM /Me lookup so quorum counts correctly without the proxy header.
- Schema provisioning — bootstrap-lakebase-perms.sh / make bootstrap-lakebase create domain_review_events and build_runs (+ indexes) idempotently as the schema owner.
Domain Version Lifecycle
- New per-version status DRAFT → IN-REVIEW → PUBLISHED, enforced server-side by a single source of truth (registry/version_lifecycle.py: ALLOWED_TRANSITIONS, is_editable, check_status_transition).
  - DRAFT → IN-REVIEW (admin/builder; precondition: version has been built; locks editing).
  - IN-REVIEW → DRAFT (admin/builder; re-enables editing).
  - IN-REVIEW → PUBLISHED (admin/builder).
  - PUBLISHED → DRAFT (admin only; reversible publish).
  - No direct DRAFT → PUBLISHED; new versions are always DRAFT.
- Editability is now status-only: any DRAFT version is fully editable (older DRAFT versions included); the previous "only the latest version is editable" frontend restriction was removed. PermissionMiddleware blocks mutating edits unless the session version is DRAFT (non-mutating compute / validate / generate stay open).
- API/MCP serve PUBLISHED only: find_published_version / load_published_domain_data (numeric-latest PUBLISHED, no fallback); the external /api/v1/graphql mount is strict PUBLISHED-only; DigitalTwin.resolve_domain rejects a non-PUBLISHED explicit version.
- domain_versions.status column (CHECK-constrained + indexed) with a lazy, owner-aware self-heal migration; the retired mcp_enabled toggle is left dormant.
- Colour-coded status badge wired across navbar, Registry → Browse, Domain → Versions, the Digital Twin / Ontology query headers, and the Load-Domain modal (which now shows v<n> — Draft/In Review/Published instead of "Latest / Read-Only").
- Digital Twin pages stay fully interactive on PUBLISHED/IN-REVIEW versions (read/analysis surface) — the read-only form gate excludes body[data-page="digitaltwin"]; real mutations remain server-gated.
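The transition rules listed above can be sketched as a small table-driven state machine. The names ALLOWED_TRANSITIONS, is_editable, check_status_transition and find_published_version come from the notes, but their signatures, the role strings, and the assumption that version strings are integers are all hypothetical here.

```python
from typing import Dict, Optional

DRAFT, IN_REVIEW, PUBLISHED = "DRAFT", "IN-REVIEW", "PUBLISHED"

# (current, target) -> roles allowed to perform the transition.
ALLOWED_TRANSITIONS = {
    (DRAFT, IN_REVIEW): {"admin", "builder"},
    (IN_REVIEW, DRAFT): {"admin", "builder"},
    (IN_REVIEW, PUBLISHED): {"admin", "builder"},
    (PUBLISHED, DRAFT): {"admin"},
}


def is_editable(status: str) -> bool:
    """Only DRAFT versions accept mutating edits."""
    return status == DRAFT


def check_status_transition(current: str, target: str, role: str,
                            has_build: bool = False) -> None:
    """Raise if the transition is unknown, not permitted, or unmet."""
    roles = ALLOWED_TRANSITIONS.get((current, target))
    if roles is None:
        raise ValueError(f"{current} -> {target} is not an allowed transition")
    if role not in roles:
        raise PermissionError(f"role {role!r} may not move {current} -> {target}")
    if (current, target) == (DRAFT, IN_REVIEW) and not has_build:
        raise ValueError("version must be built before it can be submitted")


def find_published_version(versions: Dict[str, str]) -> Optional[str]:
    """Return the numerically latest PUBLISHED version, with no fallback."""
    published = [v for v, status in versions.items() if status == PUBLISHED]
    # key=int assumes integer version strings such as "1", "2", "10".
    return max(published, key=int) if published else None
```

Comparing versions numerically matters once a domain has ten or more versions: a plain string comparison would rank "9" above "10".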
Ontology Validation & Review Workflow
- [Audit] New domain_review_events append-only audit table (actor, action, from → to, comment, meta, timestamp) and a stateless ReviewService orchestrator: my_tasks, review_detail, submit, signoff, publish, reopen. Approvals reset on resubmit / change-request / publish.
- New /review router: GET /my-tasks, GET /{folder}/{version}, and POST submit | signoff | publish | reopen, all resolving the caller's role against the target domain.
- Domain → Validation workspace: a visual lifecycle diagram (Draft → In Review → Published with the people involved at each stage), status banner with live quorum progress, a soft (advisory) consistency-check summary, header-mounted action buttons, and the audit timeline.
- My Tasks worklist on Registry → Review and on the Home page (revealed only when tasks exist), each handing off to the Validation workspace via a single Validate button rather than driving the workflow inline.
- Per-domain sign-off quorum: stored as a typed review_quorum column on domains (default 1), set at domain creation and editable on Domain → Information → Global. The old registry-wide global_config.review_quorum is no longer read.
- Admin quorum override: admins (app- or domain-level) can publish regardless of quorum; the override is recorded in the published event's meta and surfaced in the UI.
- Review comments everywhere: a shared ReviewModals helper adds a comment prompt to every status switch and a chat-style "all comments" history viewer reachable from the worklists.
- [Audit] Domain → Audit trail (viewer): a single domain-wide feed interleaving review events (with comments) and build runs, with All / Status / Builds filter pills and a version dropdown (defaults to the current version).
- [Audit] Attribution: direct lifecycle dropdown changes (/domain/set-version-status) now also write an audit row tagged meta.source="lifecycle"; local-dev sign-offs are attributed via a cached SCIM /Me lookup so quorum counts correctly without the proxy header.
Graph Chat — Streaming (SSE)
- run_agent() gained an on_event callback fired per tool_call / tool_result / final output; the legacy on_step string callback is preserved.
- New POST /dtwin/assistant/chat/stream SSE endpoint bridges the sync agent thread to an async generator (asyncio.Queue + run_coroutine_threadsafe), streaming step events then a final done event (reply, tool trace, usage, iterations); error events carry the exception message.
- Frontend renders a live streaming bubble (createStreamingBubble / updateStreamingBubble / finalizeStreamingBubble / errorStreamingBubble) consuming the ReadableStream and parsing SSE frames.
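The thread-to-async bridge mentioned above (asyncio.Queue + run_coroutine_threadsafe) can be sketched with the standard library alone. Everything here is illustrative: `fake_agent`, `stream_agent`, `collect_frames` and the event dictionaries are hypothetical and only mimic the step / done / error event names from the notes.

```python
import asyncio
import json
import threading
from typing import AsyncIterator, Callable, List

_DONE = object()  # sentinel marking the end of the stream


def fake_agent(on_event: Callable[[dict], None]) -> None:
    """Stand-in for a blocking agent loop that reports progress."""
    on_event({"type": "tool_call", "name": "lookup"})
    on_event({"type": "tool_result", "name": "lookup"})
    on_event({"type": "output", "reply": "hello"})


async def stream_agent(agent: Callable[[Callable[[dict], None]], None]) -> AsyncIterator[str]:
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def on_event(event: object) -> None:
        # Runs on the worker thread: hand the event to the event loop
        # and wait until it has been enqueued.
        asyncio.run_coroutine_threadsafe(queue.put(event), loop).result()

    def worker() -> None:
        try:
            agent(on_event)
        except Exception as exc:
            on_event({"type": "error", "message": str(exc)})
        finally:
            on_event(_DONE)

    threading.Thread(target=worker, daemon=True).start()

    while True:
        event = await queue.get()
        if event is _DONE:
            break
        kind = event["type"]
        name = "done" if kind == "output" else "error" if kind == "error" else "step"
        # One SSE frame: an event name, a JSON payload, and a blank line.
        yield f"event: {name}\ndata: {json.dumps(event)}\n\n"


async def collect_frames(agent: Callable[[Callable[[dict], None]], None]) -> List[str]:
    return [frame async for frame in stream_agent(agent)]
```

Calling `asyncio.run(collect_frames(fake_agent))` yields two step frames followed by one done frame. In a web framework the async generator would typically be handed to a streaming response instead of being collected into a list.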
Ontology Generation — Pitfalls Tool & Iteration Loop
- New src/agents/tools/pitfalls.py: check_owl_pitfalls agent tool returns {score, is_clean, total_warnings, warnings[…], fix_instruction}; the agent now owns the validate → fix → re-validate cycle in-loop (with a max_fix_rounds cap forcing final output when the budget is exhausted).
- Default convergence tightened: score_threshold 70 → 100, stop_on_no_critical True → False, max_fix_rounds 3 → 5 — the loop now targets a zero-warning ontology.
- Precision-score fix: replaced the size-based penalty normalisation (which rounded any non-trivi...
v0.4.0
OntoBricks — Release Notes V0.4.0
Release window: May, 2026
Test status: all changes shipped with the suite green (≥ 2003 passing, 80 skipped).
Highlights
- Lakebase GraphDB engine (full): Postgres-backed triple store via Databricks Lakebase Autoscaling completely replaces LadybugDB as the primary graph backend. Synchronization can be done in two (load) modes:
  - app_managed (direct streaming into Postgres)
  - managed_synced (Lakeflow-managed Unity Catalog synced-table pipeline)

  Both modes share the same 3-object Postgres layout (*_sync + *__app companion + union view).
- Managed Sync pipeline end-to-end: SyncedTableManager handles UC synced-table registration, Lakeflow pipeline polling, ghost control-plane state recovery, union-view creation, and all downstream Digital Twin build steps — with live progress visible in the app log and Build page.
- Registry OBX Export / Import: Export one or several registry domains to a single .obx (JSON) file with per-domain version-mode selection; import with per-domain Skip / Overwrite / Rename conflict resolution. Format-version gating ensures future backward compatibility.
- Ontology Pitfalls Detector: D2KLab's OPD (Apache-2.0) integrated as a new Ontology sidebar panel. Detects 19 structural, logical, and semantic pitfalls across four categories, powered by an async TaskManager job. ML-heavy checks can be enabled optionally via the [pitfalls] extra.
- HL7 FHIR R5 / R4B / R4 industry import: FHIR added as a fourth importable ontology alongside FIBO, CDISC, and IOF. OWL restriction-based property extraction, six domain buckets (Foundation required + Clinical/Diagnostics/Medications/Workflow/Financial), user-selectable version.
- Ontology labels throughout: Labels (with name fallback) now propagate to the KG detail panel, graph-chat agent responses, MCP tool outputs, ontology viewer link labels, and entity-type / predicate columns.
- Security: Closed urllib3 GHSA-mf9v / GHSA-qccp (#27, #28) by bumping to 2.7.0; GitPython bumped to 3.1.50 for GHSA-x2qx (CVE-2026-42215 follow-on); Mako ≥ 1.3.12, python-multipart ≥ 0.0.27 retained from v0.3.x.
- CI simplification: Sphinx HTML build removed from CI; generated artifacts gitignored; scripts/build_docs.sh retained for local on-demand builds.
Lakebase GraphDB Engine
- New pluggable graph backend (graph_engine = "lakebase") alongside LadybugDB. Selected in Settings → Graph DB.
- Process-wide Postgres connection pool with JWT-aware Lakebase auth (lakebase/pool.py).
- LakebaseFlatStore implements DDL (triple table + datatype / lang RDF columns), CRUD, VACUUM ANALYZE optimize, bounded-memory bulk_insert_iter, keyed-pagination iter_triples, _sql_relation override for physical vs logical table name handling.
- Factory-only dispatch (GraphDBFactory._create_lakebase): Ladybug and Lakebase are mutually exclusive; engine config validated on save.
- LAKEBASE_AVAILABLE capability flag; TripleStoreFactory includes Lakebase in availability detection.
- Reference DDL: src/back/core/graphdb/lakebase/schema.sql; autodoc page: docs/sphinx/api/app.core.graphdb.lakebase.rst.
App-managed companion layout
- app_managed builds now use the same 3-object Postgres layout as managed_synced:
  - *_sync — bulk warehouse data (streamed by the build pipeline)
  - *__app — companion (reasoning / materialise writes)
  - union view — single read surface
- LakebaseFlatStore.bulk_load_into_sync() for the build pipeline; _writable_table_id() always returns companion (*__app).
- drop_table() cleans up all 3 objects; optimize_table() vacuums both *_sync and *__app.
Settings — Graph DB tab
- Cascading Lakebase pickers: Project → Branch → Database → Schema (with manual pencil override). UC catalog picker triggers a UC schema picker for Managed Sync configuration.
- Health probe (GET /settings/graph-engine/lakebase-health): uses pg_catalog queries (privilege-independent) and the same resolve_lakebase_graph_schema logic as the build pipeline.
- Loading spinner while Graph DB tab data fetches.
- Live UC synced-table name preview (Settings → Managed Sync panel) showing all 4 Postgres / UC object names before the first build.
- Lakebase Objects panel: lists all user-visible schemas, tables, and views in the configured database; Drop button (owner-only, Bootstrap confirm modal) for objects the service principal owns.
- Local Graph Files panel hidden when Lakebase is selected (shows only for LadybugDB).
- GET /settings/graph-engine and GET /settings/graph-engine-config now allowed for all app users (POST remains admin-only).
- Graph engine choice persisted in global_config.config under graph_engine / graph_engine_config and mirrored into the domain registry entry.
Schema resolution
- resolve_lakebase_graph_schema: explicit graph_engine_config.schema wins; falls back to Registry Volume schema; then DEFAULT_GRAPH_SCHEMA (ontobricks_graph).
- resolve_lakebase_graph_database: explicit graph_engine_config.database wins; falls back to RegistryCfg.lakebase_database; then auth default.
- UC sync FQN (synced_uc_name) always uses the registry UC schema (RegistryCfg.schema) so the Lakeflow synced object lands in the same Unity Catalog namespace as all other registry artefacts.
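The schema fallback chain above amounts to "first non-empty candidate wins". A minimal sketch, assuming plain string inputs: the function name `resolve_graph_schema` and its parameters are hypothetical, and only the ordering and the default value ontobricks_graph come from the notes.

```python
from typing import Optional

DEFAULT_GRAPH_SCHEMA = "ontobricks_graph"


def resolve_graph_schema(explicit: Optional[str],
                         registry_schema: Optional[str]) -> str:
    """Pick the graph schema: explicit config, then registry, then default."""
    for candidate in (explicit, registry_schema):
        # Treat None, "" and whitespace-only values as "not set".
        if candidate and candidate.strip():
            return candidate.strip()
    return DEFAULT_GRAPH_SCHEMA
```

Using the same resolver in both the health probe and the build pipeline, as the Settings notes describe, keeps the two from disagreeing about which schema is in use.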
Managed Sync / Lakeflow Pipeline
- SyncedTableManager: handles UC synced-table registration, trigger (full refresh), Lakeflow pipeline polling (wait_for_completion via get_update(update_id) + idle-wait fallback), on_state_change callback for live task context updates.
- _normalize_state strips SYNCED_TABLE_ prefix from SDK enum names so terminal/in-progress sets match correctly.
- Ghost control-plane state recovery: when a previous synced-table was deleted outside the API, _is_ghost_control_plane_state detects the conflict; ensure() tries DELETE then re-CREATE, with _b/c/d fallback names if the primary slot is permanently reserved.
- ensure_synced_union_view() runs after Lakeflow materializes *_sync; schema-qualifies the _sync reference when Postgres and UC schemas differ; drops existing table with same name before creating view.
- Auto-repair: if the union view is absent (crashed previous build), repair_synced_view_if_possible recreates it from the existing *_sync / *__app objects.
- Synced table payload uses database_instance_name (project) + database_branch + logical_database_name (required by the Lakebase Synced Tables API for Autoscaling projects).
- LakebaseAuth.branch_name property parses the branch segment from the PGHOST endpoint resource path.
- Build pipeline: stores lakebase_synced_uc / lakebase_pipeline_id in task context; frontend polls /dtwin/sync/pipeline-status every 6 s; 30-second terminal-OK grace window before build is declared complete.
- SELECT DISTINCT fix in R2RML-to-Spark SQL templates to prevent duplicate triples causing Lakeflow PK violations.
- UC schema auto-created (CREATE SCHEMA IF NOT EXISTS) before synced-table registration, with conn.commit() after DDL.
Digital Twin Build — UI & UX
- Build page Graph DB card: compact in-card build note showing database.schema.table; existence badges for dtLakebaseTableExists and dtLakebaseSyncedUcExists; Lakeflow line shows catalog.schema.<physical_table>_sync.
- Build log card: engine-specific title, Lakebase Pipeline UC FQN, archive step hidden for Lakebase builds.
- "Backing up graph to registry" archive step removed entirely for Lakebase deploys (no Volume backup needed).
- Post-build session cache (_populate_session_cache): Lakebase path sets graph_has_data = final_count > 0, graph_engine, registry_archive_applicable = False; LadybugDB path unchanged.
- Per-section _ts cache timestamp prevents cross-section staleness (previously a shared clock caused "Loaded" badge + "not built" text contradiction).
- Triplestore stats cache schema version (_TS_STATS_CACHE_SCHEMA_VERSION = 2) invalidates old formatted strings on upgrade.
- Legacy local_lbug_exists / local_lbug_path field names retired; renamed to graph_has_data / graph_display throughout backend, build pipeline, and frontend.
Cockpit (Domain Validation)
- Graph DB card parity with the Build page: Database / Schema / Table / UC sync row layout; psDtLakebaseTableExists and psDtLakebaseSyncedUcExists existence badges.
- HomeService.dtwin_detail enriched with all lakebase fields (lakebase_table_exists, lakebase_database, lakebase_schema, lakebase_table, lakebase_synced_uc, lakebase_sync_mode).
- triple_count prefers dt_existence over ts_status for accuracy.
Registry OBX Export / Import
- New src/back/objects/registry/obx_format.py: CURRENT_OBX_FORMAT_VERSION = 1, upgrader-chain pattern, build_envelope(), load() with format-version validation and min_ontobricks_version gate.
- Export modes per domain: all, active, latest, selected (per-version checkboxes).
- Import: preview step shows per-domain conflict flags + suggested rename; apply step resolves each domain with skip / overwrite / rename. 50 MB upload cap.
- UI: Export modal (per-domain checkboxes + version mode selector) and Import modal (2-step: file picker → preview → decisions) on Registry → Browse page.
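For readers unfamiliar with the upgrader-chain pattern named above, here is a generic sketch. It does not reproduce obx_format.py: the field names, the two upgrader functions, and the illustrative current version of 3 are all hypothetical (the notes state that the real CURRENT_OBX_FORMAT_VERSION is 1, so no upgrader is needed yet).

```python
from typing import Callable, Dict

CURRENT_FORMAT_VERSION = 3  # hypothetical value, chosen to show a two-step chain


def _v1_to_v2(doc: dict) -> dict:
    doc = dict(doc)
    doc["domains"] = doc.pop("items", [])  # hypothetical field rename
    doc["format_version"] = 2
    return doc


def _v2_to_v3(doc: dict) -> dict:
    doc = dict(doc)
    doc.setdefault("exported_by", "unknown")  # hypothetical new field
    doc["format_version"] = 3
    return doc


# Each entry upgrades a document from version N to version N + 1.
UPGRADERS: Dict[int, Callable[[dict], dict]] = {1: _v1_to_v2, 2: _v2_to_v3}


def load(doc: dict) -> dict:
    """Validate the format version, then upgrade step by step."""
    version = doc.get("format_version")
    if not isinstance(version, int) or version < 1:
        raise ValueError("missing or invalid format_version")
    if version > CURRENT_FORMAT_VERSION:
        raise ValueError("file was written by a newer release")
    while version < CURRENT_FORMAT_VERSION:
        doc = UPGRADERS[version](doc)
        version = doc["format_version"]
    return doc
```

The appeal of the pattern is that each format change only needs one new N → N + 1 function; older files are brought forward by running every step in sequence.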
Ontology Pitfalls Detector
- src/back/core/external/pitfalls/ subpackage (vendored D2KLab OPD, Apache-2.0): OntologyPatternToolkit with 19 run_p* methods across P1–P4 categories; PitfallsService entry point serializes the rdflib Graph to temp TTL and returns grouped results.
- Optional [pitfalls] extra in pyproject.toml: sentence-transformers, scikit-learn, NLTK, SciPy. ML imports inside try/except so taxonomy constants remain accessible wit...
Hot Fix 0.3.1
OntoBricks — Release Notes V0.3.1
Release window: May 2026
Type: Hotfix
Test status: 141 cohort tests passed, 0 failed (49 test_cohort_builder.py, 31 test_dtwin_cohort.py, 34 test_cohort_models.py, 24 test_agent_cohort_tools.py, 3 test_agent_cohort_engine.py).
Highlights
- Cohort Discovery: predicate namespace fix — hasClaim (and any predicate loaded outside R2RML) now resolves correctly. The engine no longer silently misses triples whose predicate is in ontology-namespace form (#) when the lookup key is in data-namespace form (/).
- Cohort Discovery: cross-namespace predicate fallback — CohortBuilder gains a local-name alias map mirroring the SparqlTranslator approach, so predicates from a completely foreign namespace (e.g. ontobricks.com/ontology#hasclaim vs. databricks-ontology.com/Cust360Auto/hasclaim) resolve via local-name matching.
- Cohort designer UX — attribute dropdowns in the Path "where" filter and the Compatibility section are now scoped to the entity being filtered, not the full ontology property list.
Cohort Discovery — Bug Fixes
Fix 1: ontology-form predicate silent miss (CohortBuilder._outgoing_edge_index)
When data triples were inserted outside the R2RML pipeline (direct insert, W3C OWL round-trip, manual load), their predicates were stored in ontology-namespace form (…#hasClaim) while the lookup key produced by _normalized_links was in data-namespace form (…/hasClaim). This caused a silent neighbours_raw = 0 and an empty cohort.
Changes:
- src/back/core/graph_analysis/CohortBuilder.py — _outgoing_edge_index promoted from @staticmethod to instance method so it can call self._to_data_uri(pred). Every triple predicate is now normalised to data-namespace form when the index is built.
- src/front/static/query/js/query-cohorts.js — _renderTraceLink now guards on in_frontier === 0 before neighbours_raw === 0. When the starting frontier is empty the diagnostic message now reads "the starting frontier for this hop is empty — all members were eliminated before reaching it. Check the compatibility (Stage 3a) filters or the previous hop's target_class." instead of misleadingly blaming the predicate URI.
- tests/test_cohort_builder.py — 2 new tests: test_data_with_ontology_form_predicate_is_indexed_correctly, test_trace_shows_nonzero_raw_for_ontology_form_predicate.
Fix 2: cross-namespace predicate — local-name alias fallback
_to_data_uri can only bridge # ↔ / within the same base namespace. When the domain's object property URIs live in a completely different namespace (e.g. inherited shared namespace ontobricks.com/ontology#) the first fix was not sufficient.
Changes:
- src/back/core/graph_analysis/CohortBuilder.py:
  - _predicate_alias_map() — scans loaded triples, builds {local_name → canonical_data_namespace_uri}, cached in self._cache["predicate_alias"] and invalidated on triple reload.
  - _resolve_predicate(uri) — tries _to_data_uri first; if the URI is unchanged (foreign namespace) falls back to the alias map by local name.
  - _normalized_links and _normalized_compat updated to use _resolve_predicate instead of _to_data_uri.
- tests/test_cohort_builder.py — 1 new test: test_via_from_foreign_namespace_resolved_by_local_name (exact replica of the ElectricitySuspended / Cust360Auto production scenario).
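The two-stage resolution described in Fix 1 and Fix 2 can be sketched as below. This is a simplified illustration, not the CohortBuilder code: the function names, the case-insensitive local-name matching, and the http:// URI prefixes in the usage lines are assumptions.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


def local_name(uri: str) -> str:
    """Text after the last '#' or '/', lower-cased for matching."""
    return uri.replace("#", "/").rstrip("/").rsplit("/", 1)[-1].lower()


def build_alias_map(triples: Iterable[Triple]) -> Dict[str, str]:
    """Map each predicate's local name to the URI seen in the data."""
    aliases: Dict[str, str] = {}
    for _subject, predicate, _obj in triples:
        aliases.setdefault(local_name(predicate), predicate)
    return aliases


def to_data_uri(uri: str, base: str) -> str:
    """Rewrite base#name to base/name; leave other namespaces untouched."""
    onto_prefix = base.rstrip("/") + "#"
    if uri.startswith(onto_prefix):
        return base.rstrip("/") + "/" + uri[len(onto_prefix):]
    return uri


def resolve_predicate(uri: str, base: str, aliases: Dict[str, str]) -> str:
    converted = to_data_uri(uri, base)
    if converted != uri:
        return converted  # same base namespace: '#' bridged to '/'
    # Unchanged URI: fall back to the alias map by local name.
    return aliases.get(local_name(uri), uri)


base = "http://databricks-ontology.com/Cust360Auto"
aliases = build_alias_map([("s", base + "/hasClaim", "o")])
assert resolve_predicate("http://ontobricks.com/ontology#hasClaim", base, aliases) == base + "/hasClaim"
```

Matching on local names alone can be ambiguous when two namespaces define the same name; this sketch simply keeps the first URI it sees.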
Cohort Designer — UX
Attribute dropdowns scoped to entity
Property dropdowns in the Path "where" filter and the Compatibility section previously listed every property in the ontology regardless of the entity in scope. Users had to scroll through unrelated properties when filtering a specific hop.
Changes:
- src/front/static/query/js/query-cohorts.js:
  - New _dataPropsForClass(classUri) helper — filters to data properties whose domain matches the class, with a full-list fallback when ontology metadata is incomplete.
  - _renderHopWhereRow now calls _dataPropsForClass(targetClassUri).
  - _renderCompat now calls _dataPropsForClass(this.rule.class_uri).
Modified files
| File | Change |
|---|---|
| src/back/core/graph_analysis/CohortBuilder.py | Predicate normalisation fixes + alias map |
| src/front/static/query/js/query-cohorts.js | Diagnostic guard + scoped attribute dropdowns |
| tests/test_cohort_builder.py | 3 new regression tests |
Upgrade notes
No schema, API, or configuration changes. Drop-in replacement for v0.3.0.
If a cohort was returning empty results due to the hasClaim predicate mismatch, re-run Materialise — no manual data migration required.
V0.3.0
OntoBricks — Release Notes V0.3.0
Release window: May, 2026
Test status: all changes shipped with the suite green (2045 passing, 80 CloudFetch probe tests conditionally skipped in CI).
Highlights
- Cohort Discovery: new end-to-end feature for business-friendly entity grouping, with rule-based linkage (shared resources via predicates), compatibility constraints, a 6-stage deterministic engine, full Volume + Lakebase persistence, graph-triple + Unity Catalog Delta materialisation, and a natural-language Stage 2 agent that translates free-text prompts into validated `CohortRule` JSON.
- Live Digital Twin build log: the Build page now shows a real-time per-step log panel with elapsed timers, phase descriptions, a one-click export to `.log`, and honest background-archive handling. `TaskManager` gains `skip_step` / `complete_current_step` so skipped phases are labelled correctly.
- Real `/health` readiness probe: 11 checks covering filesystem, Databricks auth, SQL warehouse, registry Volume read/write, registry UC DDL permissions, Lakebase schema/table/sequence grants, and CloudFetch capability. `/health/detailed` retired. New admin Health tab in Settings surfaces the same payload in-app.
- Mapping diagnostics: source table permissions. A new third section runs a non-destructive `SELECT … LIMIT 0` against every Unity Catalog table referenced by the mapping and reports `ok` / `PERMISSION_DENIED` / `TABLE_OR_VIEW_NOT_FOUND` per table, surfacing missing grants before a build attempt.
- Knowledge Graph: right-click Expand neighbours. Any node can be expanded N hops in place without re-running a full SPARQL query. Non-blocking spinner, depth picker, camera zoom + highlight on new nodes. KG preview/expand also hardened against timeouts and 502 errors on large graphs.
- Deployment: single source of truth. `scripts/deploy.config.sh` + `app.yaml.template` replace scattered literals across `Makefile`, `deploy.sh`, and bootstrap scripts. `app.yaml` is now a generated artifact. App name `ontobricks-030` unified across all tooling.
- CloudFetch: runtime capability probe. `DatabricksAuth` detects at runtime whether the Apps sandbox can actually reach the CloudFetch storage host, sets `use_cloud_fetch` accordingly, and exposes the verdict in `/health`. A new Settings → Global toggle lets admins override the default.
- Task duration & UTC timestamps: `TaskManager` emits `START task` / `END task` log lines with compact durations, serialises `duration_seconds` in `to_dict()`, and uses UTC-aware ISO timestamps throughout to prevent timezone-drift bugs in the browser.
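The source-table permission check in the highlights can be pictured with a small sketch. It assumes a hypothetical `run_sql` callable that raises when a statement fails, and it classifies failures by looking for the error-class name in the exception text. Both are assumptions for illustration. The column list is elided in these notes, so `SELECT *` below is only a placeholder.

```python
from typing import Callable, Dict, Iterable


def classify_table_access(
    tables: Iterable[str],
    run_sql: Callable[[str], object],  # hypothetical executor; raises on failure
) -> Dict[str, str]:
    """Run a zero-row SELECT per table and report one status string per table."""
    results: Dict[str, str] = {}
    for table in tables:
        try:
            # LIMIT 0 returns no rows, so the probe never reads table data.
            run_sql(f"SELECT * FROM {table} LIMIT 0")
            results[table] = "ok"
        except Exception as exc:  # classification by message text is an assumption
            message = str(exc)
            if "PERMISSION_DENIED" in message:
                results[table] = "PERMISSION_DENIED"
            elif "TABLE_OR_VIEW_NOT_FOUND" in message:
                results[table] = "TABLE_OR_VIEW_NOT_FOUND"
            else:
                results[table] = f"error: {message}"
    return results
```

Injecting the executor keeps the sketch independent of any particular SQL connector and makes the classification easy to exercise with a fake.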
Cohort Discovery
Stage 1 — deterministic engine, UI, persistence, materialisation
- New `CohortRule` model with `validate()`, CRUD endpoints, dry-run + materialise, and a 6-stage pure-Python engine (`CohortBuilder`) that works against both Delta/Spark SQL and LadybugDB/Cypher backends.
- Materialise to graph (idempotent `DELETE` then `INSERT`, per-rule `:inCohort<RuleId>` predicate) and to Unity Catalog Delta (partitioned by `rule_id`, chunked INSERT).
- Content-hash cohort URIs (`<base>/cohort/<rule_id>/c-sha256(…)[:8]`).
- Live preview helpers: class stats, edge count, node count, sample property values, `explain_membership` (Why? / Why not?).
- 59 new tests across `test_cohort_models.py`, `test_cohort_builder.py`, `test_dtwin_cohort.py`.
- New `docs/cohort_discovery.md` with mental model, UX walkthrough, 4 worked examples, API summary.
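To illustrate the content-hash URI shape listed above, here is a minimal sketch. The notes elide what is actually hashed, so this example assumes the input is the sorted list of member URIs. The function name and that choice of input are hypothetical.

```python
import hashlib
from typing import Iterable


def cohort_uri(base: str, rule_id: str, member_uris: Iterable[str]) -> str:
    """Build <base>/cohort/<rule_id>/c-<first 8 hex chars of a SHA-256 digest>."""
    # Sorting makes the digest independent of member order (an assumption
    # about what "content hash" means here).
    canonical = "\n".join(sorted(member_uris))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]
    return f"{base.rstrip('/')}/cohort/{rule_id}/c-{digest}"
```

Under this assumption the same set of members always yields the same URI, which fits naturally with an idempotent `DELETE` then `INSERT` materialisation.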
Stage 2 — NL agent for rule generation
- `agents/agent_cohort/`: six read-only tools (`list_classes`, `list_properties_of`, `count_class_members`, `sample_values_of`, `propose_rule`, `dry_run`) wired to Stage 1 endpoints.
- One-shot agent loop with a 10-iteration cap. It never writes; saving still goes through the Builder-protected endpoint.
- UI: Describe tab (NL prompt + agent trace with tool calls, durations, iterations) auto-switches to Build rule tab when a rule is proposed.
- Fix: `list_properties_of` was returning empty arrays when `rdfs:domain` was stored as a local name or property URIs were missing. Resolved with `_domain_matches` + `_ensure_uri` helpers and a fallback to the full object-property list.
- 22 new tests across `test_agent_cohort_tools.py` (19) and `test_agent_cohort_engine.py` (3).
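The `list_properties_of` fix above amounts to a tolerant domain comparison plus a fallback. The sketch below shows one way that could look. The helper names mirror the notes, but their bodies, the dictionary shape of a property, and the base namespace are assumptions.

```python
from typing import Dict, List, Optional

BASE_NS = "https://example.org/ontology#"  # hypothetical base namespace


def local_name(uri: str) -> str:
    """Return the part after the last '#' or '/'."""
    return uri.rsplit("#", 1)[-1].rsplit("/", 1)[-1]


def ensure_uri(value: str, base: str = BASE_NS) -> str:
    """Promote a bare local name to a full URI; leave full URIs untouched."""
    return value if "://" in value else base + value


def domain_matches(domain: Optional[str], class_uri: str) -> bool:
    """True when the stored domain equals the class URI or shares its local name."""
    if not domain:
        return False
    return domain == class_uri or local_name(domain) == local_name(class_uri)


def list_properties_of(class_uri: str, properties: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return properties whose domain matches; fall back to all of them if none do."""
    normalised = [
        {**prop, "uri": ensure_uri(prop.get("uri") or prop["name"])}
        for prop in properties
    ]
    matched = [p for p in normalised if domain_matches(p.get("domain"), class_uri)]
    return matched or normalised
```

The fallback trades precision for usability: when ontology metadata is incomplete, the agent still sees candidate properties instead of an empty list.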
Cohort designer — UX refinements
- Three-tab layout (Describe / Build rule / Preview) replaces the full-width drawer; the Preview tab carries a live cohort-count badge.
- Saved rules pane promoted to a persistent right/left rail, always reachable regardless of active tab.
- Dependent dropdowns in the "Link members" section: the `via` property narrows to predicates whose `domain` is the source class and `range` is the chosen shared class; falls back to the full list when ontology metadata is incomplete.
- camelCase rule name enforcement: live input sanitisation, paste-to-camelCase, `_isValidRuleName` validator, `_toCamelCase` helper applied to agent-generated names. `id = label` (no more slug fork).
- Rule-scoped predicate and UC table: membership triples use `:inCohort<RuleId>`; Auto-pick proposes `cohorts_<snake_rule_name>`. Both the graph-triples hint and the UC-table hint in the Configure-outputs modal now show the actual predicate / table name for the active rule.
- Clickable entity badge in the Preview pane: each cohort member renders as a pill that links to the Sigma graph focused on that node. URI is demoted to inline parenthetical muted text.
- Configure-outputs modal, visible feedback: `Auto-pick` and `Test write access` were silently swallowing errors. Both are now four-state (idle → working → success | error) with inline status lines, spinner, toast, and error-envelope surfacing.
- Clearer step labels in the designer: "Link members via a shared entity" (was "When are two members linked?"), "Conditions every member must satisfy" (was "Compatibility policies"). Terminology switched from "class" to "entity" throughout user-facing strings.
- Cohort explain fix, namespace drift: `CohortBuilder._members_of_class` now checks all URI variants (ontology form, data form, raw) so `explain_membership` works regardless of which normalisation path the loader used. Enhanced "not in class" diagnostics report typed-as, untyped, or URI not found with actionable advice.
- Rule summary card: removed the redundant `rule_id` chip (equal to `label` for camelCase names); fixed binary UC/graph output display to enumerate all four states (graph + UC / graph only / UC only / no outputs).
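The naming rules in this list (camelCase rule names, `cohorts_<snake_rule_name>` tables) can be sketched in a few lines. The actual helpers live in the front-end JavaScript; the Python below is an illustrative re-statement under assumed rules (ASCII letters and digits only, lower-case first letter), not a port of that code.

```python
import re


def to_camel_case(text: str) -> str:
    """Collapse free text into camelCase, e.g. for pasted or agent-generated names."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not words:
        return ""
    first, rest = words[0], words[1:]
    return first[0].lower() + first[1:] + "".join(w[0].upper() + w[1:] for w in rest)


def is_valid_rule_name(name: str) -> bool:
    """Assumed rule: starts with a lower-case letter, then letters or digits only."""
    return re.fullmatch(r"[a-z][A-Za-z0-9]*", name) is not None


def to_snake_case(name: str) -> str:
    """Insert '_' before each interior capital, then lower-case everything."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def uc_table_name(rule_name: str) -> str:
    """Derive the suggested Unity Catalog table name for a rule."""
    return f"cohorts_{to_snake_case(rule_name)}"
```

For example, `to_camel_case("high value customers")` gives `highValueCustomers`, and `uc_table_name("highValueCustomers")` gives `cohorts_high_value_customers`.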
Digital Twin Build
- Live build log panel (`#syncBuildLogCard`): appears on Build click, grows row-by-row with icon, description, live sub-message, and per-step elapsed timer. Step labels rewritten to plain English ("Preparing mappings…", "Detecting what changed since last build", etc.).
- `TaskManager.skip_step()` marks a step `skipped` and advances `current_step` so the label array stays aligned with execution when phases are conditionally bypassed (e.g. "Detecting what changed" on first build, "Checking source tables" on forced full rebuild).
- Export build log: one-click export to `digital-twin-build_<timestamp>.log` (plain text with a header block, per-step table, and result block). Button enabled from the first poll onwards.
- Fast gzip + background archive: `GraphSyncService.sync_to_volume` now uses `compresslevel=1` (≈ 6–10× faster than the previous level 9). For session builds the Volume upload runs in a daemon thread; the build task completes immediately after the snapshot step with the note "Registry backup continues in the background."
- Archive checkbox: new "Archive graph to registry" checkbox on the Build page (default on); when off, the archive step is marked skipped with a plain-English reason.
- Honest timing: archive row shows "Continues after build" with a tooltip instead of a misleading 0ms duration when the upload is backgrounded.
- Background archive task tracked as a separate `registry_archive` `TaskManager` task with its own navbar hourglass entry.
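The fast-gzip and background-archive items above combine two standard-library techniques. The sketch below shows them together. The `upload` callable and the function names are hypothetical stand-ins, and the real `GraphSyncService` is not shown in these notes.

```python
import gzip
import threading
from typing import Callable


def compress_snapshot(data: bytes) -> bytes:
    """Compress with the fastest gzip level, trading archive size for speed."""
    return gzip.compress(data, compresslevel=1)


def archive_in_background(data: bytes, upload: Callable[[bytes], None]) -> threading.Thread:
    """Start the compress-and-upload work in a daemon thread and return immediately."""

    def _worker() -> None:
        upload(compress_snapshot(data))

    # daemon=True means the thread will not keep the process alive on shutdown.
    thread = threading.Thread(target=_worker, name="registry_archive", daemon=True)
    thread.start()
    return thread
```

The caller can report the build as complete right after `archive_in_background` returns and only `join()` the thread if it needs to know that the upload finished.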
Task Manager & Notifications
- `Task.duration_seconds()`: computed live for running/pending, frozen at `completed_at − started_at` for terminal tasks; serialised in `to_dict()`.
- `TaskManager` lifecycle log lines unified to `START task <id> [<type>] — <name>` / `END task <id> [<type>] completed|failed|cancelled in <duration>`.
- `_format_duration` produces compact strings: `450ms`, `2.40s`, `1m 23.5s`, `1h 5m`.
- Navbar hourglass: running rows show a live 1 s ticking elapsed time; bell toasts append `(in 1m 23s)`.
- All task/step timestamps emitted in UTC-aware ISO format (`+00:00`); `duration_seconds()` tolerates mixed naive/aware legacy timestamps.
- Removed the broken "Open" link from terminal task completion toasts.
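A compact formatter that reproduces the four sample strings listed above, together with a duration helper that tolerates mixed naive/aware timestamps, might look like the sketch below. The thresholds between formats and the choice to treat naive timestamps as UTC are assumptions, not details confirmed by these notes.

```python
from datetime import datetime, timezone
from typing import Optional


def format_duration(seconds: float) -> str:
    """Render a duration compactly: milliseconds, seconds, minutes, or hours."""
    if seconds < 1:
        return f"{round(seconds * 1000)}ms"
    if seconds < 60:
        return f"{seconds:.2f}s"
    if seconds < 3600:
        minutes, rest = divmod(seconds, 60)
        return f"{int(minutes)}m {rest:.1f}s"
    hours, rest = divmod(int(seconds), 3600)
    return f"{hours}h {rest // 60}m"


def as_utc(ts: datetime) -> datetime:
    """Treat naive timestamps as UTC (an assumption); leave aware ones untouched."""
    return ts if ts.tzinfo is not None else ts.replace(tzinfo=timezone.utc)


def duration_seconds(started_at: datetime, completed_at: Optional[datetime] = None) -> float:
    """Elapsed seconds; uses the current UTC time while the task is still running."""
    end = as_utc(completed_at) if completed_at is not None else datetime.now(timezone.utc)
    return (end - as_utc(started_at)).total_seconds()
```

Normalising both ends to aware datetimes before subtracting avoids the `TypeError` Python raises when a naive and an aware datetime are mixed.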
Health & Observability
/health readiness probe (replaces static {"status":"healthy"})
11 probes, each timed and wrapped in _safely_run so one failure never breaks the overall response:
| Probe | What it checks |
|---|---|
| `runtime` | Python + OntoBricks version |
| `filesystem.tmp` | Write sentinel + `shutil.disk_usage` thresholds (warn < 1 GB, error < 100 MB) |
| `filesystem.session_dir` | Same check against the session directory |
| `filesystem.log_dir` | Same check against the log directory |
| `databricks.auth` | `has_valid_auth` + OAuth token mint in App mode |
| `databricks.warehouse` | `SELECT 1` against the configured warehouse |
| `databricks.cloudfetch` | Real SQL probe with `use_cloud_fetch=True`; cached 5 min |
| `registry.cfg` | Resolved catalog / schema / volume |
| `registry.volume_read` | `VolumeFileService.list_directory` on the registry volume |
| `registry.volume_write` | Write + delete of a sentinel file via the Files API |
| `registry.uc_schema_ddl` | `CREATE OR REP... |
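The probe pattern described for `/health` (each check timed and isolated so one failure cannot break the response) can be sketched as below, using the disk-space thresholds from the table. The function names, the result dictionary shape, and the use of binary units for GB/MB are assumptions made for illustration.

```python
import shutil
import tempfile
import time
from typing import Callable, Dict

MB = 1024 ** 2  # binary units are an assumption; the notes only say "GB" and "MB"
GB = 1024 ** 3


def safely_run(name: str, probe: Callable[[], Dict[str, object]]) -> Dict[str, object]:
    """Run one probe, time it, and convert any exception into an error result."""
    started = time.perf_counter()
    try:
        result = dict(probe())
    except Exception as exc:
        result = {"status": "error", "detail": str(exc)}
    result["name"] = name
    result["duration_ms"] = round((time.perf_counter() - started) * 1000, 1)
    return result


def classify_free_space(free_bytes: int) -> str:
    """Apply the thresholds from the table: error below 100 MB, warn below 1 GB."""
    if free_bytes < 100 * MB:
        return "error"
    if free_bytes < 1 * GB:
        return "warn"
    return "ok"


def disk_probe(path: str) -> Dict[str, object]:
    """Report free space for the filesystem containing `path`."""
    free = shutil.disk_usage(path).free
    return {"status": classify_free_space(free), "free_bytes": free}
```

A caller would collect `safely_run("filesystem.tmp", lambda: disk_probe(tempfile.gettempdir()))` alongside the other probes and derive the overall status from the individual results.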
v0.2.1
HF0.2.1
v0.2.0
OntoBricks — Release Notes V0.2.0
Release window: May, 2026
Test status: all changes shipped with the suite green (≥ 1892 passing).
Highlights
- New end-to-end Permissions model: app-level perms come from Databricks, domain-level perms from the Teams matrix, with a 4-step refactor (declarative guards, body `data-*` attrs, CSS gating) and a hardened Viewer / read-only role across every ontology and mapping widget.
- Graph Chat (formerly Digital Twin): natural-language chat with the knowledge graph, now session-aware and stable behind the deployed reverse proxy.
- New in-app Help Center accessible from the navbar, including a Starter Guide, Workflow / FAQ accordions, a Data Access / GraphDB engine map (LadybugDB as default), and a refreshed About page.
- Lakebase registry backend wired end-to-end.
- Databricks dev sandbox bundle (`databricks.yml`): deploys `ontobricks-020` (main UI) and `mcp-ontobricks` (MCP); targets `dev` (Volume-only) and `dev-lakebase` (Volume + Lakebase Autoscaling `postgres` binding). Lakebase variables include `lakebase_database_resource_segment` (the `db-…` suffix from `databricks postgres list-databases … -o json`, not the Postgres `datname`) and `lakebase_registry_schema` (keep in sync with `LAKEBASE_SCHEMA` in `app.yaml`).
- Deploy & bootstrap scripts aligned with the bundle: `scripts/deploy.sh` uses `APP_NAME=ontobricks-020`; `make bootstrap-perms` / `make bootstrap-lakebase` and the underlying shell scripts default to `ontobricks-020`, `mcp-ontobricks`, and the documented Lakebase project / schema-grant flow.
- Major domain-switching robustness improvements (no more stale state, full-page loading overlay everywhere, including cross-domain bridges).
- Security: patched two GitPython advisories (GHSA-rpm5-65cw-6hj4 and GHSA-x2qx-6953-8485 / CVE-2026-42284) by pinning `gitpython>=3.1.47` via uv constraint. The vulnerability is transitive only, with no code-path exposure.
Permissions & multi-tenant access control
- App-level permissions now sourced from Databricks; domain-level permissions handled by the Teams matrix.
- First-deploy bootstrap detects and fixes the app SP self-permission chicken-and-egg situation.
- Viewer / read-only role:
- Cascaded to all ontology and mapping widgets.
- OWL preview no longer fails with "Unknown error" in read-only mode.
- Belt-and-suspenders contextmenu blocker on design surfaces.
- Gates data-source reset and all ontology / mapping imports.
- Fixed Registry → Teams sub-menu leaking to non-admin users in the top navbar.
- New three-level permission matrix tests + OWL endpoint contract tests.
- 4-step permissions refactor: declarative guards, body `data-*` attributes, CSS-based gating.
- Code-review fix-up: navbar role-badge inline CSS moved into `permissions.css`.
Graph Chat (renamed from Digital Twin)
- Natural-language chat over the knowledge graph.
- Forwards `X-Forwarded-*` headers on loopback to fix a deployed 302 redirect issue.
- All tools now use session-aware internal routes.
- Code-review hardening pass: clean layering, consistent error handling, deduplication, class-first refactor.
In-app Help Center
- New navbar Help icon opens a modal with comprehensive documentation.
- Refreshed About page to reflect the current product scope.
- Visual pass:
- Palette switched from blue to red/black (solid red, no gradients).
- OntoBricks logo used in title and welcome hero.
- Modal height locked (no resize when switching menu items).
- Fixed double vertical scrollbar in tall sections (Starter Guide).
- Removed horizontal scroll on the Welcome pipeline.
- Removed grey borders on Workflow / FAQ accordions.
- Starter Guide:
- Added optional "Import Documents" step.
- Rewrote the mapping step (manual or Auto-Map).
- Added Data Access engine-map documentation, then generalized it to GraphDB (LadybugDB as default engine).
Domain switching
- Fixed stale session state leaking between domain switches: `DomainSession.import_from_file` now fully resets ontology, assignment, design layout, domain info, metadata, and triplestore before overlay.
- Full-page "Loading {domain}…" overlay now appears for:
- Graph switcher modal.
- Bridge-based switches via URL parameters.
- Cross-Domain Bridge links going through `/resolve` (server-side redirect).
UI / UX fixes
- Build sub-menu: fixed unreadable "Mapping" stale-indicator badge.
- Build sub-menu: stopped reporting "Loaded" for the Graph DB digital twin when nothing had actually been built.
- Sidebar: fixed the "Teams" icon misalignment.
- Cockpit: the Active Version tile now reflects the version exposed via API/MCP (the one set in Registry → Browse), not merely the latest version on disk, with a `(not loaded)` hint when the loaded version differs. `is_active` keeps its legacy `is_latest` meaning so the read-only body class still gates writes correctly.
- Navbar: the Domain name and version in the top navbar now refresh reliably after every domain mutation (new domain, load from registry, save / rename, version switch / create / rollback, file import). The `/navbar/state` `sessionStorage` cache (15 s TTL) was previously surviving `window.location.reload()`, so the navbar could display the previous domain identity for up to 15 s. Every mutation flow now invalidates the cache before navigating; in-place edits (e.g. saving Domain Information) re-fetch the navbar state immediately.
- Domain → Versions: the API/MCP "Active" control is no longer a toggle on this page; it is shown as a read-only badge. Changing the active version is done only from Registry → Browse (consistent with registry-centric operations).
- Domain creation: Save to UC is now blocked when the chosen Domain Name already exists in the registry. The duplicate-name check (`/domain/check-name`) was already running on every keystroke of the name field, but its result was only advisory: the navbar's Save action still POSTed and the user only saw the conflict after a round-trip. The Save flow now re-runs the check synchronously, refuses with a clear notification, and focuses the offending field.
Documentation
- README, docs/features.md, docs/INFO.md, docs/user-guide.md, docs/get-started.md, docs/README.md, and docs/mcp.md updated so operator-facing text matches the above: Ontology Designer, Domain Cockpit Active Version vs loaded vs latest, Registry → Browse for MCP/API active version, new-domain loading overlay, Digital Twin path refresh on committed name/version changes, duplicate-name guard, and navbar identity refresh.
- `docs/deployment.md` rewritten for the current DAB: `dev` / `dev-lakebase` targets, correct `bundle deployment bind` / `bundle run` resource keys and app names, `scripts/deploy.sh` flags (no legacy `--all` / `--mcp-only`), Lakebase variable summary, Step 5b for `bootstrap-lakebase-perms.sh`, full deployment checklist, MCP and troubleshooting sections, and §9 DAB reference aligned with the `Makefile`.
- README Lakebase paragraph: documents `lakebase_database_resource_segment` and the `list-databases` lookup pattern.
Tasks & Notifications
- Tasks panel now shows only currently running tasks; finished tasks are moved to the Notifications drawer.
Backend & Databricks Apps bundle (operator-facing)
- Lakebase registry backend wired end-to-end (runtime + optional Volume toggle unchanged).
- `databricks.yml`: `ontobricks_dev_app` / `mcp_ontobricks_app` resource keys; workspace app names `ontobricks-020` and `mcp-ontobricks`; `dev-lakebase` target adds the Apps `postgres` resource whose `database` path ends with `lakebase_database_resource_segment` (`db-…` from the Postgres API `name` field).
- `scripts/deploy.sh`: default target `dev-lakebase`; `APP_NAME` set to `ontobricks-020` so post-deploy `bootstrap-app-permissions.sh` and `bootstrap-lakebase-perms.sh` resolve the correct service principal.
- `scripts/bootstrap-lakebase-perms.sh`: default Lakebase project `ontobricks-app`, default Postgres DB `ontobricks_registry` (dedicated `datname` aligned with the bundle bind), schema `ontobricks_registry`; default grantees `ontobricks-020` and `mcp-ontobricks`. Use `-d databricks_postgres` if the registry schema still lives in the shared default DB. Retarget with `-i` / `-d` / `-s` / `-a` when your workspace differs.
- `scripts/bootstrap-app-permissions.sh`: default app list `ontobricks-020 mcp-ontobricks` (matches the bundle).
Security
- Patched two GitPython advisories pulled in transitively via `mlflow-skinny`:
  - GHSA-rpm5-65cw-6hj4: command injection via `upload_pack` / `receive_pack` kwargs on `Repo.clone_from`, `Remote.fetch`, `Remote.pull`, `Remote.push` (affected `[3.1.30, 3.1.47)`).
  - GHSA-x2qx-6953-8485 / CVE-2026-42284: argument injection via `multi_options` `shlex.split` bypass in `_clone()` / `Submodule.update` (affected `<= 3.1.44`).
- Both fixed by adding `gitpython>=3.1.47` to `[tool.uv].constraint-dependencies` in `pyproject.toml`; lockfile bumped `gitpython 3.1.46 → 3.1.47`. OntoBricks itself does not import `git` anywhere, so there is no code-path exposure; this only closes the SCA finding on the lockfile / installed env.
Upgrade notes
- Databricks Apps sandbox name: if you still point scripts or docs at `ontobricks-dev`, switch to `ontobricks-020` (the name in `databricks.yml` for `ontobricks_dev_app`) for `databricks apps get`, `bootstrap-app-permissions.sh`, and `bootstrap-lakebase-perms.sh -a …`, or pass `-a` explicitly.
- Lakebase bundle variables: the monolithic branc...
V0.1.1
This is the first official release.