Releases: databrickslabs/ontobricks

v0.5.2

24 Jun 15:26
0f63466

v0.5.2 Pre-release

OntoBricks — Release Notes V0.5.2

Release date: 2026-06-22
Type: Patch — two targeted fixes
Test status: 2283 passing, 15 skipped (unit tier).


Summary

v0.5.2 is a stability patch with two independent fixes:

  1. Deterministic R2RML serialization — mapping exports are now stable across runs, making diff-based version control and automated pipelines reliable.
  2. MCP server cold-start resilience — the MCP server no longer fails on the first tool call when Databricks Apps is waking from idle; it retries transparently.

No schema changes. No configuration changes. No migration scripts required.


Fixes

1. Deterministic R2RML serialization

Symptom: Two successive exports of the same domain mapping produced R2RML files with attributes in a different order, making version diffs noisy and breaking any pipeline that checksummed the output.

Root cause: R2RMLGenerator._add_entity_mapping() iterated over attribute_mappings (a plain dict) without sorting, so insertion order determined output order — non-deterministic across Python versions and dict mutations.

Fix: attribute_mappings.items() is now wrapped in sorted(), producing stable alphabetical order.

File changed: src/back/core/w3c/r2rml/R2RMLGenerator.py
Tests updated: tests/units/mapping/test_r2rml_generator.py — assertions updated to match sorted order.
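
The effect of the fix can be illustrated with a minimal sketch. The function name and the output line format below are hypothetical stand-ins and are not taken from R2RMLGenerator; only the use of sorted() on attribute_mappings.items() comes from the notes above.

```python
def render_attribute_lines(attribute_mappings: dict) -> list:
    """Return one line per attribute, ordered by attribute name.

    Hypothetical stand-in for the generator's attribute loop.
    """
    lines = []
    # sorted() on (key, value) pairs orders by key, so the output no longer
    # depends on the order in which the dict was populated.
    for attribute, column in sorted(attribute_mappings.items()):
        lines.append(f"{attribute} -> {column}")
    return lines


# Two dicts with the same content but different insertion order
# now serialize identically.
first = render_attribute_lines({"name": "NAME", "age": "AGE"})
second = render_attribute_lines({"age": "AGE", "name": "NAME"})
assert first == second
```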


2. MCP server retry on 502/503 (Databricks Apps cold-start)

Symptom: The first MCP tool call after an idle period failed immediately with a 502 or 503 error. The MCP client received a hard failure instead of waiting for the app to wake up.

Root cause: _get and _post in the MCP server called raise_for_status() immediately on any non-2xx response, including transient 502/503 responses emitted by the Databricks Apps proxy during cold-start.

Fix: Both _get and _post now retry up to 3 times on 502/503, with progressive back-off (5 s → 10 s → 20 s). Each retry is logged at WARNING level. If all three retries are exhausted the final response still raises via raise_for_status().

File changed: src/mcp-server/server/app.py
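
A rough sketch of the retry behaviour described above, assuming one initial attempt followed by up to three retries. The helper name call_with_retry, the send callable, and the injectable sleep parameter are illustrative assumptions; the actual _get and _post code is not shown in these notes.

```python
import logging
import time
from typing import Callable, Protocol

logger = logging.getLogger(__name__)

RETRY_STATUSES = {502, 503}
BACKOFF_SECONDS = (5, 10, 20)  # one delay per retry, as listed in the notes


class Response(Protocol):
    status_code: int

    def raise_for_status(self) -> None: ...


def call_with_retry(
    send: Callable[[], Response],
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); on 502/503 wait and call again, up to three retries."""
    response = send()
    for attempt, delay in enumerate(BACKOFF_SECONDS, start=1):
        if response.status_code not in RETRY_STATUSES:
            break
        logger.warning(
            "HTTP %s, retry %d/%d in %s s",
            response.status_code, attempt, len(BACKOFF_SECONDS), delay,
        )
        sleep(delay)
        response = send()
    # Still raises if the last response is an error, including a final 502/503.
    response.raise_for_status()
    return response
```

Passing sleep as a parameter keeps the sketch testable without real waiting; a caller would normally leave the default.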


Upgrade Notes

New deploys (v0.5.2 from scratch)

No action required. Both fixes are code-only with no schema or configuration changes.

Upgrading from v0.5.1

No migration needed. Deploy the new app bundle — both fixes take effect immediately on restart.

Upgrading from v0.5.0

Apply the v0.5.1 upgrade steps first (no schema migration required), then deploy v0.5.2.


Changes

| Area | File | Change |
| --- | --- | --- |
| Mapping | src/back/core/w3c/r2rml/R2RMLGenerator.py | sorted() on attribute_mappings.items() for deterministic output |
| Tests | tests/units/mapping/test_r2rml_generator.py | Assertions aligned with sorted attribute order |
| MCP server | src/mcp-server/server/app.py | Retry loop (3 attempts, 5/10/20 s back-off) on 502/503 in _get and _post |
| Version | pyproject.toml | Bumped to 0.5.2 |

What is NOT changed

  • Registry schema — no DDL.
  • All v0.5.1 features — fully intact.
  • R2RML semantic content — only attribute ordering is affected; all triples and mappings are identical.
  • MCP tool contracts — no API surface change; retry is fully transparent to callers.

v0.5.1

19 Jun 12:57
2b1eac4

v0.5.1 Pre-release

OntoBricks — Release Notes V0.5.1

Release date: 2026-06-19
Type: Patch — single bug fix
Test status: 2484 passing, 15 skipped on the unit tier; 6 new regression tests added.


Summary

v0.5.1 is a targeted patch that fixes a blocker introduced with the v0.5.0 review workflow: the "Submit for Review" action on the Validation page was permanently blocked even after a successful Digital Twin build, when that build was triggered interactively (via the Build button in the UI or the external REST endpoint POST /dtwin/build).

No schema changes. No configuration changes. No migration scripts required for new deploys.


Bug Fixed

Submit for Review blocked despite a built Digital Twin

Symptom: On the Validation page (/domain/validate), the banner "This version has never been built. Run a Digital Twin build first." appeared and the Submit for Review button was disabled — even though the Consistency-checks panel showed a green "Digital Twin built" tick.

Root cause: The two indicators on the same page read from different sources:

| Indicator | Source |
| --- | --- |
| Consistency-checks "Digital Twin built" ✅ | Live triple-store state (view exists + has triples) |
| Submit-for-Review gate | domain_versions.last_build column in the registry DB |

The interactive build path (_BuildPipeline._complete_task) never wrote domain.last_build to the domain_versions table. Only the scheduled build path (scheduler._persist_domain_metadata) performed that write. The Submit gate read the registry column, found it empty, and blocked the action.

Fix: _BuildPipeline._complete_task() — the single success exit shared by both the UI build (build_kind="session") and the REST build (build_kind="api") — now calls a new _persist_last_build_to_registry() method immediately after recording the build run. This method:

  1. Resolves the domain folder and version (same logic already used for build-run tracing).
  2. Stamps domain.last_build with the current UTC timestamp when it is empty (API path).
  3. Calls RegistryService.from_context(domain, settings)._store.write_version(folder, version, domain_data) to flush the value to domain_versions.last_build.
  4. Is best-effort: any exception is logged as a warning and never propagates — a registry hiccup cannot break a build that otherwise succeeded.

File changed: src/back/objects/digitaltwin/_build_pipeline.py
Tests added: tests/back/core/digitaltwin/test_build_pipeline_units.py::TestPersistLastBuildToRegistry (6 cases)
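
The "stamp if empty, flush, never raise" behaviour in steps 2 to 4 can be sketched as follows. This is an illustration under stated assumptions: persist_last_build and the write_version callable are hypothetical stand-ins for _persist_last_build_to_registry() and the registry store call, and the domain is modelled as a plain dict.

```python
import logging
from datetime import datetime, timezone
from typing import Callable

logger = logging.getLogger(__name__)


def persist_last_build(domain: dict, write_version: Callable[[dict], None]) -> None:
    """Stamp last_build when empty, flush to the registry, and never raise."""
    try:
        if not domain.get("last_build"):
            # Only fill the value in when the caller has not set it already.
            domain["last_build"] = datetime.now(timezone.utc).isoformat()
        write_version(domain)
    except Exception as exc:
        # Best-effort by design: a registry failure must not fail the build.
        logger.warning("Could not persist last_build: %s", exc)
```

Catching the broad Exception is deliberate in this sketch, because the build has already succeeded by the time this step runs.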


Upgrade Notes

New deploys (v0.5.1 from scratch)

No action required. The fix is code-only; the domain_versions schema and last_build column are unchanged from v0.5.0.

Upgrading from v0.5.0

No schema migration needed. However, any domain version that was built interactively while running v0.5.0 will have an empty domain_versions.last_build and will still be blocked at Submit for Review after the upgrade.

Two options to unblock existing affected versions:

Option A — Re-run the build (recommended, zero SQL)

On the Validation page for the affected domain + version, click the Build button again. The build re-populates the triple-store (idempotent — full rebuild) and now also writes last_build to the registry. Once the build completes, the Submit for Review button becomes available immediately.

Option B — Direct SQL patch (no rebuild needed)

If you want to unblock Submit for Review without re-running the build (e.g. the triple-store is already healthy and you do not want to re-build), connect to the registry Lakebase database and run:

-- Replace 'your_schema' with your registry schema (e.g. ontobricks_app_demo).
-- Replace 'supplychain' and '1' with your actual domain folder and version.
UPDATE your_schema.domain_versions dv
SET    last_build  = NOW()::text,
       updated_at  = NOW()
FROM   your_schema.domains d
WHERE  dv.domain_id = d.id
  AND  d.folder     = 'supplychain'   -- domain folder (sanitised name)
  AND  dv.version   = '1'             -- version string
  AND  dv.last_build = '';            -- only patch truly empty rows

To patch all versions that have a healthy build artifact but a missing last_build in one shot:

-- Patches every version whose last_build is empty, whatever its status
-- (the WHERE clauses below filter on last_build only, not on status).
-- Review the SELECT before running the UPDATE.
SELECT d.folder, dv.version, dv.status, dv.last_build
FROM   your_schema.domain_versions dv
JOIN   your_schema.domains         d  ON d.id = dv.domain_id
WHERE  dv.last_build = '';

-- Once satisfied, run:
UPDATE your_schema.domain_versions dv
SET    last_build = NOW()::text,
       updated_at = NOW()
FROM   your_schema.domains d
WHERE  dv.domain_id = d.id
  AND  dv.last_build = '';

After the SQL update, reload the Validation page — no app restart is needed.


Changes

| Area | File | Change |
| --- | --- | --- |
| Core fix | src/back/objects/digitaltwin/_build_pipeline.py | Added _persist_last_build_to_registry(); called from _complete_task() |
| Tests | tests/back/core/digitaltwin/test_build_pipeline_units.py | Added TestPersistLastBuildToRegistry — 6 regression tests |

What is NOT changed

  • domain_versions schema — no DDL.
  • Scheduled build path — scheduler._persist_domain_metadata is untouched and still the authoritative write for scheduled runs.
  • Consistency-checks panel — continues to read live triple-store state (unchanged behaviour).
  • All other v0.5.0 features — fully intact.

v0.5.0

17 Jun 12:13
f1706e4

v0.5.0 Pre-release

OntoBricks — Release Notes V0.5.0

Release window: May–June, 2026
Test status: all changes shipped with the suite green (≥ 2532 passing on the unit/integration tiers; the full multi-tier run — unit + integration + property + live + e2e — reached ≥ 2660 passing).


Highlights

  • Domain version lifecycle (DRAFT → IN-REVIEW → PUBLISHED): a proper, server-enforced state machine replaces the old single "Active" (mcp_enabled) toggle. Status gates editability (only DRAFT is editable, any DRAFT version, not just the latest), gates API/MCP data access (only PUBLISHED is served, numeric-latest wins), and is surfaced as a colour-coded badge everywhere a domain + version appears.
  • Ontology Validation & Review workflow: a new Domain → Validation workspace and a cross-domain My Tasks worklist (Registry + Home) let business users run soft consistency checks, sign off through a per-domain quorum-gated review, and keep a full audit trail. Every lifecycle decision is persisted with actor, transition, and comment; admins can override the quorum and drive any transition.
  • Graph Chat streaming (SSE): the agent loop now streams tool calls and the final reply token-by-token instead of blocking until the whole turn finishes — no more static "Thinking…" placeholder.
  • Binary-document ingestion for ontology generation: PDFs, Office docs, and images uploaded for OWL / business-rules generation are now converted to markdown on the fly via Databricks ai_parse_document, exposed as a reusable core DocumentExtractor.
  • Ontology Pitfalls as a first-class agent tool: the OWL generator now owns the validate → fix → re-validate loop through a check_owl_pitfalls tool and iterates to a 100 %-clean ontology by default; the precision score was corrected so minor warnings actually move the needle.
  • Business Views overhaul: a guided New Assistant (build a view from seed entities + their ontology neighbours with a 1–3 hop control), collapse / expand entities, right-click "Hide from view", delete-the-last-view support, and an icon-only toolbar.
  • Build-run tracing & Build Analytics: an append-only build_runs registry table records one immutable row per build (UI / API / scheduler), surfaced through a Registry "Build Analytics" panel and a domain-wide Audit trail.
  • Graph / registry Lakebase separation: the graph DB can now live in a different Lakebase project from the registry, via a new BranchLakebaseAuth, an in-app Create Graph DB provisioner (with auto-granted Postgres superuser for managers), and a Settings → Permissions tab.
  • CNS test foundations & quality engineering: a comprehensive test strategy landed — coverage gates, test factories, in-process MCP integration tests, Hypothesis property tests, an LLM-agent eval harness + CI gate, ruff + mypy baselines, pre-commit hooks, and a changelog-presence gate. The suite grew from ~1900 to ≥ 2660 cases across multiple tiers.
  • Deploy simplified to a single knob: the Lakebase deploy-config surface collapsed from 13 variables to its irreducible core; a second instance now needs only DEFAULT_APP_NAME changed. deploy.sh gained colourised step logging, an ERR trap, preflight + resource-existence checks, and a --dry-run mode.

Audit Trail (Cross-Cutting Summary)

Consolidated view of the audit-trail work shipped in 0.5.0. Full detail lives in the Ontology Validation & Review Workflow and Build-Run Tracing & Analytics sections below.

  • Review audit log — new append-only domain_review_events table (actor, action, from → to, comment, meta, timestamp) written by ReviewService on every submit / signoff / publish / reopen, plus a chat-style "all comments" history viewer reachable from the worklists.
  • Audit trail viewer (Domain → Audit trail) — a single domain-wide feed interleaving review events (with comments) and build runs, with All / Status / Builds filter pills and a version dropdown (defaults to the current version).
  • Build-run tracing (Runs) — append-only build_runs table (one immutable row per UI / API / scheduler build; "active" = most recent successful run), surfaced as a per-domain Runs tab and a Registry → Automation Build Analytics panel.
  • Lifecycle attribution — direct status-dropdown changes (/domain/set-version-status) also write an audit row tagged meta.source="lifecycle"; local-dev sign-offs are attributed via a cached SCIM /Me lookup so quorum counts correctly without the proxy header.
  • Schema provisioning: bootstrap-lakebase-perms.sh / make bootstrap-lakebase create domain_review_events and build_runs (+ indexes) idempotently as the schema owner.

Domain Version Lifecycle

  • New per-version status DRAFT → IN-REVIEW → PUBLISHED, enforced server-side by a single source of truth (registry/version_lifecycle.py: ALLOWED_TRANSITIONS, is_editable, check_status_transition).
    • DRAFT → IN-REVIEW (admin/builder; precondition: version has been built; locks editing).
    • IN-REVIEW → DRAFT (admin/builder; re-enables editing).
    • IN-REVIEW → PUBLISHED (admin/builder).
    • PUBLISHED → DRAFT (admin only; reversible publish).
    • No direct DRAFT → PUBLISHED; new versions are always DRAFT.
  • Editability is now status-only: any DRAFT version is fully editable (older DRAFT versions included); the previous "only the latest version is editable" frontend restriction was removed. PermissionMiddleware blocks mutating edits unless the session version is DRAFT (non-mutating compute / validate / generate stay open).
  • API/MCP serve PUBLISHED only: find_published_version / load_published_domain_data (numeric-latest PUBLISHED, no fallback); the external /api/v1/graphql mount is strict PUBLISHED-only; DigitalTwin.resolve_domain rejects a non-PUBLISHED explicit version.
  • domain_versions.status column (CHECK-constrained + indexed) with a lazy, owner-aware self-heal migration; the retired mcp_enabled toggle is left dormant.
  • Colour-coded status badge wired across navbar, Registry → Browse, Domain → Versions, the Digital Twin / Ontology query headers, and the Load-Domain modal (which now shows v<n> — Draft/In Review/Published instead of "Latest / Read-Only").
  • Digital Twin pages stay fully interactive on PUBLISHED/IN-REVIEW versions (read/analysis surface) — the read-only form gate excludes body[data-page="digitaltwin"]; real mutations remain server-gated.
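
The transition rules and the "numeric-latest PUBLISHED" selection listed above can be sketched as a small table-driven check. The names ALLOWED_TRANSITIONS, is_editable and check_status_transition appear in the notes, but their shapes and signatures here are assumptions, as is the find_latest_published helper and the idea that version strings are plain integers.

```python
from typing import Dict, Optional

# (from, to) -> roles allowed to perform the transition, per the list above.
ALLOWED_TRANSITIONS = {
    ("DRAFT", "IN-REVIEW"): {"admin", "builder"},
    ("IN-REVIEW", "DRAFT"): {"admin", "builder"},
    ("IN-REVIEW", "PUBLISHED"): {"admin", "builder"},
    ("PUBLISHED", "DRAFT"): {"admin"},
}


def is_editable(status: str) -> bool:
    """Only DRAFT versions accept edits."""
    return status == "DRAFT"


def check_status_transition(current: str, target: str, role: str,
                            has_been_built: bool = False) -> None:
    """Raise if the transition is unknown, not permitted, or unmet."""
    allowed_roles = ALLOWED_TRANSITIONS.get((current, target))
    if allowed_roles is None:
        raise ValueError(f"{current} -> {target} is not an allowed transition")
    if role not in allowed_roles:
        raise PermissionError(f"role {role!r} may not do {current} -> {target}")
    if (current, target) == ("DRAFT", "IN-REVIEW") and not has_been_built:
        raise ValueError("the version must be built before review")


def find_latest_published(statuses: Dict[str, str]) -> Optional[str]:
    """Return the numerically highest PUBLISHED version, or None (no fallback)."""
    published = [v for v, status in statuses.items() if status == "PUBLISHED"]
    # key=int makes "10" rank above "9", which string comparison would not.
    return max(published, key=int, default=None)
```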

Ontology Validation & Review Workflow

  • [Audit] New domain_review_events append-only audit table (actor, action, from → to, comment, meta, timestamp) and a stateless ReviewService orchestrator: my_tasks, review_detail, submit, signoff, publish, reopen. Approvals reset on resubmit / change-request / publish.
  • New /review router: GET /my-tasks, GET /{folder}/{version}, and POST submit | signoff | publish | reopen, all resolving the caller's role against the target domain.
  • Domain → Validation workspace: a visual lifecycle diagram (Draft → In Review → Published with the people involved at each stage), status banner with live quorum progress, a soft (advisory) consistency-check summary, header-mounted action buttons, and the audit timeline.
  • My Tasks worklist on Registry → Review and on the Home page (revealed only when tasks exist), each handing off to the Validation workspace via a single Validate button rather than driving the workflow inline.
  • Per-domain sign-off quorum: stored as a typed review_quorum column on domains (default 1), set at domain creation and editable on Domain → Information → Global. The old registry-wide global_config.review_quorum is no longer read.
  • Admin quorum override: admins (app- or domain-level) can publish regardless of quorum; the override is recorded in the published event's meta and surfaced in the UI.
  • Review comments everywhere: a shared ReviewModals helper adds a comment prompt to every status switch and a chat-style "all comments" history viewer reachable from the worklists.
  • [Audit] Domain → Audit trail (viewer): a single domain-wide feed interleaving review events (with comments) and build runs, with All / Status / Builds filter pills and a version dropdown (defaults to the current version).
  • [Audit] Attribution: direct lifecycle dropdown changes (/domain/set-version-status) now also write an audit row tagged meta.source="lifecycle"; local-dev sign-offs are attributed via a cached SCIM /Me lookup so quorum counts correctly without the proxy header.
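
As a rough illustration of the quorum gate and admin override described above, the check might look like the sketch below. The function can_publish and its return shape are hypothetical; counting distinct approvers is an assumption suggested by the attribution note, not a confirmed detail of ReviewService.

```python
from typing import Set, Tuple


def can_publish(approvers: Set[str], quorum: int = 1,
                is_admin: bool = False) -> Tuple[bool, bool]:
    """Return (allowed, override_used) for a publish request."""
    if len(approvers) >= quorum:
        return True, False   # quorum met, no override needed
    if is_admin:
        return True, True    # admin override, to be recorded in the event meta
    return False, False


# One sign-off against a quorum of two: only an admin can publish.
assert can_publish({"alice"}, quorum=2) == (False, False)
assert can_publish({"alice"}, quorum=2, is_admin=True) == (True, True)
```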

Graph Chat — Streaming (SSE)

  • run_agent() gained an on_event callback fired per tool_call / tool_result / final output; the legacy on_step string callback is preserved.
  • New POST /dtwin/assistant/chat/stream SSE endpoint bridges the sync agent thread to an async generator (asyncio.Queue + run_coroutine_threadsafe), streaming step events then a final done event (reply, tool trace, usage, iterations); error events carry the exception message.
  • Frontend renders a live streaming bubble (createStreamingBubble / updateStreamingBubble / finalizeStreamingBubble / errorStreamingBubble) consuming the ReadableStream and parsing SSE frames.
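
The thread-to-async bridge mentioned above (asyncio.Queue plus run_coroutine_threadsafe) can be sketched with the standard library alone. Everything here is illustrative: run_agent_stub stands in for run_agent(), and the event dictionaries and frame layout are assumptions, not the endpoint's actual payloads.

```python
import asyncio
import json
import threading
from typing import AsyncIterator, Callable, List


def run_agent_stub(on_event: Callable[[dict], None]) -> str:
    """Hypothetical synchronous agent: emits two step events, returns a reply."""
    on_event({"type": "tool_call", "name": "lookup"})
    on_event({"type": "tool_result", "name": "lookup"})
    return "final reply"


async def stream_chat() -> AsyncIterator[str]:
    """Yield SSE frames while the agent runs in a worker thread."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def emit(event: dict) -> None:
        # Called from the worker thread: schedule the put on the event loop
        # and wait until it has been enqueued, which preserves event order.
        asyncio.run_coroutine_threadsafe(queue.put(event), loop).result()

    def worker() -> None:
        try:
            reply = run_agent_stub(emit)
            emit({"type": "done", "reply": reply})
        except Exception as exc:
            emit({"type": "error", "message": str(exc)})

    threading.Thread(target=worker, daemon=True).start()
    while True:
        event = await queue.get()
        # One SSE frame: an event name, a data line, and a blank line.
        yield f"event: {event['type']}\ndata: {json.dumps(event)}\n\n"
        if event["type"] in ("done", "error"):
            break


async def collect_frames() -> List[str]:
    return [frame async for frame in stream_chat()]
```

Running asyncio.run(collect_frames()) returns the step frames followed by a single done frame; in a web framework the generator would be handed to a streaming response instead of being collected.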

Ontology Generation — Pitfalls Tool & Iteration Loop

  • New src/agents/tools/pitfalls.py: check_owl_pitfalls agent tool returns {score, is_clean, total_warnings, warnings[…], fix_instruction}; the agent now owns the validate → fix → re-validate cycle in-loop (with a max_fix_rounds cap forcing final output when the budget is exhausted).
  • Default convergence tightened: score_threshold 70 → 100, stop_on_no_critical True → False, max_fix_rounds 3 → 5 — the loop now targets a zero-warning ontology.
  • Precision-score fix: replaced the size-based penalty normalisation (which rounded any non-trivi...

v0.4.0

28 May 06:17
d0df25f

v0.4.0 Pre-release

OntoBricks — Release Notes V0.4.0

Release window: May, 2026
Test status: all changes shipped with the suite green (≥ 2003 passing, 80 skipped).


Highlights

  • Lakebase GraphDB engine (full): Postgres-backed triple store via Databricks Lakebase Autoscaling completely replaces LadybugDB as the primary graph backend. Synchronization supports two load modes:
    • app_managed (direct streaming into Postgres)
    • managed_synced (Lakeflow-managed Unity Catalog synced-table pipeline)
    Both modes share the same 3-object Postgres layout (*_sync + *__app companion + union view).
  • Managed Sync pipeline end-to-end: SyncedTableManager handles UC synced-table registration, Lakeflow pipeline polling, ghost control-plane state recovery, union-view creation, and all downstream Digital Twin build steps — with live progress visible in the app log and Build page.
  • Registry OBX Export / Import: Export one or several registry domains to a single .obx (JSON) file with per-domain version-mode selection; import with per-domain Skip / Overwrite / Rename conflict resolution. Format-version gating ensures future backward compatibility.
  • Ontology Pitfalls Detector: D2KLab's OPD (Apache-2.0) integrated as a new Ontology sidebar panel. Detects 19 structural, logical, and semantic pitfalls across four categories, powered by an async TaskManager job. ML-heavy checks can be enabled optionally via [pitfalls] extra.
  • HL7 FHIR R5 / R4B / R4 industry import: FHIR added as a fourth importable ontology alongside FIBO, CDISC, and IOF. OWL restriction-based property extraction, six domain buckets (Foundation required + Clinical/Diagnostics/Medications/Workflow/Financial), user-selectable version.
  • Ontology labels throughout: Labels (with name fallback) now propagate to the KG detail panel, graph-chat agent responses, MCP tool outputs, ontology viewer link labels, and entity-type / predicate columns.
  • Security: Closed urllib3 GHSA-mf9v / GHSA-qccp (#27, #28) by bumping to 2.7.0; GitPython bumped to 3.1.50 for GHSA-x2qx (CVE-2026-42215 follow-on); Mako ≥ 1.3.12, python-multipart ≥ 0.0.27 retained from v0.3.x.
  • CI simplification: Sphinx HTML build removed from CI; generated artifacts gitignored; scripts/build_docs.sh retained for local on-demand builds.

Lakebase GraphDB Engine

  • New pluggable graph backend (graph_engine = "lakebase") alongside LadybugDB. Selected in Settings → Graph DB.
  • Process-wide Postgres connection pool with JWT-aware Lakebase auth (lakebase/pool.py).
  • LakebaseFlatStore implements DDL (triple table + datatype/lang RDF columns), CRUD, VACUUM ANALYZE optimize, bounded-memory bulk_insert_iter, keyed-pagination iter_triples, _sql_relation override for physical vs logical table name handling.
  • Factory-only dispatch (GraphDBFactory._create_lakebase): Ladybug and Lakebase are mutually exclusive; engine config validated on save.
  • LAKEBASE_AVAILABLE capability flag; TripleStoreFactory includes Lakebase in availability detection.
  • Reference DDL: src/back/core/graphdb/lakebase/schema.sql; autodoc page: docs/sphinx/api/app.core.graphdb.lakebase.rst.

App-managed companion layout

  • app_managed builds now use the same 3-object Postgres layout as managed_synced:
    • *_sync — bulk warehouse data (streamed by the build pipeline)
    • *__app — companion (reasoning / materialise writes)
    • union view — single read surface
  • LakebaseFlatStore.bulk_load_into_sync() for the build pipeline; _writable_table_id() always returns companion (*__app).
  • drop_table() cleans up all 3 objects; optimize_table() vacuums both *_sync and *__app.

Settings — Graph DB tab

  • Cascading Lakebase pickers: Project → Branch → Database → Schema (with manual pencil override). UC catalog picker triggers a UC schema picker for Managed Sync configuration.
  • Health probe (GET /settings/graph-engine/lakebase-health): uses pg_catalog queries (privilege-independent) and the same resolve_lakebase_graph_schema logic as the build pipeline.
  • Loading spinner while Graph DB tab data fetches.
  • Live UC synced-table name preview (Settings → Managed Sync panel) showing all 4 Postgres / UC object names before the first build.
  • Lakebase Objects panel: lists all user-visible schemas, tables, and views in the configured database; Drop button (owner-only, Bootstrap confirm modal) for objects the service principal owns.
  • Local Graph Files panel hidden when Lakebase is selected (shows only for LadybugDB).
  • GET /settings/graph-engine and GET /settings/graph-engine-config now allowed for all app users (POST remains admin-only).
  • Graph engine choice persisted in global_config.config under graph_engine / graph_engine_config and mirrored into the domain registry entry.

Schema resolution

  • resolve_lakebase_graph_schema: explicit graph_engine_config.schema wins; falls back to Registry Volume schema; then DEFAULT_GRAPH_SCHEMA (ontobricks_graph).
  • resolve_lakebase_graph_database: explicit graph_engine_config.database wins; falls back to RegistryCfg.lakebase_database; then auth default.
  • UC sync FQN (synced_uc_name) always uses the registry UC schema (RegistryCfg.schema) so the Lakeflow synced object lands in the same Unity Catalog namespace as all other registry artefacts.
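
The schema fallback order described in the first bullet can be sketched as a simple chain. The function signature is an assumption made for illustration; only the precedence (explicit config, then Registry Volume schema, then DEFAULT_GRAPH_SCHEMA) and the default value come from the notes.

```python
from typing import Optional

DEFAULT_GRAPH_SCHEMA = "ontobricks_graph"


def resolve_graph_schema(graph_engine_config: dict,
                         registry_volume_schema: Optional[str]) -> str:
    """Pick the graph schema: explicit config, then registry, then default."""
    explicit = (graph_engine_config.get("schema") or "").strip()
    if explicit:
        return explicit
    if registry_volume_schema:
        return registry_volume_schema
    return DEFAULT_GRAPH_SCHEMA


assert resolve_graph_schema({"schema": "custom"}, "registry") == "custom"
assert resolve_graph_schema({}, "registry") == "registry"
assert resolve_graph_schema({}, None) == "ontobricks_graph"
```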

Managed Sync / Lakeflow Pipeline

  • SyncedTableManager: handles UC synced-table registration, trigger (full refresh), Lakeflow pipeline polling (wait_for_completion via get_update(update_id) + idle-wait fallback), on_state_change callback for live task context updates.
  • _normalize_state strips SYNCED_TABLE_ prefix from SDK enum names so terminal/in-progress sets match correctly.
  • Ghost control-plane state recovery: when a previous synced-table was deleted outside the API, _is_ghost_control_plane_state detects the conflict; ensure() tries DELETE then re-CREATE, with _b/c/d fallback names if the primary slot is permanently reserved.
  • ensure_synced_union_view() runs after Lakeflow materializes *_sync; schema-qualifies the _sync reference when Postgres and UC schemas differ; drops existing table with same name before creating view.
  • Auto-repair: if the union view is absent (for example after a crashed previous build), repair_synced_view_if_possible recreates it from the existing *_sync/*__app objects.
  • Synced table payload uses database_instance_name (project) + database_branch + logical_database_name (required by the Lakebase Synced Tables API for Autoscaling projects).
  • LakebaseAuth.branch_name property parses the branch segment from the PGHOST endpoint resource path.
  • Build pipeline: stores lakebase_synced_uc / lakebase_pipeline_id in task context; frontend polls /dtwin/sync/pipeline-status every 6 s; 30-second terminal-OK grace window before build is declared complete.
  • SELECT DISTINCT fix in R2RML-to-Spark SQL templates to prevent duplicate triples causing Lakeflow PK violations.
  • UC schema auto-created (CREATE SCHEMA IF NOT EXISTS) before synced-table registration, with conn.commit() after DDL.
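
Two of the smaller helpers above lend themselves to a short sketch: stripping the SYNCED_TABLE_ prefix before comparing states, and generating the fallback slot names. Both functions are hypothetical reconstructions; the example state value and the "_b", "_c", "_d" suffix layout are assumptions based on the wording of the notes.

```python
from typing import List

_STATE_PREFIX = "SYNCED_TABLE_"


def normalize_state(name: str) -> str:
    """Drop the SYNCED_TABLE_ prefix so states compare against short names."""
    return name[len(_STATE_PREFIX):] if name.startswith(_STATE_PREFIX) else name


def candidate_names(base: str) -> List[str]:
    """Primary synced-table name followed by the _b, _c, _d fallbacks."""
    return [base] + [f"{base}_{suffix}" for suffix in "bcd"]


# Illustrative values only.
assert normalize_state("SYNCED_TABLE_ONLINE") == "ONLINE"
assert normalize_state("ONLINE") == "ONLINE"
assert candidate_names("triples_sync") == [
    "triples_sync", "triples_sync_b", "triples_sync_c", "triples_sync_d",
]
```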

Digital Twin Build — UI & UX

  • Build page Graph DB card: compact in-card build note showing database.schema.table; existence badges for dtLakebaseTableExists and dtLakebaseSyncedUcExists; Lakeflow line shows catalog.schema.<physical_table>_sync.
  • Build log card: engine-specific title, Lakebase Pipeline UC FQN, archive step hidden for Lakebase builds.
  • "Backing up graph to registry" archive step removed entirely for Lakebase deploys (no Volume backup needed).
  • Post-build session cache (_populate_session_cache): Lakebase path sets graph_has_data = final_count > 0, graph_engine, registry_archive_applicable = False; LadybugDB path unchanged.
  • Per-section _ts cache timestamp prevents cross-section staleness (previously a shared clock caused "Loaded" badge + "not built" text contradiction).
  • Triplestore stats cache schema version (_TS_STATS_CACHE_SCHEMA_VERSION = 2) invalidates old formatted strings on upgrade.
  • Legacy local_lbug_exists / local_lbug_path field names retired; renamed to graph_has_data / graph_display throughout backend, build pipeline, and frontend.

Cockpit (Domain Validation)

  • Graph DB card parity with the Build page: Database / Schema / Table / UC sync row layout; psDtLakebaseTableExists and psDtLakebaseSyncedUcExists existence badges.
  • HomeService.dtwin_detail enriched with all lakebase fields (lakebase_table_exists, lakebase_database, lakebase_schema, lakebase_table, lakebase_synced_uc, lakebase_sync_mode).
  • triple_count prefers dt_existence over ts_status for accuracy.

Registry OBX Export / Import

  • New src/back/objects/registry/obx_format.py: CURRENT_OBX_FORMAT_VERSION = 1, upgrader-chain pattern, build_envelope(), load() with format-version validation and min_ontobricks_version gate.
  • Export modes per domain: all, active, latest, selected (per-version checkboxes).
  • Import: preview step shows per-domain conflict flags + suggested rename; apply step resolves each domain with skip / overwrite / rename. 50 MB upload cap.
  • UI: Export modal (per-domain checkboxes + version mode selector) and Import modal (2-step: file picker → preview → decisions) on Registry → Browse page.
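
The upgrader-chain pattern with format-version gating can be sketched as below. Only CURRENT_OBX_FORMAT_VERSION = 1 and the existence of load() come from the notes; the "format_version" field name, the upgraders mapping, and the extra parameters are assumptions, and the min_ontobricks_version gate is not modelled.

```python
from typing import Callable, Dict, Optional

CURRENT_OBX_FORMAT_VERSION = 1  # value stated in the release notes

Upgrader = Callable[[dict], dict]


def load(envelope: dict,
         upgraders: Optional[Dict[int, Upgrader]] = None,
         current: int = CURRENT_OBX_FORMAT_VERSION) -> dict:
    """Validate the format version, then upgrade one step at a time."""
    upgraders = upgraders or {}
    version = envelope.get("format_version")  # field name is an assumption
    if not isinstance(version, int) or version < 1:
        raise ValueError("missing or invalid format_version")
    if version > current:
        raise ValueError(f"format {version} is newer than supported {current}")
    while version < current:
        # upgraders[n] converts an envelope from format n to format n + 1.
        envelope = upgraders[version](envelope)
        version += 1
        envelope["format_version"] = version
    return envelope
```

With a single format version the loop never runs; the chain only matters once a second version exists, at which point an older file passes through each intermediate upgrader in turn.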

Ontology Pitfalls Detector

  • src/back/core/external/pitfalls/ subpackage (vendored D2KLab OPD, Apache-2.0): OntologyPatternToolkit with 19 run_p* methods across P1–P4 categories; PitfallsService entry point serializes the rdflib Graph to temp TTL and returns grouped results.
  • Optional [pitfalls] extra in pyproject.toml: sentence-transformers, scikit-learn, NLTK, SciPy. ML imports inside try/except so taxonomy constants remain accessible wit...

Hot Fix 0.3.1

12 May 10:58
663bc02

Hot Fix 0.3.1 Pre-release

OntoBricks — Release Notes V0.3.1

Release window: May 2026
Type: Hotfix
Test status: 141 cohort tests passed, 0 failed (49 test_cohort_builder.py, 31 test_dtwin_cohort.py, 34 test_cohort_models.py, 24 test_agent_cohort_tools.py, 3 test_agent_cohort_engine.py).


Highlights

  • Cohort Discovery: predicate namespace fix. hasClaim (and any predicate loaded outside R2RML) now resolves correctly. The engine no longer silently misses triples whose predicate is in ontology-namespace form (#) when the lookup key is in data-namespace form (/).
  • Cohort Discovery: cross-namespace predicate fallback. CohortBuilder gains a local-name alias map mirroring the SparqlTranslator approach, so predicates from a completely foreign namespace (e.g. ontobricks.com/ontology#hasclaim vs. databricks-ontology.com/Cust360Auto/hasclaim) resolve via local-name matching.
  • Cohort designer UX — attribute dropdowns in the Path "where" filter and the Compatibility section are now scoped to the entity being filtered, not the full ontology property list.

Cohort Discovery — Bug Fixes

Fix 1: ontology-form predicate silent miss (CohortBuilder._outgoing_edge_index)

When data triples were inserted outside the R2RML pipeline (direct insert, W3C OWL round-trip, manual load), their predicates were stored in ontology-namespace form (…#hasClaim) while the lookup key produced by _normalized_links was in data-namespace form (…/hasClaim). This caused a silent neighbours_raw = 0 and an empty cohort.

Changes:

  • src/back/core/graph_analysis/CohortBuilder.py: _outgoing_edge_index promoted from @staticmethod to instance method so it can call self._to_data_uri(pred). Every triple predicate is now normalised to data-namespace form when the index is built.
  • src/front/static/query/js/query-cohorts.js: _renderTraceLink now guards on in_frontier === 0 before neighbours_raw === 0. When the starting frontier is empty the diagnostic message now reads "the starting frontier for this hop is empty — all members were eliminated before reaching it. Check the compatibility (Stage 3a) filters or the previous hop's target_class." instead of misleadingly blaming the predicate URI.
  • tests/test_cohort_builder.py — 2 new tests: test_data_with_ontology_form_predicate_is_indexed_correctly, test_trace_shows_nonzero_raw_for_ontology_form_predicate.

Fix 2: cross-namespace predicate — local-name alias fallback

_to_data_uri can only bridge the # and / forms within the same base namespace. When the domain's object property URIs live in a completely different namespace (e.g. the inherited shared namespace ontobricks.com/ontology#), the first fix was not sufficient.

Changes:

  • src/back/core/graph_analysis/CohortBuilder.py:
    • _predicate_alias_map() — scans loaded triples, builds {local_name → canonical_data_namespace_uri}, cached in self._cache["predicate_alias"] and invalidated on triple reload.
    • _resolve_predicate(uri) — tries _to_data_uri first; if the URI is unchanged (foreign namespace) falls back to the alias map by local name.
    • _normalized_links and _normalized_compat updated to use _resolve_predicate instead of _to_data_uri.
  • tests/test_cohort_builder.py — 1 new test: test_via_from_foreign_namespace_resolved_by_local_name (exact replica of the ElectricitySuspended / Cust360Auto production scenario).
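
The two-step resolution (same-namespace conversion first, local-name alias second) can be sketched as follows. The class and its methods are simplified, hypothetical counterparts of _to_data_uri, _predicate_alias_map and _resolve_predicate; caching and invalidation are omitted, local names are matched exactly, and the URIs are illustrative.

```python
import re
from typing import Dict, Iterable


def local_name(uri: str) -> str:
    """Return the text after the last '#' or '/'."""
    return re.split(r"[#/]", uri)[-1]


class PredicateResolver:
    def __init__(self, base: str, data_predicates: Iterable[str]) -> None:
        self.base = base.rstrip("#/")
        # local name -> canonical data-namespace URI seen in loaded triples
        self.alias: Dict[str, str] = {local_name(p): p for p in data_predicates}

    def to_data_uri(self, uri: str) -> str:
        """Convert base#name to base/name; leave any other URI unchanged."""
        prefix = self.base + "#"
        if uri.startswith(prefix):
            return self.base + "/" + uri[len(prefix):]
        return uri

    def resolve(self, uri: str) -> str:
        converted = self.to_data_uri(uri)
        if converted != uri:
            return converted
        # Unchanged means a foreign namespace (or already data form):
        # fall back to a lookup by local name, else keep the URI as given.
        return self.alias.get(local_name(uri), uri)


data_uri = "http://databricks-ontology.com/Cust360Auto/hasClaim"
resolver = PredicateResolver("http://databricks-ontology.com/Cust360Auto", [data_uri])
assert resolver.resolve("http://databricks-ontology.com/Cust360Auto#hasClaim") == data_uri
assert resolver.resolve("http://ontobricks.com/ontology#hasClaim") == data_uri
```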

Cohort Designer — UX

Attribute dropdowns scoped to entity

Property dropdowns in the Path "where" filter and the Compatibility section previously listed every property in the ontology regardless of the entity in scope. Users had to scroll through unrelated properties when filtering a specific hop.

Changes:

  • src/front/static/query/js/query-cohorts.js:
    • New _dataPropsForClass(classUri) helper — filters to data properties whose domain matches the class, with a full-list fallback when ontology metadata is incomplete.
    • _renderHopWhereRow now calls _dataPropsForClass(targetClassUri).
    • _renderCompat now calls _dataPropsForClass(this.rule.class_uri).
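
The scoping logic is implemented in the front-end JavaScript, but its shape can be shown in a few lines of Python. This is only a sketch: the property records (dicts with `uri` and `domain` keys) are an assumed structure, and "incomplete metadata" is interpreted here as "no property declares the class as its domain".

```python
from typing import Dict, List

Property = Dict[str, str]


def data_props_for_class(class_uri: str, data_properties: List[Property]) -> List[Property]:
    """Return the data properties whose domain is class_uri.

    Falls back to the full list when nothing matches, so the dropdown is
    never empty when the ontology lacks domain declarations.
    """
    scoped = [p for p in data_properties if p.get("domain") == class_uri]
    return scoped if scoped else list(data_properties)


# Hypothetical ontology fragment.
props = [
    {"uri": "ex:customerName", "domain": "ex:Customer"},
    {"uri": "ex:claimAmount", "domain": "ex:Claim"},
    {"uri": "ex:claimDate", "domain": "ex:Claim"},
]

claim_props = data_props_for_class("ex:Claim", props)
fallback_props = data_props_for_class("ex:Policy", props)
```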

Modified files

| File | Change |
| --- | --- |
| src/back/core/graph_analysis/CohortBuilder.py | Predicate normalisation fixes + alias map |
| src/front/static/query/js/query-cohorts.js | Diagnostic guard + scoped attribute dropdowns |
| tests/test_cohort_builder.py | 3 new regression tests |

Upgrade notes

No schema, API, or configuration changes. Drop-in replacement for v3.3.0.
If a cohort was returning empty results due to the hasClaim predicate mismatch, re-run Materialise — no manual data migration required.

V0.3.0

12 May 05:59
9fada28

V0.3.0 Pre-release

OntoBricks — Release Notes V0.3.0

Release window: May, 2026
Test status: all changes shipped with the suite green (2045 passing, 80 CloudFetch probe tests conditionally skipped in CI).


Highlights

  • Cohort Discovery — new end-to-end feature for business-friendly entity grouping: rule-based linkage (shared resources via predicates), compatibility constraints, a 6-stage deterministic engine, full Volume + Lakebase persistence, graph-triple + Unity Catalog Delta materialisation, and a natural-language Stage 2 agent that translates free-text prompts into validated CohortRule JSON.
  • Live Digital Twin build log — the Build page now shows a real-time per-step log panel with elapsed timers, phase descriptions, a one-click export to .log, and honest background-archive handling. TaskManager gains skip_step / complete_current_step so skipped phases are labelled correctly.
  • Real /health readiness probe — 11 checks covering filesystem, Databricks auth, SQL warehouse, registry Volume read/write, registry UC DDL permissions, Lakebase schema/table/sequence grants, and CloudFetch capability. /health/detailed retired. New admin Health tab in Settings surfaces the same payload in-app.
  • Mapping diagnostics: source table permissions — new third section runs a non-destructive SELECT … LIMIT 0 against every Unity Catalog table referenced by the mapping and reports ok / PERMISSION_DENIED / TABLE_OR_VIEW_NOT_FOUND per table, surfacing missing grants before a build attempt.
  • Knowledge Graph — right-click Expand neighbours — any node can be expanded N hops in place without re-running a full SPARQL query. Non-blocking spinner, depth picker, camera zoom + highlight on new nodes. KG preview/expand also hardened against timeouts and 502 errors on large graphs.
  • Deployment (single source of truth): scripts/deploy.config.sh + app.yaml.template replace scattered literals across Makefile, deploy.sh, and bootstrap scripts. app.yaml is now a generated artifact. App name ontobricks-030 unified across all tooling.
  • CloudFetch — runtime capability probe: DatabricksAuth detects at runtime whether the Apps sandbox can actually reach the CloudFetch storage host, sets use_cloud_fetch accordingly, and exposes the verdict in /health. A new Settings → Global toggle lets admins override the default.
  • Task duration & UTC timestamps: TaskManager emits START task / END task log lines with compact durations, serialises duration_seconds in to_dict(), and uses UTC-aware ISO timestamps throughout to prevent timezone-drift bugs in the browser.

Cohort Discovery

Stage 1 — deterministic engine, UI, persistence, materialisation

  • New CohortRule model with validate(), CRUD endpoints, dry-run + materialise, and a 6-stage pure-Python engine (CohortBuilder) that works against both Delta/Spark SQL and LadybugDB/Cypher backends.
  • Materialise to graph (idempotent DELETE then INSERT, per-rule :inCohort<RuleId> predicate) and to Unity Catalog Delta (partitioned by rule_id, chunked INSERT).
  • Content-hash cohort URIs (<base>/cohort/<rule_id>/c-sha256(…)[:8]).
  • Live preview helpers: class stats, edge count, node count, sample property values, explain_membership (Why? / Why not?).
  • 59 new tests across test_cohort_models.py, test_cohort_builder.py, test_dtwin_cohort.py.
  • New docs/cohort_discovery.md with mental model, UX walkthrough, 4 worked examples, API summary.
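
As an illustration of the content-hash URI scheme above, the sketch below derives a cohort URI from its members. The release notes do not say what is fed into the hash, so hashing the sorted, de-duplicated member URIs is an assumption, as are the function name and the example values.

```python
import hashlib
from typing import Iterable


def cohort_uri(base: str, rule_id: str, member_uris: Iterable[str]) -> str:
    """Build '<base>/cohort/<rule_id>/c-<first 8 hex chars of SHA-256>'.

    Assumption: the digest covers the sorted set of member URIs, so the
    same membership always yields the same URI regardless of input order.
    """
    canonical = "\n".join(sorted(set(member_uris)))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]
    return f"{base}/cohort/{rule_id}/c-{digest}"


members = ["https://example.com/Customer/2", "https://example.com/Customer/1"]
uri_a = cohort_uri("https://example.com", "electricitySuspended", members)
uri_b = cohort_uri("https://example.com", "electricitySuspended", reversed(members))
uri_c = cohort_uri("https://example.com", "electricitySuspended", members[:1])
```

Under this assumption, re-running a rule over unchanged data reproduces the same cohort URIs, which fits the idempotent DELETE then INSERT materialisation described above.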

Stage 2 — NL agent for rule generation

  • agents/agent_cohort/ — six read-only tools (list_classes, list_properties_of, count_class_members, sample_values_of, propose_rule, dry_run) wired to Stage 1 endpoints.
  • One-shot agent loop with 10-iteration cap; never writes — saving still goes through the Builder-protected endpoint.
  • UI: Describe tab (NL prompt + agent trace with tool calls, durations, iterations) auto-switches to Build rule tab when a rule is proposed.
  • Fix: list_properties_of was returning empty arrays when rdfs:domain was stored as a local name or property URIs were missing — resolved with _domain_matches + _ensure_uri helpers and a fallback to the full object-property list.
  • 22 new tests across test_agent_cohort_tools.py (19) and test_agent_cohort_engine.py (3).

Cohort designer — UX refinements

  • Three-tab layout (Describe / Build rule / Preview) replaces the full-width drawer; the Preview tab carries a live cohort-count badge.
  • Saved rules pane promoted to a persistent right/left rail, always reachable regardless of active tab.
  • Dependent dropdowns in the "Link members" section: via property narrows to predicates whose domain is the source class and range is the chosen shared class; falls back to the full list when ontology metadata is incomplete.
  • camelCase rule name enforcement: live input sanitisation, paste-to-camelCase, _isValidRuleName validator, _toCamelCase helper applied to agent-generated names. id = label (no more slug fork).
  • Rule-scoped predicate and UC table: membership triples use :inCohort<RuleId>; Auto-pick proposes cohorts_<snake_rule_name>. Both the graph-triples hint and the UC-table hint in the Configure-outputs modal now show the actual predicate / table name for the active rule.
  • Clickable entity badge in the Preview pane: each cohort member renders as a pill that links to the Sigma graph focused on that node. URI is demoted to inline parenthetical muted text.
  • Configure-outputs modal — visible feedback: Auto-pick and Test write access were silently swallowing errors. Both are now four-state (idle → working → success | error) with inline status lines, spinner, toast, and error-envelope surfacing.
  • Clearer step labels in the designer: "Link members via a shared entity" (was "When are two members linked?"), "Conditions every member must satisfy" (was "Compatibility policies"). Terminology switched from "class" to "entity" throughout user-facing strings.
  • Cohort explain fix — namespace drift: CohortBuilder._members_of_class now checks all URI variants (ontology form, data form, raw) so explain_membership works regardless of which normalisation path the loader used. Enhanced "not in class" diagnostics report typed-as, untyped, or URI not found with actionable advice.
  • Rule summary card: removed the redundant rule_id chip (equal to label for camelCase names); fixed binary UC/graph output display to enumerate all four states (graph + UC / graph only / UC only / no outputs).
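
The naming rules above (camelCase rule names, cohorts_<snake_rule_name> tables) could be expressed as below. The real helpers live in the front-end JavaScript; this Python sketch only mirrors their intent, and the exact validity rule (a leading lowercase letter followed by letters and digits) is an assumption.

```python
import re


def to_camel_case(text: str) -> str:
    """Turn free text such as 'Electricity suspended' into 'electricitySuspended'."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not words:
        return ""
    first, rest = words[0], words[1:]
    head = first[0].lower() + first[1:]
    return head + "".join(word[0].upper() + word[1:] for word in rest)


def is_valid_rule_name(name: str) -> bool:
    """Assumed rule: lowercase first letter, then only letters and digits."""
    return re.fullmatch(r"[a-z][A-Za-z0-9]*", name) is not None


def to_snake_case(name: str) -> str:
    """Insert '_' before every interior capital, then lowercase everything."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def uc_table_name(rule_name: str) -> str:
    """Derive the Auto-pick style table name for a rule."""
    return f"cohorts_{to_snake_case(rule_name)}"


rule_name = to_camel_case("Electricity suspended")
table_name = uc_table_name(rule_name)
```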

Digital Twin Build

  • Live build log panel (#syncBuildLogCard): appears on Build click, grows row-by-row with icon, description, live sub-message, and per-step elapsed timer. Step labels rewritten to plain English ("Preparing mappings…", "Detecting what changed since last build", etc.).
  • TaskManager.skip_step() marks a step skipped and advances current_step so the label array stays aligned with execution when phases are conditionally bypassed (e.g. Detecting what changed on first build, Checking source tables on forced full rebuild).
  • Export build log: one-click export to digital-twin-build_<timestamp>.log (plain text with a header block, per-step table, and result block). Button enabled from the first poll onwards.
  • Fast gzip + background archive: GraphSyncService.sync_to_volume now uses compresslevel=1 (≈ 6–10× faster than the previous level 9). For session builds the Volume upload runs in a daemon thread; the build task completes immediately after the snapshot step with the note "Registry backup continues in the background."
  • Archive checkbox: new "Archive graph to registry" checkbox on the Build page (default on); when off, the archive step is marked skipped with a plain-English reason.
  • Honest timing: archive row shows "Continues after build" with a tooltip instead of a misleading 0ms duration when the upload is backgrounded.
  • Background archive task tracked as a separate registry_archive TaskManager task with its own navbar hourglass entry.
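
The fast-gzip and background-archive behaviour can be sketched with the standard library alone. This is not the GraphSyncService code: `archive_in_background` and the `upload` callable are hypothetical stand-ins for the Volume upload, and only `compresslevel=1` and the daemon thread come from the notes above.

```python
import gzip
import threading
from typing import Callable, List


def compress_snapshot(payload: bytes) -> bytes:
    """Compress with the fastest gzip level (1) instead of the default 9."""
    return gzip.compress(payload, compresslevel=1)


def archive_in_background(payload: bytes, upload: Callable[[bytes], None]) -> threading.Thread:
    """Compress and upload on a daemon thread so the caller returns at once."""

    def _worker() -> None:
        upload(compress_snapshot(payload))

    thread = threading.Thread(target=_worker, name="registry_archive", daemon=True)
    thread.start()
    return thread


# Stand-in for the real upload: collect the bytes in memory.
uploaded: List[bytes] = []
snapshot = b"<s> <p> <o> .\n" * 1000

worker = archive_in_background(snapshot, uploaded.append)
worker.join()  # only for the demo; the build task would not wait here
restored = gzip.decompress(uploaded[0])
```

A daemon thread does not keep the interpreter alive, so an upload still in flight at process exit is abandoned. Tracking the archive as its own task, as described above, is one way to make that state visible to the user.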

Task Manager & Notifications

  • Task.duration_seconds() — computed live for running/pending, frozen at completed_at − started_at for terminal tasks; serialised in to_dict().
  • TaskManager lifecycle log lines unified to START task <id> [<type>] — <name> / END task <id> [<type>] completed|failed|cancelled in <duration>.
  • _format_duration produces compact strings: 450ms, 2.40s, 1m 23.5s, 1h 5m.
  • Navbar hourglass: running rows show a live 1 s ticking elapsed time; bell toasts append (in 1m 23s).
  • All task/step timestamps emitted in UTC-aware ISO format (+00:00); duration_seconds() tolerates mixed naive/aware legacy timestamps.
  • Removed the broken "Open" link from terminal task completion toasts.
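
A formatter that reproduces the four example strings above, together with a duration helper that tolerates mixed naive and aware timestamps, might look like the following. The thresholds are inferred from the examples, and treating naive timestamps as UTC is an assumption; neither is taken from the TaskManager source.

```python
from datetime import datetime, timezone
from typing import Optional


def as_utc(ts: datetime) -> datetime:
    """Assume naive timestamps are already UTC; convert aware ones to UTC."""
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


def duration_seconds(started_at: datetime, completed_at: Optional[datetime] = None) -> float:
    """Elapsed seconds: live while running, frozen once completed_at is set."""
    end = as_utc(completed_at) if completed_at is not None else datetime.now(timezone.utc)
    return (end - as_utc(started_at)).total_seconds()


def format_duration(seconds: float) -> str:
    """Render a duration compactly: milliseconds, seconds, m + s, or h + m."""
    if seconds < 1:
        return f"{round(seconds * 1000)}ms"
    if seconds < 60:
        return f"{seconds:.2f}s"
    if seconds < 3600:
        minutes, rest = divmod(seconds, 60)
        return f"{int(minutes)}m {rest:.1f}s"
    hours, rest = divmod(int(seconds), 3600)
    return f"{hours}h {rest // 60}m"


# A legacy naive start time combined with a UTC-aware completion time.
started = datetime(2026, 5, 1, 12, 0, 0)
finished = datetime(2026, 5, 1, 12, 1, 23, 500000, tzinfo=timezone.utc)
elapsed = duration_seconds(started, finished)
label = format_duration(elapsed)
```

The sketch does not special-case values that round up across a unit boundary (for example 59.999 s), which a production formatter would likely need to handle.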

Health & Observability

/health readiness probe (replaces static {"status":"healthy"})

11 probes, each timed and wrapped in _safely_run so one failure never breaks the overall response:

| Probe | What it checks |
| --- | --- |
| runtime | Python + OntoBricks version |
| filesystem.tmp | Write sentinel + shutil.disk_usage thresholds (warn < 1 GB, error < 100 MB) |
| filesystem.session_dir | Same check against the session directory |
| filesystem.log_dir | Same check against the log directory |
| databricks.auth | has_valid_auth + OAuth token mint in App mode |
| databricks.warehouse | SELECT 1 against the configured warehouse |
| databricks.cloudfetch | Real SQL probe with use_cloud_fetch=True; cached 5 min |
| registry.cfg | Resolved catalog / schema / volume |
| registry.volume_read | VolumeFileService.list_directory on the registry volume |
| registry.volume_write | Write + delete of a sentinel file via the Files API |
| registry.uc_schema_ddl | `CREATE OR REP... |
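
The "timed and wrapped" pattern used by the probes can be illustrated as follows. Only the name `_safely_run` appears in the notes; the signature, the result keys (`status`, `error`, `duration_ms`), and the aggregation in `run_health` are assumptions for this sketch.

```python
import time
from typing import Any, Callable, Dict

Probe = Callable[[], Dict[str, Any]]


def safely_run(name: str, probe: Probe) -> Dict[str, Any]:
    """Run one probe, time it, and convert any exception into an error entry."""
    started = time.perf_counter()
    try:
        result: Dict[str, Any] = {"status": "ok", **probe()}
    except Exception as exc:  # a failing probe must not break the response
        result = {"status": "error", "error": f"{type(exc).__name__}: {exc}"}
    result["name"] = name
    result["duration_ms"] = round((time.perf_counter() - started) * 1000, 1)
    return result


def run_health(probes: Dict[str, Probe]) -> Dict[str, Any]:
    """Run every probe and summarise: 'ok' only if all checks are ok."""
    checks = [safely_run(name, probe) for name, probe in probes.items()]
    overall = "ok" if all(c["status"] == "ok" for c in checks) else "error"
    return {"status": overall, "checks": checks}


def failing_probe() -> Dict[str, Any]:
    raise PermissionError("volume not writable")


# Hypothetical probes standing in for the real checks.
report = run_health({
    "runtime": lambda: {"python": "3.x"},
    "registry.volume_write": failing_probe,
})
```

Because each probe returns its own entry, the response still lists every check even when one of them raises.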

v0.2.1

06 May 17:39
0fb1baf

v0.2.1 Pre-release

HF0.2.1

v0.2.0

01 May 16:52
8b875d2

v0.2.0 Pre-release

OntoBricks — Release Notes V0.2.0

Release window: May, 2026
Test status: all changes shipped with the suite green (≥ 1892 passing).


Highlights

  • New end-to-end Permissions model: app-level perms come from Databricks, domain-level perms from the Teams matrix, with a 4-step refactor (declarative guards, body data-* attrs, CSS gating) and a hardened Viewer / read-only role across every ontology and mapping widget.
  • Graph Chat (formerly Digital Twin): natural-language chat with the knowledge graph, now session-aware and stable behind the deployed reverse proxy.
  • New in-app Help Center accessible from the navbar, including a Starter Guide, Workflow / FAQ accordions, a Data Access / GraphDB engine map (LadybugDB as default), and a refreshed About page.
  • Lakebase registry backend wired end-to-end.
  • Databricks dev sandbox bundle (databricks.yml): deploys ontobricks-020 (main UI) and mcp-ontobricks (MCP); targets dev (Volume-only) and dev-lakebase (Volume + Lakebase Autoscaling postgres binding). Lakebase variables include lakebase_database_resource_segment (the db-… suffix from databricks postgres list-databases … -o json, not the Postgres datname) and lakebase_registry_schema (keep in sync with LAKEBASE_SCHEMA in app.yaml).
  • Deploy & bootstrap scripts aligned with the bundle: scripts/deploy.sh uses APP_NAME=ontobricks-020; make bootstrap-perms / make bootstrap-lakebase and the underlying shell scripts default to ontobricks-020, mcp-ontobricks, and the documented Lakebase project / schema-grant flow.
  • Major domain-switching robustness improvements (no more stale state, full-page loading overlay everywhere, including cross-domain bridges).
  • Security: patched two GitPython advisories (GHSA-rpm5-65cw-6hj4 and GHSA-x2qx-6953-8485 / CVE-2026-42284) by pinning gitpython>=3.1.47 via uv constraint — transitive vuln only, no code-path exposure.

Permissions & multi-tenant access control

  • App-level permissions now sourced from Databricks; domain-level permissions handled by the Teams matrix.
  • First-deploy bootstrap detects and fixes the app SP self-permission chicken-and-egg situation.
  • Viewer / read-only role:
    • Cascaded to all ontology and mapping widgets.
    • OWL preview no longer fails with "Unknown error" in read-only mode.
    • Belt-and-suspenders contextmenu blocker on design surfaces.
    • Gates data-source reset and all ontology / mapping imports.
  • Fixed Registry → Teams sub-menu leaking to non-admin users in the top navbar.
  • New three-level permission matrix tests + OWL endpoint contract tests.
  • 4-step permissions refactor: declarative guards, body data-* attributes, CSS-based gating.
  • Code-review fix-up: navbar role-badge inline CSS moved into permissions.css.

Graph Chat (renamed from Digital Twin)

  • Natural-language chat over the knowledge graph.
  • Forwards X-Forwarded-* headers on loopback to fix a deployed 302 redirect issue.
  • All tools now use session-aware internal routes.
  • Code-review hardening pass: clean layering, consistent error handling, deduplication, class-first refactor.

In-app Help Center

  • New navbar Help icon opens a modal with comprehensive documentation.
  • Refreshed About page to reflect the current product scope.
  • Visual pass:
    • Palette switched from blue to red/black (solid red, no gradients).
    • OntoBricks logo used in title and welcome hero.
    • Modal height locked (no resize when switching menu items).
    • Fixed double vertical scrollbar in tall sections (Starter Guide).
    • Removed horizontal scroll on the Welcome pipeline.
    • Removed grey borders on Workflow / FAQ accordions.
  • Starter Guide:
    • Added optional "Import Documents" step.
    • Rewrote the mapping step (manual or Auto-Map).
  • Added Data Access engine-map documentation, then generalized it to GraphDB (LadybugDB as default engine).

Domain switching

  • Fixed stale session state leaking between domain switches — DomainSession.import_from_file now fully resets ontology, assignment, design layout, domain info, metadata, and triplestore before overlay.
  • Full-page "Loading {domain}…" overlay now appears for:
    • Graph switcher modal.
    • Bridge-based switches via URL parameters.
    • Cross-Domain Bridge links going through /resolve (server-side redirect).

UI / UX fixes

  • Build sub-menu: fixed unreadable "Mapping" stale-indicator badge.
  • Build sub-menu: stopped reporting "Loaded" for the Graph DB digital twin when nothing had actually been built.
  • Sidebar: fixed the "Teams" icon misalignment.
  • Cockpit: the Active Version tile now reflects the version exposed via API/MCP (the one set in Registry → Browse), not merely the latest version on disk, with a (not loaded) hint when the loaded version differs. is_active keeps its legacy is_latest meaning so the read-only body class still gates writes correctly.
  • Navbar: the Domain name and version in the top navbar now refresh reliably after every domain mutation (new domain, load from registry, save / rename, version switch / create / rollback, file import). The /navbar/state sessionStorage cache (15 s TTL) was previously surviving window.location.reload(), so the navbar could display the previous domain identity for up to 15 s. Every mutation flow now invalidates the cache before navigating; in-place edits (e.g. saving Domain Information) re-fetch the navbar state immediately.
  • Domain → Versions: the API/MCP “Active” control is no longer a toggle on this page — it is shown as a read-only badge; changing the active version is done only from Registry → Browse (consistent with registry-centric operations).
  • Domain creation: Save to UC is now blocked when the chosen Domain Name already exists in the registry. The duplicate-name check (/domain/check-name) was already running on every keystroke of the name field, but its result was only advisory — the navbar's Save action still POSTed and the user only saw the conflict after a round-trip. The Save flow now re-runs the check synchronously and refuses with a clear notification + focuses the offending field.

Documentation

  • README, docs/features.md, docs/INFO.md, docs/user-guide.md, docs/get-started.md, docs/README.md, and docs/mcp.md updated so operator-facing text matches the above: Ontology Designer, Domain Cockpit Active Version vs loaded vs latest, Registry → Browse for MCP/API active version, new-domain loading overlay, Digital Twin path refresh on committed name/version changes, duplicate-name guard, and navbar identity refresh.
  • docs/deployment.md rewritten for the current DAB: dev / dev-lakebase targets, correct bundle deployment bind / bundle run resource keys and app names, scripts/deploy.sh flags (no legacy --all / --mcp-only), Lakebase variable summary, Step 5b for bootstrap-lakebase-perms.sh, full deployment checklist, MCP and troubleshooting sections, and §9 DAB reference aligned with the Makefile.
  • README Lakebase paragraph: documents lakebase_database_resource_segment and the list-databases lookup pattern.

Tasks & Notifications

  • Tasks panel now shows only currently running tasks; finished tasks are moved to the Notifications drawer.

Backend & Databricks Apps bundle (operator-facing)

  • Lakebase registry backend wired end-to-end (runtime + optional Volume toggle unchanged).
  • databricks.yml: ontobricks_dev_app / mcp_ontobricks_app resource keys; workspace app names ontobricks-020 and mcp-ontobricks; dev-lakebase target adds the Apps postgres resource whose database path ends with lakebase_database_resource_segment (db-… from the Postgres API name field).
  • scripts/deploy.sh: default target dev-lakebase; APP_NAME set to ontobricks-020 so post-deploy bootstrap-app-permissions.sh and bootstrap-lakebase-perms.sh resolve the correct service principal.
  • scripts/bootstrap-lakebase-perms.sh: default Lakebase project ontobricks-app, default Postgres DB ontobricks_registry (dedicated datname aligned with the bundle bind), schema ontobricks_registry; default grantees ontobricks-020 and mcp-ontobricks. Use -d databricks_postgres if the registry schema still lives in the shared default DB. Retarget with -i / -d / -s / -a when your workspace differs.
  • scripts/bootstrap-app-permissions.sh: default app list ontobricks-020 mcp-ontobricks (matches the bundle).

Security

  • Patched two GitPython advisories pulled in transitively via mlflow-skinny:
    • GHSA-rpm5-65cw-6hj4 — command injection via upload_pack / receive_pack kwargs on Repo.clone_from, Remote.fetch, Remote.pull, Remote.push (affected [3.1.30, 3.1.47)).
    • GHSA-x2qx-6953-8485 / CVE-2026-42284 — argument injection via multi_options shlex.split bypass in _clone() / Submodule.update (affected <= 3.1.44).
  • Both fixed by adding gitpython>=3.1.47 to [tool.uv].constraint-dependencies in pyproject.toml; lockfile bumped gitpython 3.1.46 → 3.1.47. OntoBricks itself does not import git anywhere, so there is no code-path exposure — this only closes the SCA finding on the lockfile / installed env.

Upgrade notes

  • Databricks Apps sandbox name: if you still point scripts or docs at ontobricks-dev, switch to ontobricks-020 (the name in databricks.yml for ontobricks_dev_app) for databricks apps get, bootstrap-app-permissions.sh, and bootstrap-lakebase-perms.sh -a …, or pass -a explicitly.
  • Lakebase bundle variables: the monolithic branc...

V0.1.1

23 Apr 13:21

V0.1.1 Pre-release

This is the first official release.