Build-cache data mart + MetricFlow semantic layer (10 measures + KPI) #2

Merged: zozo123 merged 3 commits into main from feat/build-cache-data-mart, Jun 22, 2026

@zozo123 zozo123 commented Jun 22, 2026

Collaborator

Closes #1.

Adds the data mart and governed semantic layer on top of the staging foundation.

What's here

  • Marts: dim_agents, fct_build_tasks (task grain — the cache facts), fct_builds (build grain rollup).
  • Contract: fct_builds has an enforced contract and access: public, so schema breaks fail at build time.
  • Semantic layer (MetricFlow): 10 governed measures + the headline KPI cache_hit_rate and supporting metrics (remote_execution_rate, cpu_hours_saved, cpu_efficiency, gb_served_from_cache), sliceable by tenant/project/build_type/region/agent_class/task_kind/time.
  • Tests: PK unique+not_null, FK relationships, and a singular invariant (cache_hits <= cacheable_tasks, rate ∈ [0,1]).
  • CI: dbt build on DuckDB → dbt parse → mf query the KPI, plus an agent-guardrail check that flags execution-plane changes per ADR 0001.

Verified locally

  • dbt build → PASS=35, 0 errors.
  • mf query --metrics cache_hit_rate,cpu_hours_saved,remote_execution_rate → 0.6898, 264.16, 0.7562.

Confinement note

Pure SQL/YAML — no macros, hooks, run-operation, Python models, or packages.yml. Per ADR 0001 this sits in the branch + DB role + CI zone; no sandboxed execution required.

zozo123 and others added 3 commits June 22, 2026 18:01
…+ KPI)

Adds the marts and semantic layer on top of staging:
- dim_agents, fct_build_tasks (task grain, semantic), fct_builds (build grain)
- enforced contract on the public fct_builds mart
- MetricFlow semantic model with 10 governed measures
- KPI cache_hit_rate (+ remote_execution_rate, cpu_hours_saved, cpu_efficiency,
  gb_served_from_cache) and a metricflow time spine
- singular business-invariant test (cache_hits <= cacheable_tasks; rate in [0,1])
- CI: dbt build on DuckDB + KPI resolution via MetricFlow + agent-guardrail check

Closes #1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s sources

dbt has no dependency edge from a source() to the seed that populates it, so a
parallel `dbt build` on a fresh DB races the staging views ahead of the seed
load. Load seeds first, then `dbt build --exclude resource_type:seed`, on a
persistent file target (DBT_TARGET=ci) shared across steps incl. MetricFlow.
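
The seed-first ordering described above could be scripted roughly as follows. This is a minimal sketch, not the PR's actual workflow file: it assumes `dbt` and the MetricFlow CLI (`mf`) are on PATH and that the profile's default target points at a persistent DuckDB file. The `run` helper and the `DRY_RUN` switch are hypothetical, added only so the ordering can be inspected without a warehouse. Target selection is deliberately left out.

```shell
#!/usr/bin/env bash
# Sketch of a seed-first CI sequence (assumes dbt and mf are installed).
set -euo pipefail

# DRY_RUN is a hypothetical knob, not part of the PR:
# "1" (default) prints each command, "0" executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Load seeds on their own so the tables behind source() exist first.
run dbt seed
# 2. Build everything else; seeds are excluded because step 1 loaded them.
run dbt build --exclude resource_type:seed
# 3. Parse the project so MetricFlow reads an up-to-date semantic manifest.
run dbt parse
# 4. Resolve the headline KPI through the semantic layer.
run mf query --metrics cache_hit_rate
```

Running it with `DRY_RUN=0` executes the four steps in order, and `set -e` stops the sequence at the first failing step, so the build never starts against an unseeded database.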

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dbt has no built-in DBT_TARGET env var, so `env: DBT_TARGET: ci` was a no-op and
CI actually ran on the dev target while the ci: output sat unused — misleading
config that passed only by luck. MetricFlow can't select a target either (no
--target flag); it uses the profile default. So consolidate on a single
persistent dev target that seed/build/parse/mf all share, and document the
warehouse target as a commented example.

Found via adversarial review workflow. No behavior change to a passing build.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@zozo123 zozo123 merged commit 80f4385 into main Jun 22, 2026
2 checks passed
@zozo123 zozo123 deleted the feat/build-cache-data-mart branch June 22, 2026 15:39

Development

Successfully merging this pull request may close these issues.

Model build-cache telemetry into a data mart with a semantic layer (10 measures + KPI)
