Wire SQL-AST column-lineage inference into the live pipeline draft preview #9817

Description

@rubenfiszel

Follow-up to #9814 (column-level lineage for DuckLake pipelines).

Context

#9814 ships automatic column-lineage inference from the DuckDB SQL AST (windmill-parser-sql-asset::infer_column_lineage). It runs server-side — at the asset-graph endpoint and at deploy via parse_assets — so inferred column lineage shows on deployed pipeline members. That is the case that matters and it works today.

Unsaved drafts currently show only their // column annotations, not inferred lineage, for two reasons:

  1. The live preview's body inference (frontend/src/lib/infer.ts → parse_assets_sql) uses the published, pinned windmill-parser-wasm-asset package (1.728.1 in frontend/package.json), which predates the inference and so doesn't emit column_lineage.
  2. Even with a fresh WASM, nothing on the live path consumes it: inferAssets callers read only .assets, and resolveGraph derives column lineage solely from the TS annotation parser (parsePipelineAnnotations).

This was a deliberate scope decision — deployed pipelines are where column lineage is inspected/trusted, and drafts keep the annotation path working — but the gap is worth closing for edit-time feedback.

Scope

  1. Rebuild + npm publish a new windmill-parser-wasm-asset from current Rust source (so the WASM parse_assets_sql serializes column_lineage), and bump the pin in frontend/package.json. Same manual release flow as the other windmill-parser-wasm-* packages.
  2. In resolveGraph (and/or the live overlay in the pipeline +page.svelte), merge the WASM-inferred column_lineage for the edited DuckDB script with its // column annotations — reusing the same merge_column_lineage precedence (annotation wins) the backend already applies, so the live preview matches what deploys.
  3. Expose column_lineage from inferAssets's result type so callers can read it (today only .assets is surfaced).
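
A minimal sketch of scope items 2 and 3, assuming a per-output-column edge shape. Every name here (ColumnRef, ColumnLineageEdge, InferAssetsResult, mergeColumnLineage, liveColumnLineage) is hypothetical: the real column_lineage schema and the backend's merge_column_lineage are not shown in this issue, so treat this only as an illustration of the "annotation wins" precedence on the live path.

```typescript
// Hypothetical shapes; the actual column_lineage schema may differ.
interface ColumnRef {
  asset: string;
  column: string;
}

interface ColumnLineageEdge {
  output: string; // output column name
  sources: ColumnRef[]; // upstream columns feeding it
}

// Assumed result type once column_lineage is surfaced next to .assets.
// It is optional so an older WASM build that omits the field still type-checks.
interface InferAssetsResult {
  assets: string[];
  column_lineage?: ColumnLineageEdge[];
}

// Merge inferred and annotated edges keyed by output column.
// Annotated edges are written last, so they replace any inferred edge
// for the same output column (annotation wins).
function mergeColumnLineage(
  inferred: ColumnLineageEdge[],
  annotated: ColumnLineageEdge[]
): ColumnLineageEdge[] {
  const byOutput = new Map<string, ColumnLineageEdge>();
  for (const edge of inferred) byOutput.set(edge.output, edge);
  for (const edge of annotated) byOutput.set(edge.output, edge);
  return Array.from(byOutput.values());
}

// Live-path helper: falls back to annotations only when the parser
// result carries no column_lineage.
function liveColumnLineage(
  result: InferAssetsResult,
  annotated: ColumnLineageEdge[]
): ColumnLineageEdge[] {
  return mergeColumnLineage(result.column_lineage ?? [], annotated);
}
```

With this shape, an inferred edge for total (from amount and tax) would be kept when no annotation names total, and replaced wholesale when one does. Whether the backend merges at the edge level or the source level should be checked against merge_column_lineage before mirroring it in the frontend.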

Acceptance

  • Editing a DuckDB pipeline script with a computed/passthrough projection (e.g. SELECT amount + tax AS total FROM orders) shows the inferred column-lineage badge + ColumnLineageView diagram before deploy, matching the deployed result.
  • A // column annotation still overrides the inferred edge for that output column, live.
  • Parser parity (parsePipelineAnnotations.parity.test.ts) and the existing inference unit tests stay green.

Non-goals

  • The dbt docs serve-style static lineage site (separate TODO in docs/pipelines-vs-dbt.md §3).
  • Inference for non-DuckDB languages (no SQL AST; annotation-only by design).
