Allow Option.insights to reference findings (not only prior_insights) #70

Description

@cailmdaley

Problem

Option.insights cannot currently reference items declared in findings:. The semantic validator at ASTRA/src/astra/validation/semantic.py:394-401 resolves Option.insights against prior_insights: only and rejects finding IDs with:

Option insight '{insight_ref}' not found in prior_insights

The Option.insights slot itself is stringly-typed in astra-spec/analysis.yaml (no explicit range:), so the constraint is purely at the validator layer.

Why it should be possible

The epistemic graph of an analysis includes finding → decision edges. "We observed X; therefore we chose Y among the alternatives" is a basic scientific move. When it happens inside a single analysis unit, the schema has no way to represent it. The reproducibility and composability guarantees ASTRA promises are incomplete at exactly the decision site where a reader most needs them: a graph extractor cannot answer "what in-analysis findings motivated this decision?" as a structural query.

Minimum example

Consider an inference analysis that fits cosmological parameters. A nuisance parameter (the galaxy-bias second-order coefficient b2) turns out to be prior-dominated, and its prior choice meaningfully shapes the final σ8 uncertainty through the degeneracy structure. This pattern is ubiquitous in likelihood-based cosmology.

outputs:
  - id: b2_posterior_with_s8
    type: figure
    description: "2D joint posterior on b2 and σ8, with b2 1D marginal against its prior range."

findings:
  b2_prior_dominated_degenerate:
    claim: >-
      The b2 posterior is flat across its prior [0, 5]; the data provides no constraint.
      The 2D joint posterior with σ8 shows an anti-correlation of r = -0.7, so the b2
      prior range directly sets how much uncertainty propagates into σ8.
    evidence:
      - artifact: b2_posterior_with_s8

decisions:
  b2_prior_choice:
    label: "Prior on the galaxy-bias second-order coefficient b2"
    rationale: >-
      The data does not inform b2. The choice of prior therefore controls how
      much uncertainty enters the σ8 posterior through the b2–σ8 degeneracy. The
      prior should be as informative as external knowledge licenses: no wider,
      no narrower.
    options:
      gaussian_from_sims:
        label: "Gaussian prior from hydrodynamical simulations of matching galaxy samples"
        insights: [b2_prior_dominated_degenerate]   # ← currently rejected by validator
        description: >-
          σ_b2 ≈ 0.3 from N-body + HOD measurements of samples matching our selection.
          Tightest defensible prior; directly addresses the degeneracy-driven σ8 inflation.
      flat_hod_motivated:
        label: "Flat prior [1, 3], physically motivated range from HOD literature"
        insights: [b2_prior_dominated_degenerate]   # ← currently rejected by validator
        description: >-
          Less informative than the Gaussian but uses no model-dependent sim inputs.
      flat_wide_0_5:
        label: "Keep the current flat [0, 5] prior"
        excluded_reason: >-
          Unphysically wide; inflates σ8 uncertainty through the b2–σ8 degeneracy
          without reflecting actual prior knowledge of galaxy bias.

All three entities belong in a single inference analysis. The finding is produced by this analysis (you ran the fit, observed that b2 was prior-dominated and degenerate with σ8). The decision genuinely rests on the finding: without the degeneracy observation, prior choice would be cosmetic. No circularity: the finding comes from the initial flat-prior run, the decision selects a new prior, the canonical run uses the new prior.
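For illustration, the structural query mentioned above ("what in-analysis findings motivated this decision?") could look roughly like the sketch below once finding references are allowed. It assumes the analysis file has already been loaded into a plain dict shaped like the YAML example; the function name `findings_motivating` is hypothetical and not part of ASTRA.

```python
from __future__ import annotations


def findings_motivating(analysis: dict, decision_id: str) -> dict[str, list[str]]:
    """Map each option of a decision to the in-analysis findings it cites.

    Hypothetical helper: assumes `findings` and `decisions` are mappings
    keyed by ID, as in the YAML example in this issue.
    """
    finding_ids = set(analysis.get("findings") or {})
    options = analysis["decisions"][decision_id].get("options") or {}
    return {
        option_id: [ref for ref in (option.get("insights") or []) if ref in finding_ids]
        for option_id, option in options.items()
    }


# Trimmed-down version of the example above (only the fields the query reads).
example = {
    "findings": {"b2_prior_dominated_degenerate": {}},
    "decisions": {
        "b2_prior_choice": {
            "options": {
                "gaussian_from_sims": {"insights": ["b2_prior_dominated_degenerate"]},
                "flat_hod_motivated": {"insights": ["b2_prior_dominated_degenerate"]},
                "flat_wide_0_5": {},
            }
        }
    },
}

motivating = findings_motivating(example, "b2_prior_choice")
print(motivating)
```

Today this query cannot return anything for a valid file, because any file that contains such a reference fails validation.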

Proposed fix

  • Semantic validator resolves Option.insights against the union of prior_insights: and findings: rather than prior_insights: alone.
  • Downstream tools (Vellum card chips, CLI renderers) accept both, optionally with a visual distinction showing which bag (prior_insights: or findings:) a reference came from.
  • Narrative-authoring conventions document when a finding is the appropriate justification source.
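A minimal sketch of the first bullet, to make the intended behavior concrete. This is not the current code in semantic.py: the function name, the dict-based input, and the extended error message are all assumptions, and it further assumes that prior_insights: is a mapping keyed by ID in the same way findings: is in the example.

```python
from __future__ import annotations


def unresolved_option_insights(analysis: dict) -> list[str]:
    """Return one error message per Option.insights ref found in neither bag.

    Hypothetical stand-in for the validator check; resolves references
    against the union of prior_insights and findings.
    """
    prior_ids = set(analysis.get("prior_insights") or {})
    finding_ids = set(analysis.get("findings") or {})
    known = prior_ids | finding_ids  # the proposed change: union, not prior_insights alone

    errors = []
    for decision in (analysis.get("decisions") or {}).values():
        for option in (decision.get("options") or {}).values():
            for ref in option.get("insights") or []:
                if ref not in known:
                    # Message wording is illustrative, extended from the current one.
                    errors.append(
                        f"Option insight '{ref}' not found in prior_insights or findings"
                    )
    return errors
```

One design question the sketch leaves open is what should happen when the same ID appears in both prior_insights: and findings:. A plain union resolves it silently; the validator may prefer to reject the collision.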

Happy to PR if this lands directionally.
