Unified mapper data model PR2a: write-side wiring + AI Assistant payload v2 + match_type removal #17

Merged
snyaggarwal merged 4 commits into main from issues#2337-pr2a on May 11, 2026

@paynejd paynejd commented May 11, 2026

⚠️ Deploy to STAGING ONLY before prod

This PR contains the bridge-recommendation bug fix on the AI Assistant payload side, but the bridge code path can't be exercised locally (the bridge module is closed-source — BridgeMatchStub.jsx returns canBridge: () => false in OSS builds). Please deploy to staging and run a full bridge-using project end-to-end before promoting to prod.

The Option A safety means nothing user-visible breaks even if the bridge wiring has a bug — the new normalized state is dark-launched and the legacy candidates field stays in the AI payload. But staging verification is the only way to confirm the new dark-launched paths are correct before PR2b flips reads onto them.
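
The dark-launch gating described above can be sketched roughly as follows. This is a minimal illustration, not the code in MapProject.jsx: `handleBulkResponse`, `legacyState`, and the stand-in `normalize` and `merge` callbacks are hypothetical names. Only `UNIFIED_MODEL_ENABLED` and the idea of a normalizer plus a merge step come from this PR.

```javascript
// Hypothetical sketch of dark-launch gating. Every name except
// UNIFIED_MODEL_ENABLED is an illustrative assumption.
const UNIFIED_MODEL_ENABLED = false; // default-OFF, as in this PR

function handleBulkResponse(rowIndex, response, deps, flag = UNIFIED_MODEL_ENABLED) {
  const { legacyState, rowMatchState, normalize, merge } = deps;

  // The legacy write always happens, so user-visible behavior is unchanged.
  legacyState[rowIndex] = response.results;

  // The new normalized state is only written when the flag is on.
  if (flag) {
    merge(rowMatchState, rowIndex, normalize(response));
  }
}

// Usage with stand-in normalizer and merge implementations.
const gatingDeps = {
  legacyState: {},
  rowMatchState: {},
  normalize: (response) => response.results.map((r) => ({ code: r.id })),
  merge: (state, rowIndex, normalized) => { state[rowIndex] = normalized; },
};
handleBulkResponse(0, { results: [{ id: '718-7' }] }, gatingDeps);       // flag OFF
handleBulkResponse(1, { results: [{ id: '718-7' }] }, gatingDeps, true); // flag ON
```

With the flag OFF only the legacy state is touched, which is why a bug in the new path should not surface to users until reads are flipped in PR2b.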

Summary

PR2a of the row-scoped, canonical-URL-identified candidate/concept refactor for OCL Mapper. Spec: unified-mapper-model.md.

References OpenConceptLab/ocl_issues#2337 (does not yet close — PR2a of 4: PR1 merged as #16, PR2b and PR3 still pending).

What's in this PR

Four commits, dark-launched (UNIFIED_MODEL_ENABLED = false stays default-OFF):

  1. 1d0dfec — match_type removal (-95 LOC). Removes the legacy 5-bucket match_type enum (very_high/high/medium/low/no_match), matchTypes state, selectedMatchBucket filter UI, updateMatchTypeCounts, onMatchTypeChange, and the Auto Match Switch + count Badge. The 3-bucket score grouping (recommended/available/low_ranked) driven by candidatesScore thresholds — already used by ScoreBucketButton — is the replacement. setStateViews's auto-match path is converted from match_type === 'very_high' to search_normalized_score >= candidatesScore.recommended, matching the existing setAutoMatched threshold. __match_type__ and __Match Type__ stay in the save-format omit lists as defensive cleanup against legacy data.

  2. 0b881ef — bridge wiring. New CONCEPT_IDENTITY_BY_TYPE map in algorithms.jsx as single source of truth (covers ocl-search, ocl-semantic, ocl-bridge, ocl-ciel-bridge). getAlgoDef injects concept_identity from the map when an algo (typically API-loaded bridge variants) is missing it. buildProjectContext now includes bridge_repo derived from algo.target_repo_url (relative URL → derived canonical_url; PR2b will read explicit canonical from bridge repo metadata once ConfigurationForm carries it). fetchBulkBridgeCandidates callback adds the normalizer + mergeIntoRowMatchState wiring (gated by the flag); the per-row path was already routed through onResponse so it just needed the concept_identity injection.

  3. b29d490 — scispacy wiring. Adds 'ocl-scispacy' to CONCEPT_IDENTITY_BY_TYPE (reference_source: 'fixed', canonical_url: 'http://loinc.org'). The single-row path was already routed through onResponse and the inline fromScispacyResultsToConcepts transform was already there; only the bulk path callback needed normalizer wiring (mirrors the bridge bulk pattern from commit 2, 0b881ef).

  4. 8aafd4e — AI Assistant payload v2 (Option A: additive) [the bridge-recommendation bug fix]. New buildV2RecommendationPayload helper inline before fetchRecommendation. Iterates selectedAlgoIds, runs normalizeAlgorithmInvocation per algo against allCandidates for the row (sourced from legacy state — works regardless of feature flag), aggregates with richer-wins dedup, then projects into:

    • target_repo: canonical_url + relative_url + version
    • recommendable_concepts: deduped target-repo concepts with per-source evidence[] including bridge provenance via via.bridge_concept_key + map_type
    • bridge_context: bridge intermediaries with target_concept_keys pointing back to recommendable entries they justify

    fetchRecommendation payload spreads v2 fields alongside the legacy candidates field with payload_version: 'v2' so the prompt template can branch. AICandidatesAnalysis.jsx and the aiCandidateID export read canonical_reference.code first, fall back to legacy concept_id/id. Bridges are now structurally excluded from the recommendation pool — once the prompt template revision lands and reads recommendable_concepts instead of candidates, the bridge-recommendation bug is fixed.
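
The projection in commit 4 can be sketched as below. This is a simplified illustration under stated assumptions, not the actual buildV2RecommendationPayload: the input shape (`invocations` with `{ concept, candidate }` entries), the `richness` heuristic, the `'direct'` candidate type, and all sample data are hypothetical. The output field names (`target_repo`, `recommendable_concepts`, `evidence`, `via`, `bridge_context`, `target_concept_keys`, `payload_version`) follow the description above.

```javascript
// "Richer wins": prefer the duplicate with more populated fields (assumed heuristic).
const richness = (c) => Object.values(c).filter((v) => v != null && v !== '').length;

function buildV2Payload(targetRepo, invocations) {
  const concepts = new Map(); // concept key -> richest concept seen so far
  const evidence = new Map(); // concept key -> evidence[]
  const bridges = new Map();  // bridge concept key -> bridge_context entry

  for (const { algo, entries } of invocations) {
    for (const { concept, candidate } of entries) {
      if (candidate.type === 'bridge') {
        // Intermediaries are context only and never enter the recommendable pool.
        if (!bridges.has(concept.key)) {
          bridges.set(concept.key, { ...concept, target_concept_keys: [] });
        }
        continue;
      }
      const seen = concepts.get(concept.key);
      if (!seen || richness(concept) > richness(seen)) concepts.set(concept.key, concept);

      const item = { algo, candidate_type: candidate.type, score: candidate.score };
      if (candidate.type === 'bridge_child') {
        item.via = {
          bridge_concept_key: candidate.bridge_concept_key,
          map_type: candidate.map_type,
        };
      }
      evidence.set(concept.key, [...(evidence.get(concept.key) || []), item]);
    }
  }

  // Link each bridge intermediary to the target concepts it justifies.
  for (const [key, items] of evidence) {
    for (const item of items) {
      const bridge = item.via && bridges.get(item.via.bridge_concept_key);
      if (bridge && !bridge.target_concept_keys.includes(key)) {
        bridge.target_concept_keys.push(key);
      }
    }
  }

  return {
    payload_version: 'v2',
    target_repo: targetRepo,
    recommendable_concepts: [...concepts.values()].map((c) => ({
      ...c,
      evidence: evidence.get(c.key),
    })),
    bridge_context: [...bridges.values()],
  };
}

// Sample data is invented for illustration only.
const loincKey = 'http://loinc.org|718-7';
const bridgeKey = 'https://CIELterminology.org|21';
const v2Payload = buildV2Payload(
  { canonical_url: 'http://loinc.org', relative_url: '/orgs/example/sources/example/', version: 'v1' },
  [
    { algo: 'ocl-search', entries: [
      { concept: { key: loincKey, code: '718-7', display_name: 'Hemoglobin' },
        candidate: { type: 'direct', score: 0.93 } },
    ] },
    { algo: 'ocl-bridge', entries: [
      { concept: { key: bridgeKey, code: '21' }, candidate: { type: 'bridge', score: 0.88 } },
      { concept: { key: loincKey, code: '718-7' },
        candidate: { type: 'bridge_child', bridge_concept_key: bridgeKey, map_type: 'SAME-AS' } },
    ] },
  ],
);
```

In this sample the LOINC concept is found by two algorithms but appears once in `recommendable_concepts` with two evidence entries, while the bridge intermediary appears only in `bridge_context`.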

Verification done locally

  • npm test → 26/26 normalizer unit tests passing (covers concept_identity resolution for all three reference_source modes, bridge cascade fan-out + intra-invocation dedup, scispacy partial → richer dedup, multi-algo convergence)
  • npm run eslint → clean
  • NODE_ENV=production npm run build → webpack compiled (only pre-existing bundle-size warnings)
  • Smoke test against prod via local dev server: project loads, candidates retrieve (ocl-search, ocl-semantic, scispacy), AI Recommendation fires successfully, v2 fields present in payload (payload_version: 'v2', target_repo, recommendable_concepts correctly deduped, bridge_context: [] since bridge module unavailable in OSS build), recommendable_concepts.length ≤ legacy candidates.length confirming dedup, current prompt template still works (Option A's safety net), recommendation displays normally in UI
  • Match-type removal smoke: Auto Match Switch + count Badge are gone; ScoreBucketButton works; Score badges render with bucket-derived color
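
For reference, the 3-bucket grouping that replaces match_type can be sketched like this. The threshold values, the `available` threshold key, and the helper names `scoreBucket` and `isAutoMatch` are assumptions for illustration; the PR only confirms the bucket names, `candidatesScore.recommended`, and the `search_meta.search_normalized_score` field.

```javascript
// Threshold values are illustrative assumptions, not the project's defaults.
const candidatesScore = { recommended: 0.9, available: 0.5 };

// Map a normalized score onto one of the three buckets.
function scoreBucket(score, thresholds = candidatesScore) {
  if (score >= thresholds.recommended) return 'recommended';
  if (score >= thresholds.available) return 'available';
  return 'low_ranked';
}

// Auto-match decision that stands in for the old `match_type === 'very_high'` check.
function isAutoMatch(candidate, thresholds = candidatesScore) {
  const score = candidate?.search_meta?.search_normalized_score;
  return typeof score === 'number' && score >= thresholds.recommended;
}
```

Deriving both the bucket and the auto-match decision from the same thresholds is what removes the drift risk between the two classifications.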

Verification needed in staging (bridge path)

The bridge code path is dark-launched but couldn't be exercised in the OSS dev build. In staging please:

  1. Open a CIEL → LOINC bridge project
  2. Confirm Mapper still works as today with the flag OFF (no user-visible regression — Option A's safety means even if our bridge wiring is wrong, the legacy path is untouched)
  3. Temporarily flip UNIFIED_MODEL_ENABLED = true in MapProject.jsx:116
  4. Run candidates on a row that triggers the bridge algorithm
  5. Inspect rowMatchStateRef.current (e.g. via React DevTools or a temporary console.log in mergeIntoRowMatchState):
    • Bridge intermediary should appear as a ConceptRow with reference.url = the bridge repo's canonical URL (e.g. https://CIELterminology.org or the derived https://ns.openconceptlab.org/orgs/CIEL/sources/CIEL/) and a Candidate entry with type: 'bridge'
    • Each cascade target should appear as a separate ConceptRow with reference.url = target repo canonical (e.g. http://loinc.org) and Candidate entries with type: 'bridge_child', bridge_concept_key, parent_candidate_id, map_type
    • concept_definitions for cascade targets are lookup_status: 'pending' (will be filled by ensureLoaded in PR2b)
  6. Run AI Recommendation on the same row and inspect the payload — bridge_context[] should now be populated with each bridge intermediary and its target_concept_keys; recommendable_concepts[i].evidence[] for cascade targets should include entries with candidate_type: 'bridge_child' and via: { bridge_concept_key, map_type }
  7. Revert the feature flag change before doing anything else
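
The payload checks in step 6 could be scripted with a small helper like the one below, pasted into the browser console or a scratch file. `checkBridgePayload` is a hypothetical helper written for this note and is not part of the codebase; it only inspects the field names listed in the steps above.

```javascript
// Returns a list of human-readable problems; an empty list means the payload
// matches the expectations in step 6. Hypothetical helper, not project code.
function checkBridgePayload(payload) {
  const problems = [];
  const bridges = payload.bridge_context || [];

  if (payload.payload_version !== 'v2') problems.push('payload_version is not v2');
  if (bridges.length === 0) problems.push('bridge_context is empty');

  bridges.forEach((bridge, i) => {
    const keys = bridge.target_concept_keys;
    if (!Array.isArray(keys) || keys.length === 0) {
      problems.push(`bridge_context[${i}] has no target_concept_keys`);
    }
  });

  // Collect bridge-derived evidence across all recommendable concepts.
  const bridgeEvidence = (payload.recommendable_concepts || [])
    .flatMap((concept) => concept.evidence || [])
    .filter((item) => item.candidate_type === 'bridge_child');

  if (bridgeEvidence.length === 0) problems.push('no bridge_child evidence found');
  bridgeEvidence.forEach((item) => {
    if (!item.via || !item.via.bridge_concept_key || !item.via.map_type) {
      problems.push('bridge_child evidence is missing via.bridge_concept_key or via.map_type');
    }
  });

  return problems;
}
```

In an OSS build, where `bridge_context` is `[]`, this helper is expected to report problems; it is only meaningful against a staging project that exercises the bridge module.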

If bridge entries don't appear in rowMatchState or look malformed, the bridge response shape doesn't match what the normalizer expects (the spec's assumption from cascade_target_concept_code/url/name); fix is to extend the cascade extraction in normalizers.js and re-test.
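
As a starting point for that fix, a defensive cascade extraction might look like the sketch below. The actual bridge response shape is not known from the OSS build, so the `mappings` array and the `map_type` location are assumptions; only the `cascade_target_concept_code`, `cascade_target_concept_url`, and `cascade_target_concept_name` field names come from the spec. `extractCascadeTargets` is a hypothetical name.

```javascript
// Hypothetical sketch: pull cascade targets out of one bridge result.
// Assumes the result carries a `mappings` array; adjust once the real
// response shape has been observed in staging.
function extractCascadeTargets(bridgeResult) {
  const mappings = Array.isArray(bridgeResult?.mappings) ? bridgeResult.mappings : [];
  return mappings
    // Entries without a target code cannot be identified, so skip them.
    .filter((m) => m && m.cascade_target_concept_code)
    .map((m) => ({
      code: String(m.cascade_target_concept_code),
      url: m.cascade_target_concept_url ?? null,
      name: m.cascade_target_concept_name ?? null,
      map_type: m.map_type ?? null,
    }));
}
```

Returning an empty array for an unexpected shape keeps the row renderable while making the mismatch easy to spot in rowMatchState.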

Coordination needed in ocl-ai-assistant before PR3

The new payload v2 fields (recommendable_concepts, bridge_context) currently bypass the server-side _to_essential field-stripping at core/prompts/services.py:251, which today strips candidates, bridge_candidates, etc. Add the new field names to that allow-list when revising the prompt template; otherwise they are sent unstripped and inflate the token count.

The prompt-template revision (also separate scope) should branch on payload_version === 'v2' to read from recommendable_concepts/bridge_context instead of candidates. That's when the bridge-recommendation bug is structurally fixed.

Deferred to PR2b / PR3

  • PR2b — read-side flip (Candidates.jsx, Concept.jsx, Score.jsx, MapButton.jsx, setAutoMatched/setStateViews); ensureLoaded over $resolveReference (verified callable from existing APIService — see unified-mapper-model.md status table); MultiAlgoSelector canonical_url field for custom algos; ConfigurationForm namespace + bridge_repos[] UI; flip the feature flag ON.
  • PR3 — schema-v2 save format with normalizeLegacy.js; remove legacy allCandidates, lookupCandidates/lookupCode; drop the legacy candidates field from the AI payload; drop the concept_id/id fallback shims from the response handler (the request-side and response-side legacy compat get cleaned up together).

Test plan

  • npm install, npm test, npm run eslint, npm run build all green
  • Code review the four commits
  • Deploy to staging only
  • Bridge flag-on test in staging (steps in "Verification needed in staging" above)
  • After bridge verification + prompt-template revision lands in ai-assistant: deploy to prod
  • Don't merge PR2b or change the feature flag default until staging bridge verification passes

🤖 Generated with Claude Code

paynejd and others added 4 commits May 9, 2026 10:57
…es state

The legacy 5-bucket match_type enum (very_high/high/medium/low/no_match) is
superseded by the 3-bucket score grouping (recommended/available/low_ranked)
already driven by candidatesScore thresholds. Maintaining both invited drift
and added surface area for no benefit.

Removed:
- MATCH_TYPES constant in constants.jsx (and orphan AutoMatch/MediumMatch/
  LowMatch/NoMatch icon imports)
- matchTypes state and selectedMatchBucket state in MapProject.jsx
- updateMatchTypeCounts() and all its call sites
- onMatchTypeChange handler and the selectedMatchBucket filter in getRows
- The Badge + Switch UI for the very_high filter
- showMatchSummary (orphan) and orphan FormControlLabel/Switch/Badge/countBy/
  sum imports
- match_type read in Score.jsx (color now derived from bucketColor)
- Orphan setMatchTypes call inside setAutoMatched

Refactored:
- setStateViews now derives auto-match decisions from
  search_meta.search_normalized_score >= candidatesScore.recommended,
  matching the existing setAutoMatched threshold

The __match_type__ / __Match Type__ entries in the save-format omit lists
are kept as defensive cleanup against legacy data.

No behavior change visible to users beyond the removal of the very_high
filter Switch (which is replaced by the existing ScoreBucketButton
filtering on 'recommended').

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nified-model normalizer

Bridge algorithms (ocl-bridge / ocl-ciel-bridge) come from the OCL Online
API and don't carry the concept_identity config the normalizer needs.
Their per-row fetch path already routes through onResponse via the
fetchBridgeCandidates callback chain, but the bulk fetch path bypasses
onResponse with its own callback. Both paths now feed the normalizer
when UNIFIED_MODEL_ENABLED is true.

algorithms.jsx:
- New CONCEPT_IDENTITY_BY_TYPE export — single source of truth for
  per-algo-type concept identity, covering ocl-search, ocl-semantic,
  ocl-bridge, ocl-ciel-bridge. Bridge entries declare reference_source:
  'bridge_repo' for the intermediary plus a cascade_target block that
  resolves the cascade to target_repo.
- useAlgos now references the map for the inline ocl-search /
  ocl-semantic concept_identity (no behavior change; deduplication).

MapProject.jsx:
- getAlgoDef injects concept_identity from CONCEPT_IDENTITY_BY_TYPE when
  the algo (typically API-loaded bridge variants) doesn't carry it.
- buildProjectContext now includes bridge_repo when a bridge algo is
  selected, derived from algo.target_repo_url. PR2b will read explicit
  canonical from bridge repo metadata once ConfigurationForm carries it.
  Reordered so it sits below bridgeAlgo in the component body to satisfy
  TDZ for the new useCallback dep.
- fetchBulkBridgeCandidates: callback adds the
  normalizeAlgorithmInvocation + mergeIntoRowMatchState wiring (gated by
  UNIFIED_MODEL_ENABLED) so the bulk path mirrors what onResponse does
  on the per-row path.

Feature flag remains OFF by default; this is dark-launch scaffolding.
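
A rough sketch of the concept_identity injection described in this commit is shown below. The map keys, the 'bridge_repo' value, and the cascade_target block come from the commit message. The 'target_repo' value for the search algorithms, the contents of cascade_target, the `algo.type` lookup key, and the `withConceptIdentity` name are assumptions for illustration.

```javascript
// Illustrative map; the real CONCEPT_IDENTITY_BY_TYPE lives in algorithms.jsx
// and may carry more fields than shown here.
const CONCEPT_IDENTITY_BY_TYPE = {
  'ocl-search': { reference_source: 'target_repo' },
  'ocl-semantic': { reference_source: 'target_repo' },
  'ocl-bridge': {
    reference_source: 'bridge_repo',
    cascade_target: { reference_source: 'target_repo' },
  },
  'ocl-ciel-bridge': {
    reference_source: 'bridge_repo',
    cascade_target: { reference_source: 'target_repo' },
  },
};

// Inject concept_identity only when the algo definition does not carry one.
function withConceptIdentity(algo) {
  if (!algo || algo.concept_identity) return algo;
  const identity = CONCEPT_IDENTITY_BY_TYPE[algo.type];
  return identity ? { ...algo, concept_identity: identity } : algo;
}
```

Returning a copy rather than mutating the API-loaded definition keeps the injected default from leaking back into shared algorithm state.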

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… unified-model normalizer

The single-row scispacy path already routed through onResponse and the
inline fromScispacyResultsToConcepts transform was already inside
onResponse. The two missing pieces:

- concept_identity for ocl-scispacy: added to CONCEPT_IDENTITY_BY_TYPE
  with reference_source='fixed', canonical_url='http://loinc.org',
  code_field='id'. getAlgoDef injects it on lookup since the scispacy
  algo definition comes from the OCL Online API and doesn't carry its
  own concept_identity.
- Bulk path (fetchBulkScispacyCandidates) had its own callback that
  bypassed the normalizer. Added the normalizeAlgorithmInvocation +
  mergeIntoRowMatchState wiring (gated by UNIFIED_MODEL_ENABLED) to
  mirror what we did for the bulk bridge path in 0b881ef.

Feature flag remains OFF.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n A: additive)

Add the v2 payload structure alongside the legacy `candidates` field in
fetchRecommendation, sourced from allCandidates via the unified-model
normalizer (so it works with the feature flag OFF). The current prompt
template ignores the new fields and continues working unchanged. Once
the prompt template is revised to read recommendable_concepts /
bridge_context, the bridge-recommendation bug is fixed structurally:
bridges live in bridge_context (never recommendable), and target-repo
concepts in recommendable_concepts are deduped across algorithms with
per-source evidence.

MapProject.jsx:
- New buildV2RecommendationPayload(rowIndex) helper inline before
  fetchRecommendation. Iterates selectedAlgoIds, runs
  normalizeAlgorithmInvocation per algo against allCandidates for the
  row, aggregates with richer-wins dedup, then projects into:
    - target_repo: from buildProjectContext (canonical_url + version)
    - recommendable_concepts: deduped target-repo concepts with
      per-source evidence[] including bridge provenance via 'via'
    - bridge_context: bridge intermediaries with target_concept_keys
      pointing back to recommendable_concepts entries they justify
- fetchRecommendation payload spreads v2 fields when constructible,
  with payload_version: 'v2' so the prompt template can branch.
- aiCandidateID export now reads canonical_reference.code first, falls
  back to legacy concept_id (for the period both prompt-template
  versions may be in flight).

AICandidatesAnalysis.jsx:
- getAlternateIds() and the primary_candidate display read
  canonical_reference.code first, fall back to legacy concept_id/id.
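
The read order used by both shims can be sketched as a single lookup. `candidateCode` is a hypothetical name used here for illustration; the real logic lives in aiCandidateID and getAlternateIds and may differ in detail.

```javascript
// Prefer the v2 canonical reference, then fall back to the legacy fields.
function candidateCode(candidate) {
  return candidate?.canonical_reference?.code
    ?? candidate?.concept_id
    ?? candidate?.id
    ?? null;
}
```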

The legacy `candidates` field, the concept_id fallback shims, and
payload_version itself can all be removed in PR3 alongside the other
legacy-shape cleanup once the prompt template revision is stable.

Note for ocl-ai-assistant coordination: the server-side _to_essential
allow-list at core/prompts/services.py:251 should add
'recommendable_concepts' and 'bridge_context' so the new fields get the
same field-stripping pass as 'candidates' / 'bridge_candidates'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@paynejd paynejd requested a review from snyaggarwal May 11, 2026 10:02
@snyaggarwal snyaggarwal merged commit a0d0185 into main May 11, 2026
1 check passed