advisor: surface learning evidence gaps by sarvesh1327 · Pull Request #53 · sarvesh1327/advisor

sarvesh1327 · 2026-04-24T16:24:50Z

Summary

Start Phase 9 from docs/plans/2026-04-24-final-multiturn-real-learning-completion.md.
Add dashboard evidence badges per run for trajectory/reward/lineage presence.
Add machine-readable DB and artifact evidence counts to dashboard snapshots and validation gate output.
Add profile artifact evidence for active checkpoint id, adapter file presence, training manifest, and backend manifest.
Surface the same evidence through HTTP dashboard/activity routes and CLI validation/activity commands.

Testing

python -m pytest tests/agent/advisor/test_dashboard.py tests/agent/advisor/test_api.py tests/agent/advisor/test_cli.py -q
ruff check .
python -m pytest tests/agent/advisor -q
git diff --check

Advisor service remained stopped: port_8765=not_listening

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e0d6be8103

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-24T16:29:29Z

+        "outcomes": sum(1 for row in rows if row.get("outcome")),
+        "reward_labels": sum(1 for row in rows if row.get("reward_label")),
+        "lineages": sum(1 for row in rows if store.get_lineage(row.get("run_id")) is not None),
+        "trajectories": len(store.list_trajectories()),


Track trajectory coverage by run_id

trajectories is currently computed as the total number of trajectory records, but _evidence_blocking_reasons compares it against total run count. If one run has multiple trajectories while another has none, these counts can still match and missing_trajectories will not be raised, so the evidence surface can report readiness even though some runs are missing trajectory provenance. This should count runs with at least one trajectory (or otherwise evaluate coverage per run) instead of raw trajectory rows.
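A minimal sketch of the per-run counting suggested above. It assumes each run row and each trajectory record is a dict-like object carrying a `run_id` key; the actual record shape returned by `store.list_trajectories()` is not shown in this diff, and `count_runs_with_trajectory` is a hypothetical helper name, not a function from the repository.

```python
from typing import Any, Iterable, Mapping


def count_runs_with_trajectory(
    rows: Iterable[Mapping[str, Any]],
    trajectories: Iterable[Mapping[str, Any]],
) -> int:
    """Count runs that have at least one trajectory record.

    Assumes every run row and trajectory record exposes a "run_id" key;
    that shape is an assumption, not taken from the repository.
    """
    # Collect the distinct run_ids that own at least one trajectory.
    covered = {t.get("run_id") for t in trajectories if t.get("run_id") is not None}
    # A run counts once, no matter how many trajectories it has.
    return sum(1 for row in rows if row.get("run_id") in covered)


# Two runs and two trajectory rows, but both trajectories belong to run "a".
rows = [{"run_id": "a"}, {"run_id": "b"}]
trajectories = [{"run_id": "a"}, {"run_id": "a"}]

raw_count = len(trajectories)  # equals len(rows), so the gap stays hidden
covered_count = count_runs_with_trajectory(rows, trajectories)  # run "b" is not covered
```

With this shape, comparing `covered_count` against `len(rows)` would flag the missing trajectory for run "b", whereas the raw row count would not.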

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d3e4a3fb7d


chatgpt-codex-connector · 2026-04-24T16:41:49Z

+        "runs": len(rows),
+        "outcomes": sum(1 for row in rows if row.get("outcome")),
+        "reward_labels": sum(1 for row in rows if row.get("reward_label")),
+        "lineages": sum(1 for row in rows if store.get_lineage(row.get("run_id")) is not None),


Batch lineage evidence lookup for activity snapshots

The lineage count in build_advisor_evidence_snapshot performs store.get_lineage(...) once per run, and each get_lineage call opens its own SQLite query/connection path. Because both /v1/operator/advisor-activity and the auto-refreshing /dashboard/advisor-activity route build this snapshot on every request, larger run histories will cause many DB round-trips per page load and can materially degrade dashboard latency (and DB contention). This should be replaced with a single bulk lineage lookup (for example, one query that returns the run_ids present in run_lineages).
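One way the bulk lookup could look, as a sketch only. The table name `run_lineages` comes from the comment above; the `run_id` column layout, the direct use of a `sqlite3.Connection`, and both helper names are assumptions for illustration and may not match how the store is actually structured.

```python
import sqlite3
from typing import Any, Iterable, Mapping


def fetch_run_ids_with_lineage(conn: sqlite3.Connection) -> set[str]:
    """Return every run_id present in run_lineages using a single query."""
    cursor = conn.execute("SELECT DISTINCT run_id FROM run_lineages")
    return {run_id for (run_id,) in cursor}


def count_lineages(rows: Iterable[Mapping[str, Any]], conn: sqlite3.Connection) -> int:
    """Count runs with lineage via one set lookup instead of one query per run."""
    with_lineage = fetch_run_ids_with_lineage(conn)
    return sum(1 for row in rows if row.get("run_id") in with_lineage)


# In-memory demonstration with a hypothetical, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE run_lineages (run_id TEXT NOT NULL)")
conn.executemany("INSERT INTO run_lineages (run_id) VALUES (?)", [("a",), ("c",)])

rows = [{"run_id": "a"}, {"run_id": "b"}, {"run_id": "c"}]
lineage_count = count_lineages(rows, conn)  # runs "a" and "c" have lineage
```

The number of database round-trips stays at one regardless of how many runs are in the snapshot, which is the property the comment asks for on the auto-refreshing route.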

sarvesh1327 added 2 commits April 24, 2026 21:53

advisor: surface learning evidence gaps

e0d6be8

fix: tolerate scalar trajectory rewards

d05222b

chatgpt-codex-connector Bot reviewed Apr 24, 2026

Count trajectory coverage by run

d3e4a3f

sarvesh1327 merged commit 839c484 into main Apr 24, 2026
1 check passed

sarvesh1327 deleted the phase9-evidence-surfaces branch April 24, 2026 16:37

chatgpt-codex-connector Bot reviewed Apr 24, 2026

sarvesh1327 merged 3 commits into main from phase9-evidence-surfaces
