rbx ui: inverted test-first explorer (verdicts of every solution per test) #550

@rsalesc

Part of #326 (umbrella). Effort: Large (~4–6 days) · Risk: Medium-high (new screen + navigation + CSS + interaction with compare/metadata) · Depends on: Issue A (MAIN badge); otherwise largely standalone.

Add a test-first flow complementing today's solution-first flow. Today: pick a solution (RunExplorerScreen) → browse its tests (RunTestExplorerScreen). This adds: pick a test → see every solution's verdict on it.

Decision: build the primary test-first navigation screen (not a transposed matrix grid). Matches the issue wording: "navigate across tests first, and show verdicts of each solution."

Flow

  • New TestCentricExplorerScreen (rbx/box/ui/screens/), reachable from the main menu (new option) and/or via an i (invert) toggle on RunExplorerScreen.
  • Left pane: the list of all tests across groups (same renderer as the solution-first list, minus a single solution's verdict column).
  • Right pane: for the highlighted test, a compact table/list of every solution → its verdict (outcome badge + time + memory), with the MAIN solution marked (Issue A badge). Selecting a solution row drills into the existing per-(solution, test) detail widgets (TwoSidedTestBoxWidget, input FileLog, metadata footer) — reuse, don't duplicate.
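A minimal sketch of how the right pane's rows could be assembled, assuming a plain-text rendering. `SolutionVerdict`, `format_row`, and the `[MAIN]` marker are hypothetical names for illustration; the real screen would presumably reuse the existing verdict markup helpers and the Issue A badge instead.

```python
from typing import List, NamedTuple, Optional


class SolutionVerdict(NamedTuple):
    """Hypothetical row model: one solution's result on the highlighted test."""
    solution: str
    outcome: Optional[str]      # None when no evaluation exists yet
    time_ms: Optional[int]
    memory_kb: Optional[int]
    is_main: bool = False


def format_row(v: SolutionVerdict) -> str:
    """Render one right-pane row as plain text (outcome + time + memory)."""
    marker = "[MAIN] " if v.is_main else ""
    if v.outcome is None:
        # Incomplete run: keep the row so the solution stays visible.
        return f"{marker}{v.solution}  (no result)"
    return f"{marker}{v.solution}  {v.outcome}  {v.time_ms} ms  {v.memory_kb} KiB"


def format_rows(verdicts: List[SolutionVerdict]) -> List[str]:
    """Render every solution's row, preserving the input order."""
    return [format_row(v) for v in verdicts]


rows = format_rows([
    SolutionVerdict("sols/main.cpp", "AC", 12, 2048, is_main=True),
    SolutionVerdict("sols/slow.cpp", "TLE", 2000, 4096),
    SolutionVerdict("sols/wip.cpp", None, None, None),
])
```

Keeping a row for solutions without a result means an incomplete run still lists every solution, which matches the test-first goal of showing all verdicts for one test.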

Data assembly

  • The per-(solution, test) evals already live on disk; get_solution_eval (rbx/box/ui/utils/run_ui.py:33) loads one. Add a transpose helper: for a given test entry, iterate skeleton.solutions and collect each solution's Evaluation (the dual of get_solution_evals, which today fixes a solution and iterates entries).
  • Verdict markup reuses solutions.get_testcase_markup_verdict / get_full_outcome_markup_verdict.
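The transpose helper described above might look roughly like the sketch below. It is self-contained, so `Evaluation`, `EvalLoader`, and `get_test_evals` are hypothetical stand-ins: the real `get_solution_eval` signature and the real `Evaluation` type are not shown in this issue and may differ. The loader is injected so the sketch does not assume how evals are read from disk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Evaluation:
    """Hypothetical stand-in for the real per-(solution, test) evaluation."""
    outcome: str
    time_ms: int
    memory_kb: int


# Assumed loader shape: (solution, test) -> Evaluation, or None when the
# .eval file is missing (for example, an incomplete run).
EvalLoader = Callable[[str, str], Optional[Evaluation]]


def get_test_evals(
    solutions: List[str], test: str, load_eval: EvalLoader
) -> Dict[str, Optional[Evaluation]]:
    """Fix one test and iterate solutions: the dual of fixing a solution
    and iterating test entries. Missing evals are kept as None."""
    return {solution: load_eval(solution, test) for solution in solutions}


# Usage with an in-memory fake standing in for the on-disk evals.
_fake_disk = {
    ("sols/main.cpp", "samples/0"): Evaluation("ACCEPTED", 12, 2048),
    ("sols/slow.cpp", "samples/0"): Evaluation("TIME_LIMIT_EXCEEDED", 2000, 2048),
}


def fake_loader(solution: str, test: str) -> Optional[Evaluation]:
    return _fake_disk.get((solution, test))


evals = get_test_evals(
    ["sols/main.cpp", "sols/slow.cpp", "sols/wa.cpp"], "samples/0", fake_loader
)
```

Returning `None` for a missing eval, rather than raising, lets the screen render a placeholder row for that solution.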

Open sub-decisions (resolve during planning)

  • Whether the inverted screen reuses side-by-side compare semantics (compare a test's output between two solutions) or defers that to v2.
  • Whether to keep both the menu entry and the i toggle, or pick one entry point.
  • How to display sample-vs-hidden tests and POINTS group scores in the test-first list.

Tests

  • Transpose helper returns the right verdict per solution for a test, tolerating missing .eval files (incomplete runs).
  • Screen renders the verdict-per-solution panel and drills into the detail view.

Design: docs/plans/2026-06-08-rbx-ui-qol-design.md.
