Optimize fetching Recent workflows

**This will be slow on prod — measured 2.39s on the heaviest workspace, and it runs on every home-page load.** I profiled it against prod (`EXPLAIN ANALYZE`, live DB).

The `DISTINCT ON (parent_version__published_run_id)` forces Postgres to read **all** of a workspace’s matching `SavedRun`s and disk-sort them just to emit 4 cards. `bots_savedrun` is 7.4M rows, and run history is heavily concentrated per workspace:

| workspace | runs (w/ parent PR) | distinct published runs |
|---|---:|---:|
| 5 | 1,780,569 | 1 |
| 232,791 | 231,343 | 3 |
| 3,206,451 | 69,119 | 1 |
| 3,764,345 | 53,981 | 3 |
| 3,466,675 | 52,471 | 10 |

Plan for workspace 5:

```
Limit (rows=4) actual time=2370..2383
  -> Sort (updated_at DESC)
    -> Unique  [DISTINCT ON published_run_id]
      -> Gather Merge rows=1780569
        -> Sort Key: published_run_id, updated_at DESC
              Sort Method: external merge  Disk: ~21MB
          -> Parallel Seq Scan on bots_savedrun  Filter:(workspace_id=5)  Rows Removed: 1775903
Execution Time: 2390 ms (+555ms JIT)
```

Crucially, no query reformulation fixes this: deriving “recent distinct workflows” from `SavedRun` history is inherently O(total runs), and the data above shows workspaces with millions of runs but only 1–3 distinct published runs (so a “walk -updated_at and dedup in Python” approach would scan the *entire* history and still never reach 4 distinct).

**Proper fix: denormalize last-used.** Maintain a small `(workspace, published_run, last_used_at)` row, upserted whenever a run is saved (and a backfill migration). Then recent workflows becomes:

```python
WorkspaceRecentWorkflow.objects.filter(workspace=workspace)
    .order_by("-last_used_at")[:RECENT_WORKFLOW_LIST_LIMIT]
```

That reads 1–240 rows per workspace instead of up to 1.78M — O(distinct), exact, and index-friendly (`Index(workspace, -last_used_at)`; add `uid` for the team-scoped path). Happy to share the EXPLAIN output if useful.

_Originally posted by @devxpy in https://github.com/GooeyAI/gooey-server/pull/992#discussion_r3339710433_
            

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize fetching Recent workflows #993

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

workspace	runs (w/ parent PR)	distinct published runs
5	1,780,569	1
232,791	231,343	3
3,206,451	69,119	1
3,764,345	53,981	3
3,466,675	52,471	10

Optimize fetching Recent workflows #993

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions