[Phase 5b] Top-K heap for ORDER BY ... LIMIT N (sq-lcw.4) #35

Merged
philcunliffe merged 1 commit into integration/batch-execution from polecat/sq-lcw.4
May 6, 2026

@philcunliffe
Contributor

Summary

Implements the Phase 5b Top-K heap for ORDER BY ... LIMIT N. SortNode gains an optional 'limit' cap, set only when LIMIT + OFFSET <= 10000 and no DISTINCT is present. The executor uses a bounded max-heap with lazy multi-key evaluation: memory stays O(k), and the cell-access economy of the existing sort path is preserved (later sort keys stay unevaluated unless earlier keys tie). First-key evaluations are batched in parallel for throughput. Full sort remains the fallback when no limit hint is set. 1573/1573 tests pass; 20 new tests cover correctness, planner edge cases (DISTINCT, threshold, OFFSET), and a 100k-row streaming check that verifies O(k) memory. Lint and tsc are clean.

Delivery

  • Issue: sq-lcw.4
  • Branch: polecat/sq-lcw.4
  • Target: integration/batch-execution

Sorts feeding a small LIMIT now use a bounded max-heap of size limit + offset
instead of buffering the full input. Memory stays O(k) regardless of input
size, matching Phase 5b's bounded-memory goal for ORDER BY x LIMIT 100 over
100M-row scans.
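
To illustrate the bounded max-heap idea, here is a minimal sketch, not the PR's actual TopKHeap. The names `BoundedTopK`, `offer`, and `drainSorted` are hypothetical, and the comparator is assumed to return a negative number when its first argument sorts earlier. The root always holds the worst row currently kept, so an incoming row only needs one comparison to be rejected.

```typescript
// Hypothetical sketch of a bounded top-K heap; names are illustrative only.
class BoundedTopK<T> {
  private heap: T[] = [];
  private readonly k: number;
  // compare(a, b) < 0 means a sorts before b in the requested ordering.
  private readonly compare: (a: T, b: T) => number;

  constructor(k: number, compare: (a: T, b: T) => number) {
    this.k = k;
    this.compare = compare;
  }

  get size(): number {
    return this.heap.length;
  }

  // Offer one input row; the heap never holds more than k rows.
  offer(row: T): void {
    if (this.k <= 0) return;
    if (this.heap.length < this.k) {
      this.heap.push(row);
      this.siftUp(this.heap.length - 1);
    } else if (this.compare(row, this.heap[0]) < 0) {
      // The new row beats the current worst kept row: replace the root.
      this.heap[0] = row;
      this.siftDown(0);
    }
  }

  // Return the kept rows in output order and empty the heap.
  drainSorted(): T[] {
    const out = [...this.heap].sort(this.compare);
    this.heap = [];
    return out;
  }

  private siftUp(i: number): void {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.compare(this.heap[i], this.heap[parent]) <= 0) break;
      [this.heap[i], this.heap[parent]] = [this.heap[parent], this.heap[i]];
      i = parent;
    }
  }

  private siftDown(i: number): void {
    const n = this.heap.length;
    for (;;) {
      let worst = i;
      const left = 2 * i + 1;
      const right = left + 1;
      if (left < n && this.compare(this.heap[left], this.heap[worst]) > 0) worst = left;
      if (right < n && this.compare(this.heap[right], this.heap[worst]) > 0) worst = right;
      if (worst === i) break;
      [this.heap[i], this.heap[worst]] = [this.heap[worst], this.heap[i]];
      i = worst;
    }
  }
}

// Usage: keep the 3 smallest values of a stream.
const topK = new BoundedTopK<number>(3, (a, b) => a - b);
for (const v of [9, 4, 7, 1, 8, 2, 6]) topK.offer(v);
const smallest = topK.drainSorted(); // [1, 2, 4]
```

With k = limit + offset, the Limit operator above the sort can still skip the first `offset` rows of the drained output. Rows that compare equal are not kept in input order in this sketch.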

Planner: SortNode gains an optional `limit` cap. The planSelect / planSet
construction sets it when LIMIT is defined, LIMIT + OFFSET <= 10000, and no
DISTINCT sits between Sort and Limit (DISTINCT can drop rows after the sort,
so a capped sort could leave fewer rows than LIMIT requires).
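
The planner rule above could be sketched roughly as follows. This is an illustration under assumptions: the `SelectShape` interface, the `sortLimitHint` function, and the `TOP_K_MAX` constant are hypothetical names, not the actual planSelect / planSet code. Only the three conditions and the 10000 threshold come from the description.

```typescript
// Hypothetical query shape; field names are assumptions for illustration.
interface SelectShape {
  limit?: number;
  offset?: number;
  distinct: boolean;
}

// Threshold taken from the PR description (LIMIT + OFFSET <= 10000).
const TOP_K_MAX = 10000;

// Returns the cap to attach to the sort node, or undefined for a full sort.
function sortLimitHint(q: SelectShape): number | undefined {
  if (q.limit === undefined) return undefined; // no LIMIT: nothing to cap
  if (q.distinct) return undefined; // DISTINCT may drop rows after the sort
  const cap = q.limit + (q.offset ?? 0); // rows the sort must retain
  return cap <= TOP_K_MAX ? cap : undefined;
}
```

For example, `sortLimitHint({ limit: 100, offset: 20, distinct: false })` yields 120, while the same query with `distinct: true` yields `undefined` and falls back to the full sort.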

Executor: a new TopKHeap maintains the worst-among-top-K candidate at the
root. Sort keys are evaluated lazily, term by term, so multi-key ORDER BY
keeps later (often expensive) terms unevaluated unless earlier terms tie —
preserving the same cell-access economy the existing full-sort path
guarantees. First-key evaluation is parallelized in chunks for streaming
throughput.
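
A rough sketch of lazy, term-by-term key evaluation follows. It is synchronous and assumes ascending numeric keys, so it leaves out the chunked parallel first-key evaluation described above. `LazyKeys`, `KeyFn`, and `compareLazy` are hypothetical names, not identifiers from this change.

```typescript
// Hypothetical sketch: each row caches the sort keys it has evaluated so far.
type KeyFn<R> = (row: R) => number;

class LazyKeys<R> {
  readonly row: R;
  private readonly terms: KeyFn<R>[];
  private readonly cache: (number | undefined)[];

  constructor(row: R, terms: KeyFn<R>[]) {
    this.row = row;
    this.terms = terms;
    this.cache = new Array<number | undefined>(terms.length).fill(undefined);
  }

  // Evaluate term i at most once, and only when a comparison asks for it.
  key(i: number): number {
    let v = this.cache[i];
    if (v === undefined) {
      v = this.terms[i](this.row);
      this.cache[i] = v;
    }
    return v;
  }
}

// Compare term by term; later terms are touched only if earlier terms tie.
function compareLazy<R>(a: LazyKeys<R>, b: LazyKeys<R>, termCount: number): number {
  for (let i = 0; i < termCount; i++) {
    const d = a.key(i) - b.key(i);
    if (d !== 0) return d;
  }
  return 0;
}

// Usage: count how often the (assumed expensive) second term is evaluated.
type Row = { a: number; b: number };
let expensiveCalls = 0;
const terms: KeyFn<Row>[] = [
  (r) => r.a,
  (r) => {
    expensiveCalls++;
    return r.b;
  },
];
const x = new LazyKeys<Row>({ a: 1, b: 5 }, terms);
const y = new LazyKeys<Row>({ a: 2, b: 0 }, terms);
const z = new LazyKeys<Row>({ a: 1, b: 3 }, terms);
compareLazy(x, y, 2); // first keys differ: second term is never evaluated
compareLazy(x, z, 2); // first keys tie: second term is evaluated for both rows
```

A comparator like this can be passed to a bounded heap, so most rows that lose on the first key never pay for the later terms.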

Falls back to the full-sort path when no limit hint is set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@philcunliffe philcunliffe merged commit f28772e into integration/batch-execution May 6, 2026
6 checks passed
@philcunliffe philcunliffe deleted the polecat/sq-lcw.4 branch May 6, 2026 21:07