Skip to content

chore(qwen3-scheduler): prune redundant fake-loop tests, extract test module#433

Merged
xiaguan merged 1 commit into
mainfrom
chore/qwen3-scheduler-prune-fake-tests
Jun 21, 2026
Merged

chore(qwen3-scheduler): prune redundant fake-loop tests, extract test module#433
xiaguan merged 1 commit into
mainfrom
chore/qwen3-scheduler-prune-fake-tests

Conversation

@xiaguan

@xiaguan xiaguan commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

What

openinfer-qwen3-4b/src/scheduler.rs had ballooned to 2672 lines, ~1555 of which were an in-file #[cfg(test)] module — larger than the code it tested. Most of that was fake-executor-driven loop tests that either duplicate the real-GPU e2e in tests/ or assert on mock spy recordings.

This PR prunes the low-value tests and extracts the rest into src/scheduler/tests.rs. scheduler.rs: 2672 → 1119. No production code changes.

Deleted (11 tests, 27 → 16)

Because this commit both deletes and relocates, the diff can't show the deletions directly — here's the manifest:

Deleted test Why
disconnect_mid_chunk_drops_the_request, active_receiver_drop_releases_request_state covered by tests/scheduler_robustness.rs::scheduler_survives_consumer_drop (real GPU)
over_context_window_..._through_scheduler_loop, oversized_echo_..._through_scheduler_loop, impossible_request_is_rejected_... reject decision covered by pure admit_deferred_requests / lifetime-block tests; reject delivery covered by tests/context_window.rs
scheduled_event_reports_cached_tokens_exactly_once covered by tests/cached_tokens_usage.rs::warm_repeat_reports_cached_tokens
one_token_completion_on_page_boundary_fits_one_page, long_prompt_chunks_across_steps_and_still_completes page/chunk math covered by pure take_prefill_chunks / page_boundary_lifetime_blocks tests; the latter also asserted on Arc<Mutex<Vec>> spy recordings
request_waits_for_full_kv_budget_before_prefill reservation proven against the real allocator by lifetime_blocks_*_kvbm_peak_draw
mixed_lora_prefill_requests_run_in_one_batch testing the mock: asserted on recorded internal batch grouping, not observable behaviour
lora_control_reports_unimplemented_when_idle asserts a stub is a stub

Also dropped the now-unused FakeExecutor batch-recorder fields and with_batch_records / with_lora_batch_records / with_cached_tokens / with_max_context_tokens builders.

Kept (16)

  • 8 pure functional-core testsadmit_deferred_requests, take_prefill_chunks, lifetime-block math, called directly on plain data (no fake).
  • 2 property tests against the real BlockPool (lifetime_blocks_*_kvbm_peak_draw) — guard the off-by-one that silently OOMs serving.
  • 6 tests that need the fake for things real-GPU e2e can't do: fault injection (decode_error_drops_request_state_and_scheduler_recovers), missing-adapter rejection (no e2e covers it), control-channel idle-gating (lora_control_{unloads,waits}), and real side-effect observation (echo_requests_are_never_offered_to_prefetch, retiring_multiple_active_requests_tolerates_unsorted_indices).

FakeExecutor is now a focused fault-injection + control-channel + side-effect-observation harness, not a spy. The only spy test is gone.

Follow-up (not here)

scheduler.rs is still ~1119 lines. Extracting the admission / prefetch modules (and co-locating their pure tests) is the next step toward the 1k line budget.

Validation

Pre-commit clippy (compiles the test targets) passes — confirms no dead code, dangling refs, or unresolved super::* after the move. Per repo convention for test-only removals, cargo test (slow CUDA build) was not run; clippy is the gate.

🤖 Generated with Claude Code

… module

scheduler.rs's in-file test module had grown to ~1555 lines — larger than
the code it tested — by accumulating fake-executor-driven loop tests that
either duplicate the real-GPU e2e in tests/ or assert on mock spy recordings.

Drop 11 low-value tests (27 -> 16):
- 5 fake-loop tests covered by tests/{scheduler_robustness,context_window,
  cached_tokens_usage}.rs: consumer/receiver drop, oversized-echo & over-
  context rejection, cached-tokens-once, long-prompt chunk completion
- 3 reject-through-loop tests whose decision is already covered by the pure
  admit_deferred_requests / lifetime-block unit tests
- mixed_lora_..._one_batch: asserted on Arc<Mutex<Vec>> spy recordings of
  internal batch grouping (testing the mock, not behaviour)
- active_receiver_drop (dup of scheduler_robustness e2e) and
  lora_control_reports_unimplemented (asserts a stub is a stub)

Drop the now-unused FakeExecutor batch-recorder fields/builders. The surviving
harness only does fault injection (decode failure), missing-adapter rejection
(no e2e covers it), control-channel idle-gating, and side-effect observation
(KV release / echo-not-prefetched) — things the real-GPU e2e can't reproduce.

Move the remaining #[cfg(test)] module to src/scheduler/tests.rs (flat layout,
`use super::*` preserves crate-private access). scheduler.rs: 2672 -> 1119.

Kept: 8 pure functional-core tests + 2 BlockPool peak-draw property tests +
the 6 fault/control/side-effect tests above.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@xiaguan xiaguan merged commit b96a31c into main Jun 21, 2026
1 check passed

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a135b8c823

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

assert!(matches!(error, EngineControlError::OperationFailed(_)));
}
}
mod tests;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore the dropped scheduler-loop tests

This extraction is not preserving the old inline test module: scheduler/tests.rs omits several no-GPU scheduler-loop regressions such as long_prompt_chunks_across_steps_and_still_completes, disconnect_mid_chunk_drops_the_request, scheduled_event_reports_cached_tokens_exactly_once, request_waits_for_full_kv_budget_before_prefill, and active_receiver_drop_releases_request_state. Those were the only tests exercising chunked prefill, cancellation cleanup/KV release, cached-token Scheduled reporting, and full-lifetime KV admission through start_with_executor, so future regressions in those production paths can now pass the suite. Please move the full test module rather than a subset.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant