chore(qwen3-scheduler): prune redundant fake-loop tests, extract test module by xiaguan · Pull Request #433 · openinfer-project/openinfer

xiaguan · 2026-06-21T16:58:12Z

What

openinfer-qwen3-4b/src/scheduler.rs had ballooned to 2672 lines, ~1555 of which were an in-file #[cfg(test)] module — larger than the code it tested. Most of that was fake-executor-driven loop tests that either duplicate the real-GPU e2e in tests/ or assert on mock spy recordings.

This PR prunes the low-value tests and extracts the rest into src/scheduler/tests.rs. scheduler.rs: 2672 → 1119. No production code changes.

Deleted (11 tests, 27 → 16)

Because this commit both deletes and relocates, the diff can't show the deletions directly — here's the manifest:

Deleted test	Why
`disconnect_mid_chunk_drops_the_request`, `active_receiver_drop_releases_request_state`	covered by `tests/scheduler_robustness.rs::scheduler_survives_consumer_drop` (real GPU)
`over_context_window_..._through_scheduler_loop`, `oversized_echo_..._through_scheduler_loop`, `impossible_request_is_rejected_...`	reject decision covered by pure `admit_deferred_requests` / lifetime-block tests; reject delivery covered by `tests/context_window.rs`
`scheduled_event_reports_cached_tokens_exactly_once`	covered by `tests/cached_tokens_usage.rs::warm_repeat_reports_cached_tokens`
`one_token_completion_on_page_boundary_fits_one_page`, `long_prompt_chunks_across_steps_and_still_completes`	page/chunk math covered by pure `take_prefill_chunks` / `page_boundary_lifetime_blocks` tests; the latter also asserted on `Arc<Mutex<Vec>>` spy recordings
`request_waits_for_full_kv_budget_before_prefill`	reservation proven against the real allocator by `lifetime_blocks_*_kvbm_peak_draw`
`mixed_lora_prefill_requests_run_in_one_batch`	testing the mock: asserted on recorded internal batch grouping, not observable behaviour
`lora_control_reports_unimplemented_when_idle`	asserts a stub is a stub

Also dropped the now-unused FakeExecutor batch-recorder fields and with_batch_records / with_lora_batch_records / with_cached_tokens / with_max_context_tokens builders.

Kept (16)

8 pure functional-core tests — admit_deferred_requests, take_prefill_chunks, lifetime-block math, called directly on plain data (no fake).
2 property tests against the real BlockPool (lifetime_blocks_*_kvbm_peak_draw) — guard the off-by-one that silently OOMs serving.
6 tests that need the fake for things real-GPU e2e can't do: fault injection (decode_error_drops_request_state_and_scheduler_recovers), missing-adapter rejection (no e2e covers it), control-channel idle-gating (lora_control_{unloads,waits}), and real side-effect observation (echo_requests_are_never_offered_to_prefetch, retiring_multiple_active_requests_tolerates_unsorted_indices).

FakeExecutor is now a focused fault-injection + control-channel + side-effect-observation harness, not a spy. The only spy test is gone.

Follow-up (not here)

scheduler.rs is still ~1119 lines. Extracting the admission / prefetch modules (and co-locating their pure tests) is the next step toward the 1k line budget.

Validation

Pre-commit clippy (compiles the test targets) passes — confirms no dead code, dangling refs, or unresolved super::* after the move. Per repo convention for test-only removals, cargo test (slow CUDA build) was not run; clippy is the gate.

🤖 Generated with Claude Code

… module scheduler.rs's in-file test module had grown to ~1555 lines — larger than the code it tested — by accumulating fake-executor-driven loop tests that either duplicate the real-GPU e2e in tests/ or assert on mock spy recordings. Drop 11 low-value tests (27 -> 16): - 5 fake-loop tests covered by tests/{scheduler_robustness,context_window, cached_tokens_usage}.rs: consumer/receiver drop, oversized-echo & over- context rejection, cached-tokens-once, long-prompt chunk completion - 3 reject-through-loop tests whose decision is already covered by the pure admit_deferred_requests / lifetime-block unit tests - mixed_lora_..._one_batch: asserted on Arc<Mutex<Vec>> spy recordings of internal batch grouping (testing the mock, not behaviour) - active_receiver_drop (dup of scheduler_robustness e2e) and lora_control_reports_unimplemented (asserts a stub is a stub) Drop the now-unused FakeExecutor batch-recorder fields/builders. The surviving harness only does fault injection (decode failure), missing-adapter rejection (no e2e covers it), control-channel idle-gating, and side-effect observation (KV release / echo-not-prefetched) — things the real-GPU e2e can't reproduce. Move the remaining #[cfg(test)] module to src/scheduler/tests.rs (flat layout, `use super::*` preserves crate-private access). scheduler.rs: 2672 -> 1119. Kept: 8 pure functional-core tests + 2 BlockPool peak-draw property tests + the 6 fault/control/side-effect tests above. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a135b8c823

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-21T17:01:32Z

-        assert!(matches!(error, EngineControlError::OperationFailed(_)));
-    }
-}
+mod tests;


Restore the dropped scheduler-loop tests

This extraction is not preserving the old inline test module: scheduler/tests.rs omits several no-GPU scheduler-loop regressions such as long_prompt_chunks_across_steps_and_still_completes, disconnect_mid_chunk_drops_the_request, scheduled_event_reports_cached_tokens_exactly_once, request_waits_for_full_kv_budget_before_prefill, and active_receiver_drop_releases_request_state. Those were the only tests exercising chunked prefill, cancellation cleanup/KV release, cached-token Scheduled reporting, and full-lifetime KV admission through start_with_executor, so future regressions in those production paths can now pass the suite. Please move the full test module rather than a subset.

Useful? React with 👍 / 👎.

xiaguan merged commit b96a31c into main Jun 21, 2026
1 check passed

chatgpt-codex-connector Bot reviewed Jun 21, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(qwen3-scheduler): prune redundant fake-loop tests, extract test module#433

chore(qwen3-scheduler): prune redundant fake-loop tests, extract test module#433
xiaguan merged 1 commit into
mainfrom
chore/qwen3-scheduler-prune-fake-tests

xiaguan commented Jun 21, 2026

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

xiaguan commented Jun 21, 2026

What

Deleted (11 tests, 27 → 16)

Kept (16)

Follow-up (not here)

Validation

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant