Speed up `zstash check` multiprocessing scheduling and queue handling by Copilot · Pull Request #447 · E3SM-Project/zstash

Copilot · 2026-06-03T22:42:11Z

Summary

zstash check spent avoidable time in worker assignment and failure collection in
the multiprocessing path. This change keeps behavior intact while reducing
scheduler and polling overhead.

Objectives:

Remove O(files × workers) tar-to-worker lookup overhead.
Eliminate CPU-heavy busy polling while waiting on worker failures.
Preserve zstash check behavior and output semantics.

Select one: This pull request is...

a bug fix: increment the patch version
a small improvement: increment the minor version
a new feature: increment the minor version
an incompatible (non-backwards compatible) API change: increment the major version
Key changes
- Worker assignment
  - Build a direct tar -> worker_idx map during greedy bin-packing.
  - Replace nested membership scans with O(1) assignment per row.
- Failure queue draining
  - Replace empty() spin loops with blocking
    failure_queue.get(timeout=0.05).
  - Drain remaining items once workers finish with get_nowait().
- Scheduling correctness/perf cleanup
  - Heapify the worker-load heap (work_to_workers) directly.
  - Assign largest tar groups first (reverse=True) for better balancing.

# Before: repeated scans over all workers for each row
for db_row in matches:
    for workers_idx in range(len(workers_to_tars)):
        if db_row.tar in workers_to_tars[workers_idx]:
            workers_to_matches[workers_idx].append(db_row)

# After: direct O(1) assignment
workers_idx = tar_to_worker_idx[db_row.tar]
workers_to_matches[workers_idx].append(db_row)

Small Change

To merge, I will use "Squash and merge". That is, this change should be a single commit.
Logic: I have visually inspected the entire pull request myself.
Pre-commit checks: All the pre-commits checks have passed.

Big Change

To merge, I will use "Create a merge commit". That is, this change is large enough to require multiple units of work (i.e., it should be multiple commits).

1. Does this do what we want it to do?

Required:

Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
Testing: I have added at least one automated test. Every objective above is represented in at least one test.
Testing: I have considered likely and/or severe edge cases and have included them in testing.

If applicable:

Testing: this pull request adds at least one new possible command line option. I have tested using this option with and without any other option that may interact with it.

2. Are the implementation details accurate & efficient?

Required:

Logic: I have visually inspected the entire pull request myself.
Logic: I have left GitHub comments highlighting important pieces of code logic. I have had these code blocks reviewed by at least one other team member.

If applicable:

Dependencies: This pull request introduces a new dependency. I have discussed this requirement with at least one other team member. The dependency is noted in zstash/conda, not just an import statement.

3. Is this well documented?

Required:

Documentation: by looking at the docs, a new user could easily understand the functionality introduced by this pull request.

4. Is this code clean?

Required:

Readability: The code is as simple as possible and well-commented, such that a new team member could understand what's happening.
Pre-commit checks: All the pre-commits checks have passed.

If applicable:

Software architecture: I have discussed relevant trade-offs in design decisions with at least one other team member. It is unlikely that this pull request will increase tech debt.

forsyth2 · 2026-06-03T22:52:39Z

@copilot This implementation looks sort of similar to #402 (the attempt at Python 3.14 support that requires changing how we approach parallelism). Should these two things (zstash check performance and Python 3.14 parallelism issues) be addressed together? Or are they independent enough?

Copilot · 2026-06-03T22:53:46Z

@copilot This implementation looks sort of similar to #402 (the attempt at Python 3.14 support that requires changing how we approach parallelism). Should these two things (zstash check performance ...

They’re mostly independent and can be handled separately. This PR only optimizes scheduling/queue handling inside multiprocess_extract; it doesn’t change process start semantics or IPC model assumptions. The Python 3.14 work is about start-method behavior (spawn vs fork) and compatibility of multiprocessing objects. It’s reasonable to merge this performance improvement on its own and validate 3.14 parallelism fixes in #402.

Optimize multiprocess check scheduling

7a8d96e

Copilot AI assigned Copilot and forsyth2 Jun 3, 2026

Copilot created this pull request from a session on behalf of forsyth2 June 3, 2026 22:42 View session

forsyth2 linked an issue Jun 3, 2026 that may be closed by this pull request

[Feature]: Improve efficiency of zstash check #446

Open

forsyth2 mentioned this pull request Jun 3, 2026

[Feature]: Improve efficiency of zstash check #446

Open

Copilot started work on behalf of forsyth2 June 3, 2026 22:52 View session

Copilot finished work on behalf of forsyth2 June 3, 2026 22:54

Copilot AI requested a review from forsyth2 June 3, 2026 22:54

forsyth2 mentioned this pull request Jun 11, 2026

Profile zstash check #450

Draft

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up `zstash check` multiprocessing scheduling and queue handling#447

Speed up `zstash check` multiprocessing scheduling and queue handling#447
Copilot wants to merge 1 commit into
mainfrom
copilot/fix-performance-bottlenecks-zstash-check

Copilot AI commented Jun 3, 2026

Uh oh!

forsyth2 commented Jun 3, 2026

Uh oh!

Copilot AI commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Copilot AI commented Jun 3, 2026

Summary

Small Change

Big Change

1. Does this do what we want it to do?

2. Are the implementation details accurate & efficient?

3. Is this well documented?

4. Is this code clean?

Uh oh!

forsyth2 commented Jun 3, 2026

Uh oh!

Copilot AI commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants