swarm: guard empty-partition reshape in _pack_array_to_data_format (parallel read_timestep crash)#221

Open
lmoresi wants to merge 1 commit into development from bugfix/swarm-empty-partition-read

@lmoresi lmoresi commented Jun 6, 2026

Member

Problem

On a rank that owns no local particles, SwarmVariable._pack_array_to_data_format calls array_data.reshape(array_data.shape[0], -1) on a size-0 array. NumPy cannot infer the -1 component dimension from a 0-element array, because any component count satisfies 0 x ncomp = 0, and raises:

ValueError: cannot reshape array of size 0 into shape (0,newaxis)

This crashes a parallel read_timestep of a vector field (e.g. P2 velocity) at np>=2 whenever a rank ends up with an empty partition, killing the run at checkpoint load.
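
The failure can be reproduced with NumPy alone, without Underworld or MPI. A minimal sketch, assuming a (0, 3) array stands in for the data of a 3-component field on a rank that owns no particles (the array and variable names here are illustrative, not taken from swarm.py):

```python
import numpy as np

# Illustrative stand-in: 0 particles, 3 components per particle.
empty = np.zeros((0, 3))

try:
    # Same call pattern as described above: -1 asks NumPy to infer the
    # component dimension, which is ambiguous when the array has size 0.
    empty.reshape(empty.shape[0], -1)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Supplying the component count explicitly removes the ambiguity.
explicit = empty.reshape(empty.shape[0], 3)

print(error_message)
print(explicit.shape)
```

The first reshape fails only because of the -1; the second shows that a size-0 array reshapes without complaint once every dimension is given.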

Fix

When the array is empty, compute the component count explicitly from the trailing dims (prod(shape[1:]), or 1 for a 1-D array) instead of relying on -1 inference. The non-empty path is unchanged.
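
As a standalone illustration, the guard can be sketched as a free function. The name pack_to_2d is hypothetical; in the PR the logic lives inside SwarmVariable._pack_array_to_data_format, which may do more than this sketch shows:

```python
import numpy as np


def pack_to_2d(array_data):
    """Flatten trailing dims to (N, ncomp). Hypothetical stand-in for the guarded reshape."""
    if array_data.size == 0:
        # -1 cannot be inferred for a size-0 array, so take the component
        # count from the trailing dims (1 component for a 1-D array).
        ncomp = int(np.prod(array_data.shape[1:])) if array_data.ndim > 1 else 1
        return array_data.reshape(array_data.shape[0], ncomp)
    # Non-empty path: unchanged, NumPy infers the component dimension.
    return array_data.reshape(array_data.shape[0], -1)


print(pack_to_2d(np.zeros((0, 3))).shape)     # empty vector field
print(pack_to_2d(np.zeros((0,))).shape)       # empty 1-D array
print(pack_to_2d(np.zeros((0, 2, 2))).shape)  # empty tensor field
print(pack_to_2d(np.zeros((4, 2, 2))).shape)  # non-empty tensor field
```

Taking the count from shape[1:] keeps the empty result consistent with what the -1 inference would give for a non-empty array of the same trailing shape.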

Validation

Parallel (np=4/5) read_timestep of velocity from a serial checkpoint now succeeds: the crash reproduced before the change and is gone after it. The fix has been used throughout the parallel adaptive-convection runs.
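
The validation above used real MPI runs. As a rough serial approximation, not the project's test suite, the empty-partition case can be imitated by splitting a small array across more "ranks" than it has rows. The helper pack_to_2d and the use of np.array_split as a stand-in for the parallel decomposition are both assumptions made for illustration:

```python
import numpy as np


def pack_to_2d(array_data):
    """Hypothetical stand-in for the guarded reshape in the PR."""
    if array_data.size == 0:
        ncomp = int(np.prod(array_data.shape[1:])) if array_data.ndim > 1 else 1
        return array_data.reshape(array_data.shape[0], ncomp)
    return array_data.reshape(array_data.shape[0], -1)


# 3 particles with 3 velocity components, spread over 5 pretend ranks,
# so at least two ranks receive an empty (0, 3) partition.
velocity = np.arange(9, dtype=float).reshape(3, 3)
partitions = np.array_split(velocity, 5)

packed = [pack_to_2d(part) for part in partitions]
empty_ranks = [i for i, part in enumerate(partitions) if part.shape[0] == 0]

# Every rank, empty or not, should report the same component count,
# and stitching the ranks back together should recover the input.
reassembled = np.concatenate(packed)
print(empty_ranks)
print([p.shape for p in packed])
```

This only exercises the reshape logic; it says nothing about the checkpoint routing that a real parallel read_timestep performs.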

Underworld development team with AI support from Claude Code

A rank that owns no local particles has an N=0 array; numpy then cannot infer
the -1 component dimension in array_data.reshape(shape[0], -1) (raises
"cannot reshape array of size 0 into shape (0,newaxis)"). This crashed parallel
read_timestep of a vector field (e.g. velocity) at np>=2 when a rank had an
empty partition, killing the run at checkpoint load. Compute the component
count from the trailing dims explicitly when the array is empty.

Underworld development team with AI support from Claude Code
Copilot AI review requested due to automatic review settings June 6, 2026 08:49

Copilot AI left a comment

Contributor

Pull request overview

Fixes a parallel SwarmVariable.read_timestep crash when a rank owns zero local particles by avoiding NumPy’s reshape(..., -1) inference on size-0 arrays inside SwarmVariable._pack_array_to_data_format.

Changes:

  • Add an empty-array guard in SwarmVariable._pack_array_to_data_format to compute component count explicitly before reshaping.
  • Prevent parallel checkpoint loads from failing on empty-partition ranks during routed read_timestep.

Comment thread src/underworld3/swarm.py
Comment on lines +936 to 939
if array_data.size == 0:
    ncomp = int(np.prod(array_data.shape[1:])) if array_data.ndim > 1 else 1
    return array_data.reshape(array_data.shape[0], ncomp)
return array_data.reshape(array_data.shape[0], -1)
Comment thread src/underworld3/swarm.py
Comment on lines +932 to +936
# infer the -1 component dimension ("cannot reshape array of size 0 into
# shape (0,newaxis)"). This bites a rank that owns no local particles
# during a parallel read_timestep. Compute the component count from the
# trailing dims explicitly.
if array_data.size == 0:
2 participants