Summary
During a num_nodes DSE sweep, TestRun.apply_params_set writes the NUM_NODES action key into cmd_args as an undeclared field. Execution is unaffected, but the per-run record is corrupted.
Mechanism
param_space adds "NUM_NODES" to the action space when num_nodes is a list, so each action dict contains {"NUM_NODES": <n>, ...}.
_apply(key, value) treats every key as a field path into cmd_args, so "NUM_NODES" becomes setattr(cmd_args, "NUM_NODES", n).
CmdArgs is configured extra="allow", so instead of raising on the unknown field it silently stores it.
Impact
- Functionally correct: the workload runs fine and TestRun.num_nodes is still set correctly (via the explicit if "NUM_NODES" in action assignment).
- The phantom field shows up in cmd_args.model_dump() and therefore in the per-run TestRunDetails dump, recording a config with a NUM_NODES arg the workload never accepts. Anything reading that dump back (reproduction, reporting) sees a fabricated field.
- It does not leak into the generated shell command (strategies read named fields, not the extras bag).
Root cause
Scope mismatch: action keys address two different objects — cmd_args/extra_env_vars on the TestDefinition, vs num_nodes on the TestRun. _apply only knows how to write into the TestDefinition. NUM_NODES is the single boundary-crossing key, and extra="allow" hides the misroute instead of surfacing it.
Severity
Low — record/repro integrity, not execution. Pre-existing on main; surfaced (not introduced) by the apply_params_set refactor in #901.
Repro
```python
from cloudai.models.workload import CmdArgs
c = CmdArgs()
setattr(c, "NUM_NODES", 2) # what _apply("NUM_NODES", 2) effectively does
print(c.model_dump()) # -> {'NUM_NODES': 2}
```
Suggested fix
Make _apply the single dispatch point: route NUM_NODES to TestRun.num_nodes, everything else into the TestDefinition. Share a NUM_NODES_KEY constant between param_space (producer) and apply_params_set (consumer) so the one scope-crossing key is defined in one place.
Code: src/cloudai/_core/test_scenario.py — TestRun.param_space / TestRun.apply_params_set.
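The dispatch described under "Suggested fix" could look roughly like the sketch below. It is a stdlib-only illustration, not the actual CloudAI code: StubTestDefinition and StubTestRun are hypothetical stand-ins, cmd_args is modeled as a plain dict rather than a pydantic model, the "extra_env_vars." key prefix is an assumption about how env-var keys are addressed, and apply_params_set mutates in place for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

# Single definition of the one scope-crossing key, shared by
# param_space (producer) and _apply (consumer).
NUM_NODES_KEY = "NUM_NODES"

# Assumed prefix for env-var keys; hypothetical, for illustration only.
ENV_PREFIX = "extra_env_vars."


@dataclass
class StubTestDefinition:
    """Hypothetical stand-in for TestDefinition."""

    cmd_args: dict[str, Any] = field(default_factory=dict)
    extra_env_vars: dict[str, Any] = field(default_factory=dict)


@dataclass
class StubTestRun:
    """Hypothetical stand-in for TestRun."""

    test: StubTestDefinition
    num_nodes: int | list[int] = 1

    def param_space(self) -> dict[str, list[Any]]:
        # Only a list-valued num_nodes contributes a sweep dimension.
        space: dict[str, list[Any]] = {}
        if isinstance(self.num_nodes, list):
            space[NUM_NODES_KEY] = list(self.num_nodes)
        return space

    def _apply(self, key: str, value: Any) -> None:
        # Route the TestRun-scoped key first so it never reaches cmd_args.
        if key == NUM_NODES_KEY:
            self.num_nodes = value
        elif key.startswith(ENV_PREFIX):
            self.test.extra_env_vars[key[len(ENV_PREFIX):]] = value
        else:
            self.test.cmd_args[key] = value

    def apply_params_set(self, action: dict[str, Any]) -> None:
        # Every key goes through _apply; no special-casing here.
        for key, value in action.items():
            self._apply(key, value)


run = StubTestRun(test=StubTestDefinition(), num_nodes=[1, 2])
run.apply_params_set({NUM_NODES_KEY: 2, "batch_size": 8})
print(run.num_nodes, run.test.cmd_args)
```

With routing done inside _apply, the separate if "NUM_NODES" in action assignment in apply_params_set would no longer be needed, and the key never reaches cmd_args regardless of the extra="allow" setting.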