
fix: build NodeInfoMap before ExcludeTaintedNodePods filter #289

Closed
leomao10 wants to merge 1 commit into atlassian:master from cespo:cesposito/fix-exclude-tainted-node-pods

Conversation

@leomao10
Contributor

@leomao10 leomao10 commented Apr 16, 2026

When ExcludeTaintedNodePods is enabled, NodeInfoMap was previously built from the already-filtered pod list, causing NodeEmpty() to always return true for tainted nodes. This resulted in premature soft-path deletion of nodes that still had running pods.

Fix: move CreateNodeNameToInfoMap() to before the ExcludeTaintedNodePods filter so it always reflects the true pod state. The filtered pod list is used only for capacity calculations (CPU/memory utilisation).
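
A minimal sketch of the ordering issue described above. The types and helpers here (`Pod`, `Node`, `buildNodeInfoMap`, `excludeTaintedNodePods`, `nodeEmpty`) are hypothetical simplifications written for illustration, not the project's actual API; they only mirror the roles of `CreateNodeNameToInfoMap()`, the `ExcludeTaintedNodePods` filter, and `NodeEmpty()`.

```go
package main

import "fmt"

// Pod and Node are simplified, hypothetical stand-ins for the Kubernetes types.
type Pod struct {
	Name      string
	NodeName  string
	DaemonSet bool
}

type Node struct {
	Name    string
	Tainted bool
}

// buildNodeInfoMap groups pods by node name; every known node gets an entry.
func buildNodeInfoMap(pods []Pod, nodes []Node) map[string][]Pod {
	info := make(map[string][]Pod, len(nodes))
	for _, n := range nodes {
		info[n.Name] = nil
	}
	for _, p := range pods {
		if _, ok := info[p.NodeName]; ok {
			info[p.NodeName] = append(info[p.NodeName], p)
		}
	}
	return info
}

// excludeTaintedNodePods drops pods that are scheduled on tainted nodes.
func excludeTaintedNodePods(pods []Pod, nodes []Node) []Pod {
	tainted := make(map[string]bool, len(nodes))
	for _, n := range nodes {
		tainted[n.Name] = n.Tainted
	}
	var kept []Pod
	for _, p := range pods {
		if !tainted[p.NodeName] {
			kept = append(kept, p)
		}
	}
	return kept
}

// nodeEmpty reports whether a node has no non-daemonset pods in the map.
func nodeEmpty(info map[string][]Pod, name string) bool {
	for _, p := range info[name] {
		if !p.DaemonSet {
			return false
		}
	}
	return true
}

func main() {
	nodes := []Node{{Name: "n1", Tainted: true}, {Name: "n2"}}
	pods := []Pod{{Name: "app", NodeName: "n1"}, {Name: "web", NodeName: "n2"}}

	// Pre-fix order: filter first, then build the map from the filtered list.
	buggy := buildNodeInfoMap(excludeTaintedNodePods(pods, nodes), nodes)

	// Post-fix order: build the map from all pods; filter only for capacity.
	fixed := buildNodeInfoMap(pods, nodes)
	capacityPods := excludeTaintedNodePods(pods, nodes)

	// prints: true false 1
	fmt.Println(nodeEmpty(buggy, "n1"), nodeEmpty(fixed, "n1"), len(capacityPods))
}
```

With the pre-fix ordering, the tainted node `n1` looks empty even though `app` is still running on it; with the post-fix ordering the map sees the pod, while the filtered list still excludes it from the capacity figures.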

Also adds a regression test verifying that a tainted node with a running non-daemonset pod is not deleted via the soft path.


Rovo Dev code review: Rovo Dev couldn't review this pull request
The pull request author does not have access to Rovo Dev.

@leomao10 leomao10 requested a review from awprice April 16, 2026 01:35
@leomao10 leomao10 self-assigned this Apr 16, 2026
@leomao10 leomao10 requested review from FocalChord, cespo and dtnyn April 16, 2026 01:38
// Filter to pods on untainted nodes
// Build NodeInfoMap BEFORE filtering tainted node pods so it always reflects
// the true pod state on every node. TryRemoveTaintedNodes uses NodeInfoMap to
// determine whether a tainted node is empty; if we built it from an already-
Collaborator

nit: can we trim this down? The background on the bug is only relevant to this PR.

})

// Build NodeInfoMap from ALL pods (including the pod on the tainted node).
// This is what the fixed code does: NodeInfoMap is constructed before the
Collaborator

question: this reads like a PR comment rather than a code comment. Once this is merged, "fixed code" has no meaning. Should it be removed or reworded?

// Build NodeInfoMap from ALL pods (including the pod on the tainted node).
// This is what the fixed code does: NodeInfoMap is constructed before the
// ExcludeTaintedNodePods filter so it accurately reflects tainted-node state.
nodeGroupsState[nodeGroupOpts.Name].NodeInfoMap = k8s.CreateNodeNameToInfoMap(allPods, allNodes)
Collaborator

question: this manually builds NodeInfoMap from all pods, which is the correct (post-fix) behavior, but it means the test doesn't exercise the actual bug path through scaleNodeGroup, so the test passes on master too. Maybe consider adding an integration test that goes through scaleNodeGroup to prove the ordering fix?
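
To illustrate the concern, here is a rough sketch of why a hand-built map cannot detect the ordering bug while a test driven through the entry point can. Everything in it is hypothetical: `scale` stands in for scaleNodeGroup, `filterFirst` is an invented switch that reproduces the pre-fix ordering, and the map of pod counts is a simplification of NodeInfoMap.

```go
package main

import "fmt"

// pod is a hypothetical, minimal pod record: only the node it runs on.
type pod struct{ node string }

// countPods returns the number of pods per node name.
func countPods(pods []pod) map[string]int {
	counts := make(map[string]int)
	for _, p := range pods {
		counts[p.node]++
	}
	return counts
}

// dropTainted removes pods that run on any of the tainted nodes.
func dropTainted(pods []pod, tainted []string) []pod {
	isTainted := make(map[string]bool, len(tainted))
	for _, n := range tainted {
		isTainted[n] = true
	}
	var kept []pod
	for _, p := range pods {
		if !isTainted[p.node] {
			kept = append(kept, p)
		}
	}
	return kept
}

// removable returns the tainted nodes that look empty in nodeInfo.
func removable(nodeInfo map[string]int, tainted []string) []string {
	var out []string
	for _, n := range tainted {
		if nodeInfo[n] == 0 {
			out = append(out, n)
		}
	}
	return out
}

// scale is a hypothetical entry point. filterFirst=true reproduces the
// pre-fix ordering (filter, then build the map); false is the fixed ordering.
func scale(pods []pod, tainted []string, filterFirst bool) []string {
	source := pods
	if filterFirst {
		source = dropTainted(pods, tainted)
	}
	return removable(countPods(source), tainted)
}

func main() {
	pods := []pod{{node: "n1"}}
	tainted := []string{"n1"}

	// Hand-built map: bypasses scale, so the ordering is never exercised.
	manual := removable(countPods(pods), tainted)

	// prints: 0 1 0
	fmt.Println(len(manual), len(scale(pods, tainted, true)), len(scale(pods, tainted, false)))
}
```

The hand-built case reports nothing removable regardless of how `scale` orders its steps, whereas calling `scale` itself distinguishes the two orderings.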

@leomao10
Contributor Author

Reopened as a new PR in #290.

3 participants