Skip to content

[VPlan] Peek through nested phi incoming values in computeBlendMasks#203164

Open
lukel97 wants to merge 18 commits into
llvm:mainfrom
lukel97:loop-vectorize/computeBlendMasks-peek-thru-phis
Open

[VPlan] Peek through nested phi incoming values in computeBlendMasks#203164
lukel97 wants to merge 18 commits into
llvm:mainfrom
lukel97:loop-vectorize/computeBlendMasks-peek-thru-phis

Conversation

@lukel97

@lukel97 lukel97 commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Stacked on #201783

If we have a phi with an incoming value that is also a phi, and that inner phi postdominates the original phi, then we can "unfold" that inner phi and look through its incoming values.

This is needed to fold away the blend masks entirely when the SSA is repaired in #201784.

The two other changes in this PR:

  • When a nested phi is unfolded, we may end with an unconditional incoming value, i.e. all paths from the header lead to it. In this case there'll be no edge mask, so handle this in the if (InValEdges.size() == 1) check.
  • Predicated reduction chains can be optimized with this, but this leads to a lot of test diffs and requires some transforms like handleFindLastReductions to be updated. For now, just ignore any phis that are a part of a reduction chain so we can focus on them in a later PR.


// If the value is a phi postdominated by VPBB, then look through the inner
// incoming values instead of propagating the phi.
if (auto *Phi = dyn_cast<VPPhi>(Common))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if that's always beneficial. What about something like (maybe need slightly more complicated than that)

            o
          /   \
         /    o
        /   /   \
       o    o   o
      / \  / \ /
     o   o    o (phi)
      \  |   /
         o (final blend)

IIUC, once we add edges leading to (phi), we won't be able to "fold" them away using post-do in this scenario.

@lukel97 lukel97 Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm missing something, why can't we fold them away? E.g. if we take something like

            o
          /   \
         /    f
        /   /   \
       g    a   b
      / \  / \ /
     c   d    e (phi1)
      \  |   /
         h (phi2)
phi1 = phi [%y, a], [%x, b]
phi2 = phi [%x, c], [%y, d], [%phi1, e]

We'll have the edges g->c := %x, g->d := %y, a->d := %y, e->h := %phi1.

phi1 is postdominated by phi2 so we can expand its edges, and we end up with g->c := %x, g->d := %y, a->d := %y, a->e := %y, f->b := %x.

a has the same value %y on all outgoing edges, so it will get folded away to g->c := %x, g->d := %y, f->a := %y, f->b := %x

@lukel97 lukel97 Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if phi1 has non-foldable values, e.g. phi1 = phi [%random1, a], [%random2, b], then we won't fold and will end up with the a->e / f->b edges. But we cache the edge masks anyway, so phi2 will reuse phi1's edge masks.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if phi1 has non-foldable values...

Yes, that's the scenario I was talking about

But we cache the edge masks anyway, so phi2 will reuse phi1's edge masks.

But we'll still have two operands for the blend instead of just one, won't we?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we'll still have two operands for the blend instead of just one, won't we?

Yes but usually the inner phi won't have any other users and will be removed when dead. If it's not dead then I believe both blends will expand to the same select VPInstruction which cse will eliminate. But we can restrict it to single use phis, that's good enough for the early exit use case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 39472f2

Comment thread llvm/test/Transforms/LoopVectorize/VPlan/predicator.ll
lukel97 added 5 commits June 17, 2026 11:03
* Reuse previous method in DomiananceFrontier
* Replace GetAllEqual with a map_range
After thinking about this for a bit this isn't needed.

If a phi doesn't postdominate an incoming block, the incoming block
will have an outgoing edge with no value. So we won't propagate any
further up that incoming block anyway.

What differs between this approach and
llvm#184838 is that the latter
performs a full inverse DFS to see what blocks are reachable, whereas this
just checks that the incoming values are the same at each
postdominance frontier.

The test case phi_doesnt_postdom_incoming shows a scenario where the
full inverse DFS approach could simplify the edge to just c1 and !c1,
but we calculate the conservative (but still correct) edges in this
PR.
@lukel97 lukel97 force-pushed the loop-vectorize/computeBlendMasks-peek-thru-phis branch from 0c700f0 to 6a4188f Compare June 17, 2026 12:37
@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown

✅ With the latest revision this PR passed the C/C++ code formatter.

@lukel97 lukel97 force-pushed the loop-vectorize/computeBlendMasks-peek-thru-phis branch from 6a4188f to 6e3fea1 Compare June 19, 2026 12:05
@lukel97 lukel97 force-pushed the loop-vectorize/computeBlendMasks-peek-thru-phis branch from 6e3fea1 to 77cdce8 Compare June 26, 2026 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants