[VPlan] Peek through nested phi incoming values in computeBlendMasks#203164
[VPlan] Peek through nested phi incoming values in computeBlendMasks#203164lukel97 wants to merge 18 commits into
Conversation
We don't need it for now
|
|
||
| // If the value is a phi postdominated by VPBB, then look through the inner | ||
| // incoming values instead of propagating the phi. | ||
| if (auto *Phi = dyn_cast<VPPhi>(Common)) |
There was a problem hiding this comment.
I wonder if that's always beneficial. What about something like (maybe need slightly more complicated than that)
o
/ \
/ o
/ / \
o o o
/ \ / \ /
o o o (phi)
\ | /
o (final blend)
IIUC, once we add edges leading to (phi), we won't be able to "fold" them away using post-do in this scenario.
There was a problem hiding this comment.
I think I'm missing something, why can't we fold them away? E.g. if we take something like
o
/ \
/ f
/ / \
g a b
/ \ / \ /
c d e (phi1)
\ | /
h (phi2)
phi1 = phi [%y, a], [%x, b]
phi2 = phi [%x, c], [%y, d], [%phi1, e]
We'll have the edges g->c := %x, g->d := %y, a->d := %y, e->h := %phi1.
phi1 is postdominated by phi2 so we can expand its edges, and we end up with g->c := %x, g->d := %y, a->d := %y, a->e := %y, f->b := %x.
a has the same value %y on all outgoing edges, so it will get folded away to g->c := %x, g->d := %y, f->a := %y, f->b := %x
There was a problem hiding this comment.
And if phi1 has non-foldable values, e.g. phi1 = phi [%random1, a], [%random2, b], then we won't fold and will end up with the a->e / f->b edges. But we cache the edge masks anyway, so phi2 will reuse phi1's edge masks.
There was a problem hiding this comment.
And if phi1 has non-foldable values...
Yes, that's the scenario I was talking about
But we cache the edge masks anyway, so phi2 will reuse phi1's edge masks.
But we'll still have two operands for the blend instead of just one, won't we?
There was a problem hiding this comment.
But we'll still have two operands for the blend instead of just one, won't we?
Yes but usually the inner phi won't have any other users and will be removed when dead. If it's not dead then I believe both blends will expand to the same select VPInstruction which cse will eliminate. But we can restrict it to single use phis, that's good enough for the early exit use case.
* Reuse previous method in DomiananceFrontier * Replace GetAllEqual with a map_range
After thinking about this for a bit this isn't needed. If a phi doesn't postdominate an incoming block, the incoming block will have an outgoing edge with no value. So we won't propagate any further up that incoming block anyway. What differs between this approach and llvm#184838 is that the latter performs a full inverse DFS to see what blocks are reachable, whereas this just checks that the incoming values are the same at each postdominance frontier. The test case phi_doesnt_postdom_incoming shows a scenario where the full inverse DFS approach could simplify the edge to just c1 and !c1, but we calculate the conservative (but still correct) edges in this PR.
0c700f0 to
6a4188f
Compare
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
6a4188f to
6e3fea1
Compare
6e3fea1 to
77cdce8
Compare
Stacked on #201783
If we have a phi with an incoming value that is also a phi, and that inner phi postdominates the original phi, then we can "unfold" that inner phi and look through its incoming values.
This is needed to fold away the blend masks entirely when the SSA is repaired in #201784.
The two other changes in this PR:
if (InValEdges.size() == 1)check.handleFindLastReductionsto be updated. For now, just ignore any phis that are a part of a reduction chain so we can focus on them in a later PR.