[Codegen] Use default read semantics for LinalgExt scatter#24504
Open
moomoohorse321 wants to merge 1 commit into
Open
[Codegen] Use default read semantics for LinalgExt scatter#24504moomoohorse321 wants to merge 1 commit into
moomoohorse321 wants to merge 1 commit into
Conversation
This removes the ScatterOp-specific no-read override in the LinalgExt bufferization external model and uses the default DestinationStyleOpInterface read semantics instead. Signed-off-by: Yuwei Sun <yuweis2@illinois.edu> Signed-off-by: Hao Ren <rhao8608@gmail.com> Co-authored-by: Yuwei Sun <yuweis2@illinois.edu>
Contributor
|
I think it might be a good idea to create that followup copy elision pass as I suspect this might have performance implications. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The original impl. will introduce critical but very subtle bugs. We identified the bug when running Qwen-3.5, with decode-prefill-decode pattern. Because of the problem detailed below, the 2nd decode are completely wrong.
Problem
iree_linalg_ext.scatterwas special-cased in the LinalgExt bufferizationexternal model as not reading its DPS init operand which is not correct.
Fix
Remove the ScatterOp-specific no-read override and use the default
DestinationStyleOpInterface bufferization read semantics.
After this change, out-of-place bufferization preserves the init value before
scatter updates it:
Minimal Repro
A simple overwrite scatter still needs to preserve elements that are not updated:
Example shape:
original: 32 x 1 x 512
updates : 11 x 1 x 512
indices : 11 rows
result : 32 x 1 x 512
Scatter writes only 11 rows. The remaining 21 rows must come from original.
Before this change, bufferization thinks
outs(%original)as a destinationplaceholder instead of a read. So it stores tmp[32][1][512] to output
Analysis
I propose that we remove the override because scatter op should be bufferized as reads except for some small corner cases. Overriding the scatter op does not preserve correctness.
Let's analyze the following cases for scatter:
updatesis read: scatter must read the update value.indicesis read: scatter must read indices to know where to write.mask, if present, is read: a false mask suppresses the update and preservesthe original value.
indicesmust be inherited from original;yield update + old;unique_indices(false)combine/reduction-style scatter also needs theold/current value.
The only case where init may not need to be read is a separately proven
full-overwrite optimization: indices cover the entire result, mask cannot
suppress updates, and the combiner does not use the old value. That should be
handled by a dedicated copy-elision/full-overwrite analysis, not by the default
bufferization semantics.
Tests
Added LLVMGPU bufferization tests for: