fuzz: Add structured graph fuzz harness and fix folding overflows it found #149
Draft
weltling wants to merge 4 commits into
This harness decodes the input blob directly into a structurally valid IR function and runs the full optimization and codegen pipeline on it, so inputs exercise the optimizer and backend. Decoding is incremental: the first byte picks the type and parameter count, then each small group of bytes adds one operation that uses earlier nodes as inputs. The last node becomes the return value. The optimization level is selected at compile time via FUZZ_GRAPH_O0, FUZZ_GRAPH_O1 or FUZZ_GRAPH_O2.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
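For illustration, the decoding scheme described above could look roughly like the following. This is a toy sketch, not the code in fuzz_graph.c: every toy_* name, the 3-byte group size, the bit layout of the first byte and the node limit are assumptions made up for the example, and the real harness builds nodes through the IR builder rather than a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy IR, only for illustrating the byte-to-graph decoding. */
enum toy_op { TOY_PARAM, TOY_ADD, TOY_SUB, TOY_MUL, TOY_OP_COUNT };

struct toy_node {
    enum toy_op op;
    uint32_t in1, in2;   /* indices of earlier nodes (param index for TOY_PARAM) */
};

#define TOY_MAX_NODES 64

struct toy_func {
    uint8_t type;        /* working type selector, 0..3 (assumed encoding) */
    uint32_t count;      /* number of nodes built so far */
    uint32_t ret;        /* index of the node that is returned */
    struct toy_node nodes[TOY_MAX_NODES];
};

/* Decode a blob into a toy function. Returns 0 on success, -1 on empty input. */
static int toy_decode(const uint8_t *data, size_t size, struct toy_func *f)
{
    if (size < 1) {
        return -1;
    }

    /* First byte: low 2 bits pick the type, next 2 bits pick 1..4 parameters. */
    f->type = data[0] & 3;
    uint32_t params = 1 + ((data[0] >> 2) & 3);

    f->count = 0;
    for (uint32_t i = 0; i < params; i++) {
        f->nodes[f->count++] = (struct toy_node){ TOY_PARAM, i, 0 };
    }

    /* Each following 3-byte group adds one binary operation. Input selectors
     * are reduced modulo the current node count, so they always refer to an
     * earlier node and the graph stays structurally valid. */
    for (size_t pos = 1; pos + 3 <= size && f->count < TOY_MAX_NODES; pos += 3) {
        struct toy_node n;
        n.op = (enum toy_op)(TOY_ADD + data[pos] % (TOY_OP_COUNT - TOY_ADD));
        n.in1 = data[pos + 1] % f->count;
        n.in2 = data[pos + 2] % f->count;
        f->nodes[f->count++] = n;
    }

    /* The last node becomes the return value. */
    f->ret = f->count - 1;
    return 0;
}
```

Because every selector is reduced against the nodes that already exist, any byte string of length one or more decodes to a well-formed function, which is the property that lets mutated inputs get past validation. In a libFuzzer harness a decoder of this kind would be called from LLVMFuzzerTestOneInput before handing the function to the pipeline.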
The harness that drives the text parser was named fuzz_ir.c, which did not say what set it apart from fuzz_graph.c since both fuzz the IR. Name it after its input format so the pair reads as text versus graph. Update the build rules to match.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
The two u16 operands are promoted to signed int before the multiply, so a wide product such as 54300 * 54300 overflows int. The result is truncated back to u16 and is correct on any two's complement target, but it is still undefined behavior. Cast both operands to uint32_t so the multiply is unsigned.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
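A minimal sketch of the pattern this commit describes, assuming a 32-bit int. The function name fold_mul_u16 is hypothetical and is not the actual rule in ir_fold.h; it only isolates the cast.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two u16 constants without signed overflow. */
static uint16_t fold_mul_u16(uint16_t a, uint16_t b)
{
    /* Writing a * b directly would promote both operands to signed int, and
     * 54300 * 54300 = 2948490000 does not fit in a 32-bit int. Casting to
     * uint32_t makes the multiply unsigned, where wraparound is well defined.
     * The final cast keeps the low 16 bits, matching the u16 result type. */
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}
```

With these inputs, fold_mul_u16(54300, 54300) yields 25360, which is 2948490000 modulo 65536. That is the same value the overflowing version produced in practice, but without relying on undefined behavior.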
The rules MUL(NEG(x), c) and DIV(NEG(x), c) negate the constant on a signed int64_t, which is undefined behavior for INT64_MIN. Negate through uint64_t instead. The result is unchanged on a release build since the negation of INT64_MIN wraps to itself.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
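The negation fix can be sketched the same way. The helper name neg_i64_wrapping is an assumption for illustration, not a function from ir_fold.h, and the sketch assumes a two's complement target for the conversion back to int64_t.

```c
#include <stdint.h>

/* Hypothetical helper: negate a signed 64-bit constant with wrapping. */
static int64_t neg_i64_wrapping(int64_t c)
{
    /* -c is undefined behavior when c == INT64_MIN, because the mathematical
     * result is not representable in int64_t. Unsigned subtraction wraps
     * modulo 2^64 instead. Converting the out-of-range value back to int64_t
     * is implementation-defined rather than undefined, and on two's
     * complement targets it yields the expected bit pattern. */
    return (int64_t)(0 - (uint64_t)c);
}
```

For INT64_MIN the helper returns INT64_MIN again, which is what "wraps to itself" refers to; for every other input it matches ordinary negation.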
dstogov (Owner) reviewed on Jun 17, 2026 and left a comment:
The ir_fold.h fixes look right. Thanks!
As for the fuzzer, it would be great to separate it completely, moving everything related to it from Makefile to fuzz/Makefile. Is this difficult to do?
If the fuzzer is ever going to execute generated code, it should do so inside a container. This is not necessary right now.
This adds a second libFuzzer harness that fuzzes the IR builder and optimization pipeline directly, plus two folding fixes the harness found.
Harness
The existing harness fuzz_ir.c feeds raw bytes to the text IR parser, so most mutated inputs are rejected before reaching the optimizer. The new harness fuzz_graph.c decodes each input blob into a structurally valid IR function instead, so inputs reach the optimizer and the code emitter. This is the core of the harness. Later features like loops and conditions will be built on top of it.
Decoding maps the leading bytes to a working type and parameter count, then turns each following group of bytes into one type-preserving operation over earlier nodes, ending with a return of the last value. The build targets fuzz-graph-O0, fuzz-graph-O1 and fuzz-graph-O2 select the optimization level. The existing fuzz_ir.c is renamed to fuzz_text.c so the pair reads as text versus graph.
Fixes found
MUL(C_U16, C_U16). Both uint16_t operands promote to int, so the product overflowed signed int. Multiply through uint32_t.
MUL/DIV(NEG, C_I). The constant was negated on a signed int64_t, which is undefined for INT64_MIN. Negate through uint64_t. The release result is unchanged since the negation wraps to itself.
Both were caught by the UBSan build of the harness, and the folded output is identical on a release build.