decomp+ghidra: fix FUN_0000f7b0 lift for issue #225 #228
Merged
Stack locals that Ghidra mistypes (e.g. `unsigned char[2]` for a 16 KB recv buffer) caused a SIGABRT in clang Sema at INT_ZEXT.

Three layers, matching Ghidra-decomp / IDA Hex-Rays / BN HLIL:

- Producer (PcodeSerializer.java): infer stack-local size from buffer-shaped ABI calls (memset/memcpy/recv/...) and widen the DECLARE_LOCAL type, gated by defensive guards (sibling-view, frame-budget, no-overlap-with-other-locals, zero-offset, size cap, thunk-walk, COPY recursion bound, positive-offset refusal, VariableStorage identity for self-skip).
- Resolution (OpBuilder::create_varnode): when the truthful per-varnode `size` is narrower than the resolved storage, emit `arr[0]` or `*(uintN_t*)&arr` at use-site resolution. Helper uses unsigned arithmetic end-to-end, caps reinterpret width at storage size, requires non-zero array length for subscript path.
- Write side (create_assign_operation): when a scalar→array width matches one element, emit `arr[0] = input` instead of refusing; requires non-zero element count.

LIT fixture: test/patchir-decomp/issue_225_fun_0000f7b0.json pins the post-fix shape: the widening guard refuses here (Ghidra placed overlapping sub-locals), keeping `unsigned char local_4018[2]` and showing `memset(&local_4018, 0, 16384U)` as a visible OOB for downstream verifiers; element-subscript paths preserve byte read/write semantics.

LIT: patchir-decomp 72/72; full suite 179/181 (two unresolved are pre-existing veribin infra).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
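For reviewers who want the resolution and write-side rules above in concrete form, here is a minimal C++ sketch. It is not the code in this PR: `Storage`, `narrow_use`, and `assign_scalar` are hypothetical names, the emitted expressions are built as plain strings rather than AST nodes, and the supported reinterpret widths (1, 2, 4, 8 bytes) are an assumption.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string>

// Hypothetical, simplified view of a resolved stack local
// (not the project's real type).
struct Storage {
    std::string name;          // e.g. "local_4018"
    std::uint64_t elem_size;   // bytes per array element
    std::uint64_t elem_count;  // declared array length; may be 0
};

// Read side: pick an expression that reads `size` bytes from the start of
// `s`, or return nullopt when the width is unsupported or would over-read.
std::optional<std::string> narrow_use(const Storage& s, std::uint64_t size) {
    if (size == 0 || s.elem_size == 0 || s.elem_count == 0)
        return std::nullopt;  // subscript path needs a non-zero array length
    // Unsigned arithmetic throughout; refuse if the total size would overflow.
    if (s.elem_size > std::numeric_limits<std::uint64_t>::max() / s.elem_count)
        return std::nullopt;
    const std::uint64_t total = s.elem_size * s.elem_count;
    if (size > total) return std::nullopt;           // cap at storage size
    if (size == s.elem_size) return s.name + "[0]";  // element subscript
    if (size == 1 || size == 2 || size == 4 || size == 8)  // assumed widths
        return "*(uint" + std::to_string(size * 8) + "_t*)&" + s.name;
    return std::nullopt;
}

// Write side: a scalar whose width matches one element becomes an
// element store; anything else is refused.
std::optional<std::string> assign_scalar(const Storage& s,
                                         std::uint64_t input_size,
                                         const std::string& input) {
    if (s.elem_count == 0 || input_size == 0 || input_size != s.elem_size)
        return std::nullopt;
    return s.name + "[0] = " + input;
}
```

With the fixture's `unsigned char local_4018[2]` (element size 1, count 2), a 1-byte read resolves to `local_4018[0]`, a 2-byte read to `*(uint16_t*)&local_4018`, and a 4-byte read is refused because it would read past the declared storage.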
When the helper returns null (unsupported width, or over-read past storage), create_varnode falls back to the unnarrowed aggregate. A downstream INT_ZEXT/INT_SEXT/CAST then trips assert(Diagnosed && "failed to diagnose bad conversion") in clang Sema, which is indistinguishable from the original #225 SIGABRT.

Emit a WARNING naming storage type, requested size, and upstream op key so the unhandled width is one grep away.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
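A warning carrying the three fields named above might be built along these lines. This is only a sketch: `format_narrowing_warning` is a hypothetical helper, and the message wording is an assumption rather than the text the PR emits.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical formatter for the fallback diagnostic. The exact wording is
// an assumption; only the three fields come from the commit message.
std::string format_narrowing_warning(const std::string& storage_type,
                                     std::uint64_t requested_size,
                                     const std::string& op_key) {
    std::ostringstream os;
    os << "WARNING: create_varnode: cannot narrow storage '" << storage_type
       << "' to " << requested_size << " byte(s) for op " << op_key
       << "; falling back to unnarrowed aggregate";
    return os.str();
}
```

Putting all three fields on one line means a single grep for the op key or the storage type finds the unhandled width.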