Add function signature reconstruction for Issue #14 #79

Merged
stevemk14ebr merged 8 commits into mandiant:master from kami922:issue-14-define-apply-structs on Jun 1, 2026

kami922 commented Jan 22, 2026

Contributor
  • Parse FuncType metadata to extract input/output parameter types (including variadic flag).
  • Build C-style function pointer typedefs and emit them in CReconstructed for Func kinds.
  • Represent multiple returns by commenting them in the typedef (C still single-return).
  • Add IDA script scaffolding to ingest function types (signature application is intentionally left as future work).
  • JSON now carries function signatures instead of opaque void* for functions.

closes #14

- Parse FuncType metadata to extract input/output parameter types
- Build C-style function pointer typedefs with proper signatures
- Handle variadic functions and multiple return values
- Add infrastructure in IDA script to receive function types
- Emit CReconstructed field for Func types in JSON output

This enables IDA to import function signatures as types, laying
groundwork for applying types to function arguments/locals.

kami922 commented Jan 22, 2026

Contributor Author

@stevemk14ebr Hello

Based on your review, here is the remaining checklist for the next commit. Let me know if I missed anything.

  • Implement apply_function_type in the IDA script to actually apply reconstructed signatures to funcs/args/locals.
  • Improve multiple-return handling (currently commented in typedef; consider struct/tuple).
  • Add small Go fixtures/tests to assert CReconstructed for funcs and an IDA import smoke test.

stevemk14ebr commented Jan 22, 2026

Collaborator

I like where this is going! We can do better with the argument type and return type parsing, though. Check out this case:

.rodata:00000000005B1F40 RTYPE_funcunsafe_Pointer_comma__unsafe_Pointer_bool RTYPE <8, 8, 6BB69B10h, TFLAG_EXTRASTAR, 8, 8, \
.rodata:00000000005B1F40                                         ; DATA XREF: .rodata:00000000005C7F08↓o
.rodata:00000000005B1F40                                         ; .rodata:00000000005C7F18↓o ...
.rodata:00000000005B1F40                        KIND_FUNC or KIND_DIRECTIFACE, 0, offset runtime_gcbits__ptr_, \
.rodata:00000000005B1F40                        offset byte_5A2644 - offset type__ptr_, 0>
.rodata:00000000005B1F70                 FUNC_TYPE <2, 1>
.rodata:00000000005B1F74                 align 8
.rodata:00000000005B1F78                 dq offset RTYPE_unsafe_Pointer
.rodata:00000000005B1F80                 dq offset RTYPE_unsafe_Pointer
.rodata:00000000005B1F88                 dq offset RTYPE_bool
.rodata:00000000005B1F90                 align 20h

This function takes 2 inputs, has 1 output, and the types of all 3 are present. Our reconstructed typedef is:

{
    VA: 5971776, 
    Str: "func(unsafe.Pointer, unsafe.Pointer) bool", 
    CStr: "func(unsafe_Pointer,_unsafe_Pointer)_bool_funcptr", 
    Kind: "Func", 
    Reconstructed: "", 
    CReconstructed: "typedef void* (*func(unsafe_Pointer,_unsafe_Pointer)_bool_funcptr)(void*, void*)", 
    baseSize: 48, 
    kindEnum: Func (19), 
    flags: tflagExtraStar (2)
}

I'd expect the CStr to be bool (unsafe_Pointer,_unsafe_Pointer), and the Reconstructed to be type funcunsafe_Pointer_comma__unsafe_Pointer_bool(unsafe.Pointer, unsafe.Pointer) bool or something similar. The CReconstructed is wrong: it should be something similar to typedef bool (*func_unsafe_Pointer__unsafe_Pointer_bool_funcptr)(unsafe_Pointer, unsafe_Pointer). In general we try to match the naming scheme of IDA. Reconstructed tries to use Go syntax, and CReconstructed tries to use C or C++ syntax.

So there are some formatting issues with the reconstruction, but the raw parsing logic seems to be mostly working! I agree with all the follow-on work points, especially the one about tests.
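
To make the expected CReconstructed shape concrete, here is a minimal Go sketch that renders a C function-pointer typedef from type names that have already been parsed and sanitized. The helper name cTypedef and its parameters are hypothetical and are not taken from the PR; it only illustrates the target format described in the review.

```go
package main

import (
	"fmt"
	"strings"
)

// cTypedef is a hypothetical helper (not from the PR). It renders a C
// function-pointer typedef from C-safe type names.
func cTypedef(name, ret string, params []string) string {
	if ret == "" {
		ret = "void" // no return value
	}
	args := "void" // C spelling of an empty parameter list
	if len(params) > 0 {
		args = strings.Join(params, ", ")
	}
	return fmt.Sprintf("typedef %s (*%s)(%s)", ret, name, args)
}

func main() {
	// Inputs taken from the review example: two unsafe.Pointer inputs, one bool output.
	fmt.Println(cTypedef(
		"func_unsafe_Pointer__unsafe_Pointer_bool_funcptr",
		"bool",
		[]string{"unsafe_Pointer", "unsafe_Pointer"},
	))
}
```

For the review example this prints typedef bool (*func_unsafe_Pointer__unsafe_Pointer_bool_funcptr)(unsafe_Pointer, unsafe_Pointer), which is the form asked for above.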

For multiple return values we can use either structures or tuples. The project uses structures now, but IDA recently added tuples for Go, so let's prefer those for this function type work. https://hex-rays.com/blog/stop-guessing-and-start-going

Actually applying these signatures via IDA scripting can be quite tricky since the types recursively depend on other types. I am completely OK with leaving the IDA script unimplemented for now. I care most about reasonable and correct symbol recovery. We can do the IDA script work later.

Fix param array offset from baseSize+4 to baseSize+ptrSize to account
for struct padding on 64-bit. Add tflagUncommon handling (+16 bytes).
Reformat CStr as "returnType (params)", add Reconstructed with Go
syntax, fix CReconstructed with actual parsed types and clean funcptr
name. Collect both Go and C type names during param parsing.
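
The offset fix in this commit message can be sketched roughly as follows. This is not the PR's code: parseFuncLayout and funcLayout are hypothetical names, little-endian data is assumed, and the use of the top bit of outCount as the variadic marker is an assumption based on how recent Go runtimes encode it. The sketch reads the two uint16 counts that follow the base rtype and then skips to the parameter pointer array at baseSize+ptrSize, plus 16 bytes when tflagUncommon is set.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	tflagUncommon = 1 << 0  // bit the Go runtime uses for "has uncommon type"
	uncommonSize  = 16      // the "+16 bytes" from the commit message
	variadicBit   = 1 << 15 // assumption: top bit of outCount marks a variadic func
)

// funcLayout is a hypothetical helper type, not part of the PR.
type funcLayout struct {
	In, Out  int
	Variadic bool
	ParamOff uint64 // offset of the first parameter pointer from the start of the rtype
}

// parseFuncLayout reads inCount/outCount, which sit right after the base rtype,
// and computes where the parameter type pointers begin.
func parseFuncLayout(raw []byte, baseSize, ptrSize uint64, tflag uint8) (funcLayout, error) {
	if uint64(len(raw)) < baseSize+4 {
		return funcLayout{}, fmt.Errorf("need %d bytes, have %d", baseSize+4, len(raw))
	}
	in := binary.LittleEndian.Uint16(raw[baseSize:])
	out := binary.LittleEndian.Uint16(raw[baseSize+2:])

	// The two counts occupy 4 bytes; the struct is then padded to pointer size.
	off := baseSize + ptrSize
	if tflag&tflagUncommon != 0 {
		off += uncommonSize
	}
	return funcLayout{
		In:       int(in),
		Out:      int(out &^ variadicBit),
		Variadic: out&variadicBit != 0,
		ParamOff: off,
	}, nil
}

func main() {
	// Mirrors the first listing: a 48-byte rtype followed by FUNC_TYPE <2, 1>.
	raw := make([]byte, 56)
	binary.LittleEndian.PutUint16(raw[48:], 2)
	binary.LittleEndian.PutUint16(raw[50:], 1)

	l, err := parseFuncLayout(raw, 48, 8, 2) // tflag 2 = tflagExtraStar only
	if err != nil {
		panic(err)
	}
	fmt.Printf("in=%d out=%d variadic=%v first param at 0x%X\n",
		l.In, l.Out, l.Variadic, 0x5B1F40+l.ParamOff)
}
```

With the values from the first listing the computed address is 0x5B1F78, which is where the first dq offset RTYPE_unsafe_Pointer entry appears. On 32-bit targets baseSize+ptrSize and baseSize+4 coincide, which is why the old offset only broke on 64-bit.
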
@stevemk14ebr

Collaborator

The CStr output formatting is inconsistent. See this case:

 RTYPE_func_int_comma__bool RTYPE <8, 8, 5B738012h, TFLAG_EXTRASTAR, 8, 8, \
.rdata:00000000004B2E40                                         ; DATA XREF: .rdata:00000000004B7928↓o
.rdata:00000000004B2E40                                         ; .rdata:00000000004B7930↓o ...
.rdata:00000000004B2E40                        KIND_FUNC or KIND_DIRECTIFACE, 0, offset unk_4E7708, \
.rdata:00000000004B2E40                        offset byte_4AA839 - offset byte_4A6000, 0>
.rdata:00000000004B2E70                 FUNC_TYPE <0, 2>
.rdata:00000000004B2E74                 align 8
.rdata:00000000004B2E78                 dq offset RTYPE_int
.rdata:00000000004B2E80                 dq offset RTYPE_bool
.rdata:00000000004B2E88                 align 20h
objfile.Type {
    VA: 4927040,
    Str: "func() (int, bool)",
    CStr: "void (void)",
    Kind: "Func",
    Reconstructed: "func() (int, bool)",
    CReconstructed: "typedef tuple(int, bool) (*func_int_bool_funcptr)(void)",
    baseSize: 48,
    kindEnum: Func (19),
    flags: tflagExtraStar (2)
}

The type recovery indicates 2 output arguments, but the CStr reconstruction is void (void).

Handle multiple returns using tuple syntax in CStr field.
For func() (int, bool), CStr now shows 'tuple(int32, bool) (void)'
instead of incorrectly showing 'void (void)'.
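
A minimal sketch of the return-type selection this commit describes, assuming the parsed parameter and return type names are already available as string slices. The function name formatCStr is hypothetical, and the separators (",_" between parameters, ", " inside tuple(...)) are copied from the examples in this thread rather than from the merged code.

```go
package main

import (
	"fmt"
	"strings"
)

// formatCStr is a hypothetical helper (not from the PR). It builds the
// "returnType (params)" form and wraps multiple returns in tuple(...).
func formatCStr(rets, params []string) string {
	var ret string
	switch len(rets) {
	case 0:
		ret = "void"
	case 1:
		ret = rets[0]
	default:
		ret = "tuple(" + strings.Join(rets, ", ") + ")"
	}
	args := "void"
	if len(params) > 0 {
		args = strings.Join(params, ",_")
	}
	return ret + " (" + args + ")"
}

func main() {
	fmt.Println(formatCStr([]string{"int", "bool"}, nil))
	fmt.Println(formatCStr([]string{"bool"}, []string{"unsafe_Pointer", "unsafe_Pointer"}))
}
```

The key point is that the return side is chosen from the number of parsed outputs, so a FUNC_TYPE <0, 2> entry can no longer collapse to void (void).
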

kami922 commented Feb 9, 2026

Contributor Author

@stevemk14ebr Hello, just dropping by to remind you about the PR.

stevemk14ebr commented Feb 9, 2026

Collaborator

Hi, I have other work taking priority. I will review when able, thanks!

kami922 commented Feb 20, 2026

Contributor Author

@stevemk14ebr Hello, dropping by again for a review whenever convenient.

kami922 commented Feb 24, 2026

Contributor Author

@stevemk14ebr Hello, can you please review this as well?

stevemk14ebr commented Feb 25, 2026

Collaborator

I'm aware of this. This one's a bit bigger, so I have to set aside some time to review it with the focus needed.

@Krish-Anand-dev

Hello @kami922 and @stevemk14ebr,

I was reading through this PR and noticed one issue that doesn't appear to have been addressed yet.
In commit 63d3322, funcPtrName for CReconstructed is derived from origCStr:

// Save original CStr (the C-safe type name) before overwriting
origCStr := _type.CStr

(*_type).CStr = returnCStr + " (" + argsCStr + ")"

// ... later ...

funcPtrName := strings.NewReplacer(
    "(", "_", ")", "_", ",", "_",
).Replace(origCStr)

However, origCStr is captured after the fallback (*_type).CStr = "void*" at the top of the Func case, but before the real CStr is computed. This means origCStr is always "void*", so every func typedef ends up named void_funcptr regardless of its actual signature, rather than something like func_unsafe_Pointer__unsafe_Pointer_bool_funcptr as shown in @stevemk14ebr's review example. The name should be derived from _type.Str (sanitized) instead.

kami922 added 2 commits May 27, 2026 18:03
Test CStr output for multi-return function types using naturally
occurring types from stdlib (io.Writer, fmt package usage):
- func([]uint8) (int, error) asserts tuple(int, error) format
- func() (int, bool) asserts tuple( syntax for multi-return

Verified passing across all 22 Go versions (1.5-1.26)

kami922 commented May 27, 2026

Contributor Author

@stevemk14ebr I have added tests. They cover:
Single param, multi-return: func([]uint8) (int, error)
Zero params, multi-return: func() (int, bool)

I had planned to cover variadic functions as well, but I don't think they occur in the test binaries.
Let me know your thoughts.

@stevemk14ebr

Collaborator

Some minor adjustments I'd like:

  • Please fix the format that is breaking the lint
  • Please add a test condition that includes a _ptr_ and one that includes an unsafe_Pointer. This will help catch some obvious breakages in the future with recursive types

It would be better if our tests were more complete and looked for more than just the presence of some correct types, but I recognize this is difficult to test, so I will not require more precise tests for this feature.

- Remove trailing whitespace in objfile.go (gofmt lint fix)
- Add test assertions for Func types containing unsafe_Pointer
  and _ptr_ in CStr to catch breakage in recursive type parsing
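
The presence-style assertions described here could look roughly like the sketch below. The recoveredType struct and hasFuncCStr helper are hypothetical stand-ins for the project's real types and test code, and the second fixture entry is made up for illustration; only the Kind and CStr field names come from the output shown earlier in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// recoveredType is a hypothetical stand-in for the tool's recovered type
// record; only the fields needed for the check are included.
type recoveredType struct {
	Kind string
	CStr string
}

// hasFuncCStr reports whether any Func type has a CStr containing substr.
func hasFuncCStr(types []recoveredType, substr string) bool {
	for _, t := range types {
		if t.Kind == "Func" && strings.Contains(t.CStr, substr) {
			return true
		}
	}
	return false
}

func main() {
	types := []recoveredType{
		{Kind: "Func", CStr: "bool (unsafe_Pointer,_unsafe_Pointer)"}, // from the review example
		{Kind: "Func", CStr: "int (_ptr_os_File)"},                    // made-up fixture
		{Kind: "Struct", CStr: "_ptr_ignored"},                        // non-Func entries must not count
	}
	for _, want := range []string{"unsafe_Pointer", "_ptr_"} {
		fmt.Printf("%s present: %v\n", want, hasFuncCStr(types, want))
	}
}
```

A check like this only proves that at least one matching type was recovered, which is the limitation acknowledged in the review.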

kami922 commented May 29, 2026

Contributor Author

@stevemk14ebr Hello, addressed in the new commit. Let me know what to do next.

Comment thread main_test.go Outdated
Comment thread main_test.go Outdated
stevemk14ebr dismissed their stale review June 1, 2026 13:40

resolved

stevemk14ebr commented Jun 1, 2026

Collaborator

@kami922 please either complete apply_function_type or remove this function and the type map being built up in the python script. After that I will merge 🥳

kami922 requested a review from stevemk14ebr June 1, 2026 18:06
stevemk14ebr merged commit c69cbd6 into mandiant:master Jun 1, 2026
8 checks passed

Development

Successfully merging this pull request may close these issues.

Define And Apply Structs

3 participants