1-3-7/disrobe

disrobe: decompile, deobfuscate, and unpack almost anything, deterministically, in a single Rust binary

CI Docs Release License: Elastic-2.0 Rust 1.95+

One tool to decompile, deobfuscate, and unpack almost anything, deterministically, in a single Rust binary.

disrobe strips the bytecode, packers, freezers, and protectors layered onto compiled and frozen software across 20+ ecosystems, then proves what it recovered against an independent oracle. Recovered Python is recompiled and diffed opcode-for-opcode in CI; unpacked bytes are byte-compared to the original; recovered Android, WebAssembly, and Lua are re-run through the real JVM verifier, wasmtime, and lua. No model in the loop, nothing to drift or contaminate a benchmark. Built for malware analysis, CTFs, IP recovery, and security research.

Try it in your browser: 1-3-7.github.io/disrobe/playground. Decompile a .pyc, scan a pickle for malicious reduce callables, and summarize a .wasm module, all client-side. The analysis passes are compiled to WebAssembly; nothing is uploaded.

Why disrobe

  • The only decompiler that proves its own output. Recovery is measured against an independent reference, never the tool grading itself: recovered Python recompiled and diffed opcode-for-opcode (92.76% per-code-object on the real CPython 3.14 stdlib), recovered Android re-verified by real java -Xverify:all (99% of verifiable classes), recovered WebAssembly re-executed under wasmtime (24 of 24 numeric-ABI functions equivalent), recovered Lua re-run under real lua.
  • Broadest ecosystem coverage in a single binary. 20+ ecosystems, wider than any other reverse-engineering tool, with no JVM, Python, or Docker runtime to install. cargo build --release and go.
  • Every family named, with its honest level. 27 native packers, 20 .NET protectors, 18 Python source obfuscators, 14 Lua families, 19 shell families, 9 JVM/Android obfuscators, 8 Android RASP vendors, 4 WASM reversers, and 98 container formats, each tagged recovered, partial, or detect-only with the reason stated.
  • A reverse-engineer's toolkit, not just a decompiler. A queryable-IR layer (disrobe query: calls-to, xrefs-to, string-decoders, complexity-over, capability sites) over stripped code, a disrobe capabilities rule engine that maps behavior to MITRE ATT&CK and Malware Behavior Catalog IDs with per-instruction evidence, an MBA simplifier, emulation-driven stack-string recovery, and in-tree extraction of 98 container, filesystem, and firmware formats with no external unzipper.
  • Recon over recovered source, not a shell grep. disrobe frisk surfaces leaked secrets, API endpoints, cloud buckets, Android manifest exposure, and IOCs with file, line, and column. Because disrobe recovers the real source first, frisk searches truth, and it is encoding-safe.
  • Deterministic, no LLM. Identical input yields identical output on every machine and run. Lossy results carry a measured score that is never rounded in the tool's favor.
  • Chain auto-detection. disrobe auto fingerprints the input and composes the whole pipeline: PE -> UPX -> demangle, APK -> dex -> Java, PyInstaller -> PyArmor -> .pyc decompile. APK bundles (.apkm, .xapk, .aab) route by structure straight to the android/dex path.
  • Written from scratch, not a wrapper. Every decompiler, unpacker, and container parser is original Rust, over 637,000 lines of it: the Python, JVM/Dalvik, .NET CIL, WebAssembly, Lua, and Go engines, the native stub-emulator unpackers, the in-tree disassembler and queryable-IR layer, and the 98-format container superset. Ghidra, CFR, jadx, ILSpy, and de4dot are optional --backend fallbacks, off by default, that a normal run never invokes.
  • Built for pipelines. JSON/NDJSON/SARIF output, a content-addressed .dr cache, an HTTP/gRPC/LSP/MCP daemon, and typed Python bindings. Fully offline, zero telemetry.

Full documentation: 1-3-7.github.io/disrobe

What it does

disrobe removes obfuscation, freezing, packing, and protection from a binary so its behavior can be read statically, without execution. Nothing in the pipeline is statistical, so there is no learned model to drift, retrain, or contaminate with a benchmark. The suite compiles to a single static binary with no JVM, Python, or Docker dependency: cargo build --release, then run it headlessly in CI.

The Python, JVM/Kotlin, Dalvik, .NET CIL, and WebAssembly decompilers are written in Rust and ship as the product. CFR, Vineflower, Procyon, jadx, ILSpy, dnSpy, and de4dot are optional --backend fallbacks, off by default. On native PE/ELF/Mach-O it does not attempt raw decompilation; it unpacks, recovers symbols, and resolves packer chains, then hands Ghidra, IDA, or Binary Ninja cleaner input.

Every artifact is content-addressed and persisted as a .dr envelope (rkyv payload, postcard sidecar, BLAKE3 root), so cache hits are byte-identical and chains compose offline. Recovery is measured against an independent reference: recovered Python is recompiled on the matching interpreter and diffed opcode-for-opcode, and unpacked bytes are compared to the original. Lossy results carry their measured score under SEMANTIC, PARTIAL, or SKELETON. Whatever cannot be fully recovered is reported as detect-only. Any pass can also emit an --llm metadata sidecar (call graph, types, control flow, capability surface, provenance).
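The content-addressing idea can be illustrated in a few lines. This is a minimal sketch, not disrobe's envelope format: `ContentStore` is a hypothetical name, and `hashlib.blake2b` stands in for BLAKE3 only because BLAKE3 is not in the Python standard library.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is a digest of the payload itself."""

    def __init__(self):
        self._blobs = {}

    @staticmethod
    def address(payload: bytes) -> str:
        # Stand-in digest (assumption); the README names BLAKE3 for the real envelope.
        return hashlib.blake2b(payload, digest_size=32).hexdigest()

    def put(self, payload: bytes) -> str:
        key = self.address(payload)
        # Storing identical bytes twice is a no-op, so a hit is byte-identical.
        self._blobs.setdefault(key, payload)
        return key

    def get(self, key: str) -> bytes:
        payload = self._blobs[key]
        # Re-hash on read so a corrupted entry is never served silently.
        if self.address(payload) != key:
            raise ValueError("cache entry does not match its address")
        return payload


store = ContentStore()
k1 = store.put(b"recovered module bytes")
k2 = store.put(b"recovered module bytes")
print(k1 == k2)
```

Because the key depends only on the bytes, two machines that produce the same artifact also produce the same key, which is what lets chained stages compose offline.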

Proof, not claims

Most tools cover one ecosystem and trust their own output. disrobe covers 20+ and proves what it recovers against an independent oracle, in CI, with no model in the loop. Every number here is cited per value in xtask/data/recovery.json and reproduced by a committed test gate [CI] or a local measurement harness [local].

  • Defeats real obfuscators, graded against the clean original. 18 families are proven on genuine tool output, each peeled and scored against the pre-obfuscation source or bytes, never the tool's own report: javascript-obfuscator, garble, ConfuserEx2, PyArmor, Invoke-Obfuscation, ProGuard, Prometheus, MPRESS, python-minifier, yGuard, Obfuscar, Kramer, Bashfuscator, PyInstaller, JSConfuser, wasm-name-obfuscation, Nuitka, JBCO, and R8. Native format coverage carries the same real-fixture proof across ELF, PE, COFF, Mach-O, EFI, STABS, Win16-NE, OS/2-LX, and AVR.
  • Verified Python, not guessed. A deterministic CPython decompiler (1.0-3.15) whose output is recompiled and diffed opcode-for-opcode: 92.76% per-code-object equivalence on the real 3.14 stdlib (CPython 3.14.5, 5831 of 6286 code objects), above a 90% floor enforced in CI. uncompyle6 stops at 3.8 and the LLM-based tools self-flag benchmark contamination; there is no model here to contaminate.
  • Android bodies the JVM itself signs off on, at production scale. Recovered DEX methods are re-hosted and run through real java -Xverify:all, not a self-report: 102 of 103 classes (99% of verifiable) and 307 isolated bodies pass in CI, with a live-range-splitting pass for registers that carry conflicting types across control-flow joins. The in-house decompiler is the default, not a jadx wrapper: 97.7% of methods recompile on the construct megafixture, and a real 131 MB Shopify APK recovers 99.9% of methods (62,953 classes, 380,525 methods) in one run [local].
  • WASM you can re-run. It lifts WebAssembly to typed Rust, TypeScript, WAT, or C with DWARF recovery, then proves the lift by re-executing original against recovered under wasmtime: 24 of 24 numeric-ABI functions execution-equivalent (1 also byte-identical in memory), the rest op-coverage-only; all 94 functions in the parseable corpus at full op-coverage.
  • Real Lua VM devirtualization, proven by execution. It reverses genuine IronBrew2 2.7.0 output (standard and MAX mode) and confirms it with a real-lua execution differential against the original, on a committed real-sample corpus.
  • .NET virtualization, lifted back to CIL. Beyond detecting 20 protectors, it reverses ConfuserEx2 constant decryption on a real committed sample, devirtualizes the Eazfuscator VM tier at 57 of 57 instructions against an in-repo virtualizer, and devirtualizes the KoiVM tier on a sample built by the real KoiVM tool (6 of 6 bodies lifted to CIL, graded against the clean baseline).
  • PyArmor 8/9 unpacked statically, end to end. Runtime-key extraction, AES decrypt, and source recovery: 72 of 72 local samples recovered, each validated to a pre-obfuscation identifier in CI. The v3-v5 RSA-wrapped-key tier is an information-theoretic wall, stated as one.
  • Real packer stubs, emulated and sliced. An in-house x86 stub emulator unpacks the UPX/ASPack/PECompact/Yoda's-Crypter tier in one static Rust binary with no Python or unicorn dependency, scored byte-for-byte against real committed originals (UPX .text and .pdata byte-identical; Yoda's Crypter .text decrypted to full plaintext).
  • 98 archive formats, no external unzipper. It detects 98 container, archive, filesystem, and firmware formats and writes member bytes in-tree for all 98, from bare gz/zstd/lz4 streams down to FAT-walked disk-image partitions, embedded-linux filesystems (squashfs, erofs, jffs2, ubifs, yaffs), and vendor firmware decryptors (D-Link AES, EnGenius XOR, QNAP PC1), with a recursive carve-everything engine and no 7-zip or binwalk shellout.

Where prior FOSS already does a thing well, disrobe says so: it credits GoReSym/redress for the Go pclntab layer (its own edge there is static garble -literals init-thunk emulation), de4dot for full ConfuserEx2 cleanup, and FFDec/Ghidra/IDA where a mature dedicated tool wins.
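The opcode-for-opcode oracle described in this section can be approximated with Python's own `dis` module. The sketch below is an illustration under assumptions, not disrobe's harness: it compiles both sides from source on the running interpreter, whereas the README describes diffing against the original bytecode on the matching interpreter. The function names are hypothetical.

```python
import dis


def code_objects(code):
    """Return a code object followed by every nested code object, in order."""
    found = [code]
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            found.extend(code_objects(const))
    return found


def opcodes(code):
    """Flatten one code object to (opname, argument) pairs."""
    ops = []
    for ins in dis.get_instructions(code):
        arg = ins.argval
        if hasattr(arg, "co_code"):
            arg = "<code>"  # nested code is compared separately, not by identity
        ops.append((ins.opname, arg))
    return ops


def equivalence(original_src, recovered_src):
    """Count code objects whose opcode streams match exactly."""
    a = code_objects(compile(original_src, "<original>", "exec"))
    b = code_objects(compile(recovered_src, "<recovered>", "exec"))
    matched = sum(1 for x, y in zip(a, b) if opcodes(x) == opcodes(y))
    return matched, max(len(a), len(b))


ORIGINAL = "def f(x):\n    return x * 2 + 1\n"
RECOVERED = "def f(x):\n    # layout differs, semantics do not\n    return (x * 2) + 1\n"
WRONG = "def f(x):\n    return x * 2 - 1\n"

print(equivalence(ORIGINAL, RECOVERED))
print(equivalence(ORIGINAL, WRONG))
```

Scoring per code object rather than per file means one badly recovered function lowers the score without hiding the functions that did round-trip.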

Supported targets

Every supported packer, obfuscator, protector, format, and runtime, grouped by ecosystem and read straight from the in-tree catalog. Each family carries an honest level:

  • Recover is real recovered output (source, bytes, or structure) on the run path.
  • Partial is structural peel or constant/string recovery with a stated residual.
  • Detect-only is identification plus a reason the rest cannot be recovered statically (a runtime key, a live process, or a network-fetched payload), stated honestly and never faked.

Python

| Surface | Coverage |
| --- | --- |
| Bytecode decompile | In-house Rust decompiler for CPython 3.6-3.15 (recompile-verified per construct), plus a legacy 1.0-3.7 band (CI floor 152 of 191 verified, 166 of 191 local with the full interpreter zoo). Recovers match, walrus, f/t-strings (PEP 750), exception groups, and PEP 695/696/709. Detects PyPy, MicroPython .mpy, Jython, IronPython, and Brython runtimes. |
| Freezers | PyInstaller 2.x-6.20+, Nuitka (onefile/standalone/module/wheel; byte-exact unpack, native bodies lossy), cx_Freeze, py2exe, PyOxidizer, shiv, pex, Briefcase, SourceDefender .pye (in-house AES-256-CTR + BLAKE2b decrypt, real-corpus validated). |
| Protector | Recover: PyArmor v6-v9-pro (default, super, no-wrap; recovered 72 of 72 samples in a local corpus). Detect-only: the v3-v5 RSA-wrapped-key tier is a runtime-key wall. |
| Source obfuscators (18) | Recover to source via an AST-evaluator backend: Kramer/Specter, Berserker, Jawbreaker, BlankOBF, PlusOBF, Wodx, pyobfuscate.com, PyObfuscator (mauricelambert), ObfuXtreme, Manglify, Oxyry, pyminifier, Xindex, Patchwork, and the online-obfuscator family. Partial (detect + peel): python-obfuscator (PyPI), pyobfus, Pypacker. Hyperion, Pyfuscator, Opy, and generic exec/eval droppers plus marshal/base64/zlib/lzma packers are unwrapped by chain. Jawbreaker's b16/b32/b64 loader shell is decoded statically; a payload it fetches from a remote paste at run time is absent from the file. ObfuXtreme's AES-CBC/b85/xor static body is recovered; its runtime-payload segment is not in the artifact. |
| Pickle | Static disasm + symbolic-VM trace + safety grading + polyglot and ML-model detection. Never unpickles. |

JavaScript / TypeScript / WebAssembly

| Surface | Coverage |
| --- | --- |
| JS obfuscators | Recover: obfuscator.io (full pipeline), JS-Confuser, Jscrambler. Partial: js-obfuscator (jsobfu). Scope-aware renaming, control-flow unflattening, and an MBA simplifier throughout. |
| JS esoteric encoders | Recover: JSFuck, aaencode, jjencode, JSFireTruck, and the Dean Edwards p,a,c,k,e,r. atob/base64 and eval/Function indirection are folded back where the indirection is static. |
| JS protectors | Partial (detect + peel where static): JSDefender, Arxan/Digital.ai. Detect-only: PACE. |
| JS bundlers (11) | Un-bundled with source-map reconstruction: webpack 4, webpack 5, Vite, Rollup, Rolldown, esbuild, Turbopack, Bun, Parcel, Browserify, SystemJS. |
| V8 / Bytenode | Self-contained static recovery of the .jsc user-string layer plus structure, Node single-executable-application (SEA) blob carve, and Node 18-24 version detection. Offline, no patched V8 binary. |
| WebAssembly | Lift to typed Rust, TypeScript, WAT, or C with DWARF recovery. GC, component model, threads, SIMD, tail-call, memory64. Recover (4 reversers): Jscrambler-WASM, Wobfuscator, Tigress-via-Emscripten, wasm-mixer. Detect-only: wasm-name-obfuscator (its hex renames destroy the original names). |

JVM / Kotlin / Android / .NET

| Surface | Coverage |
| --- | --- |
| JVM / Kotlin / Scala | In-house Rust decompiler for classfile 1.0.2-25 as the default. At least 93.1% of EdgeCases methods recompile error-free under real javac (CI floor 122 of 131; 128 measured on JDK 25). ProGuard/R8 mapping replay (overload-correct name restore). Optional --backend cfr\|vineflower\|procyon\|jadx. |
| Android / DEX | In-house Rust decompiler for DEX 1.0-16. 102 of 103 recovered classes (99% of verifiable) pass the real JVM verifier (-Xverify:all) on the committed corpus, with a live-range-splitting pass for registers carrying conflicting types across joins. APK signature v1-v4 verify, BlackObfuscator control-flow deflatten. |
| JVM / Android obfuscators (9) | Recover: ProGuard/R8 name restore. Partial (detect + structural peel, with in-class string-decrypt emulation for keyed-constant variants): Zelix KlassMaster, Allatori, Stringer, DashO, DexGuard. Detect-only: yGuard, SkidSuite2, JBCO. |
| Android RASP (8 vendors, detect-only) | Promon SHIELD, Guardsquare DexGuard RASP, Guardsquare ThreatCast, Appdome Mobile Shield, OneSpan (Vasco), Arxan/Digital.ai, Zimperium zShield, Licel DexProtector. |
| .NET / CIL (20 protectors) | In-house CIL to C#/F#/VB; full PE + CLR + table-stream parser, R2R + native-AOT classify. Reverse on real samples: ConfuserEx2 constant decryption (real committed sample), the Eazfuscator VM tier (57 of 57 instructions, re-injected to run byte-identical to baseline), the KoiVM VM tier (6 of 6 bodies lifted to CIL). Detect + structural peel: ConfuserEx, SmartAssembly, Babel, Crypto Obfuscator, .NET Reactor, Agile.NET, Dotfuscator, Dotfuscator CE, DeepSea, Spices.Net, Skater, Goliath, ArmDot, Obfuscar. Detect-only (runtime-key wall): ILProtector, MaxToCode, and the Themida/.NET wrapper derive their per-method key in a native loader absent from the artifact. Full ConfuserEx2 cleanup can delegate to --backend de4dot\|ilspy\|dnspy. |

Native (PE / ELF / Mach-O / COFF)

| Surface | Coverage |
| --- | --- |
| Symbols and structure | DWARF/PDB/STABS across x86/ARM/RISC-V/MIPS/PowerPC/SPARC/eBPF. Rust + C++ + Swift + Itanium demangle and restoration; C++ RTTI/vtable and class-hierarchy recovery from the in-memory layout. Not a raw SSA decompiler (native decompile orchestrates headless Ghidra for that): this is the unpack, symbol-recovery, disassembly, and chain-detect layer that feeds Ghidra/IDA cleaner input. |
| Disassembler | An in-tree iced-backed disassembler discovers functions without symbols, builds the whole-program call graph and per-function CFG (native disasm --emit cfg-dot), and renders Intel/AT&T/NASM/MASM listings with per-instruction register, memory, and rflags effects. native callgraph, native patch, native sigmaker, and native diff (relocation-invariant function matching) all work on stripped input. |
| Packers (27 families) | Recover (byte-level, in-house x86 stub emulator): UPX (.text and .pdata byte-identical, ~96% whole loaded image), kkrunchy classic (byte-exact), NSPack (~99% content-section), Petite, MPRESS, MEW, FSG, ASPack, PECompact (recovered .text byte-identical, import table >=98%), Yoda's Crypter (.rsrc byte-identical, .text decrypted to full plaintext). Partial (stub emulator validated on a spec stub, real-sample recovery pending): ASProtect, Morphine, nPack, NeoLite, PolyCryptor, Warzone Crypter. Detect + carve (virtualized, original code not recoverable by unpacking): VMProtect, Themida, Yoda's Protector. Detect-only: WinLicense, Enigma Protector, Armadillo, Obsidium, PE-Protector, PELock, .NET Patcher, NetCryptor; their handler stream is keyed per-machine at run time and absent from the file. |
| Queryable IR | disrobe query runs a queryable-IR layer over the disassembled code (functions, calls-to a target, xrefs-to a symbol, string-decoder-shaped functions, complexity-over a threshold, capability sites for network/crypto/filesystem/process), symbol-independent. disrobe capabilities runs a rule engine over the same IR and maps matched behaviors to MITRE ATT&CK techniques and Malware Behavior Catalog IDs with per-match evidence. Both read a stripped binary or a .dr envelope. |

Other runtimes

| Ecosystem | Coverage |
| --- | --- |
| Go | GoReSym + redress symbol recovery, garble name-recovery via pclntab, embedded-FS walker, pclntab eras go1.2-go1.20+ (covering go1.26), validated on a committed stripped go1.26.3 fixture. CI gates type-name resolution at a >=85% floor (528 of 528 on the fixture). garble -literals rebuilds rodata-key literals via static init-thunk emulation; heavy-mutation variants are the boundary. |
| Lua (14 families) | Bytecode for 5.1-5.4, LuaJIT 2.0/2.1, full Luau (all 82 opcodes), GLua, and SLua. Recover (full VM devirt): IronBrew2 (real 2.7.0, standard and MAX mode: opcode-permutation and xor-key reconstruction, constant-pool decode, bytecode lift, validated by a real-lua execution differential). Partial (detect + peel): Prometheus, MoonSec V1/V2/V3, AztupBrew, DarkSec, Boronide, PSU, WeAreDevs, luaobfuscator.com, SLua. MoonSec-shape recovery runs against a synthetic bootstrap pending a real sample. |
| Ruby / PHP / BEAM | Ruby: MRI/YARV 2.6-3.4 + mruby decompile via a recompile-equivalence oracle, plus JRuby, TruffleRuby AOT, and the Ruby2Exe and OCRA freezers. PHP: source + bytecode recovery, Phar decode, Zend legacy XOR decrypt; ionCube, SourceGuardian, and Zend Guard are detect-only (native-loader-resident key). BEAM: .beam and .ez chunk parse + Core Erlang lift + Elixir Dbgi quoted-AST. |
| React Native Hermes | Bytecode v60-v96. A committed hermesc-built v96 sample is CI-gated at 8 of 8 functions recovered with 100% op-coverage; older headers down to v60 parse. Parsed the 122,633-function table of a 66 MiB production bundle locally with no parse failure. Routes Hermes, Xamarin, Cordova, Capacitor, and NativeScript bundles out of .apk/.ipa containers. |
| Flutter / Swift / Obj-C / AS3 | Flutter: Dart kernel parse plus byte-exact body recovery from the kernel source table and ARM64 AOT disasm, CI-gated. Swift/Obj-C: Mach-O class-dump and SwiftConfidential/SwiftShield rename-undo, CI-gated. AS3: SWF (uncompressed, zlib, LZMA) + ABC bytecode disasm and class-skeleton render. |
| Shell / scripting (19 families) | PowerShell: Invoke-Obfuscation (token, AST, string, encoding, compress, launcher), Invoke-Stealth, PowerHell, Chameleon, psobf, ISESteroids. Bash: Bashfuscator (token, string, obfuscate, compress) and IFS/eval indirection. Batch: %random% and set-indirection. VBA/VBS/WSH: full VBA p-code decompile (264-opcode table, VBA3/5/6/7) with VBA-stomping detection. |
| Nim / Zig / Crystal / Perl / R / Tcl / Haxe | Detect + name-demangle + symbol/metadata recovery from each binary's own tables (source is compiler-erased). Tcl starkit byte-identical extract, R .rds round-trip, Perl B::Concise op-tree plus a ByteLoader bytecode decoder validated against a real perl 5.8.9 sample, Haxe cross-target detect + route. |

Containers, archives, filesystems, firmware

Detects and chains 98 container/archive/filesystem/firmware formats, all 98 with in-tree extractors that write member bytes.

| Class | Formats |
| --- | --- |
| Archives / installers | ZIP, tar, 7z, RAR4/RAR5 (stored plus RAR5 LZ "normal"), cab, .deb, .rpm, MSI, NSIS (solid and non-solid), Docker, OCI, ISO 9660 + Joliet, macOS .pkg xar, .dmg UDIF, InnoSetup (decoded setup-data stream), InstallShield (stored + zlib), Bun standalone exes, Unity AssetBundle UnityFS |
| Single-stream compression | gz, bz2, zst, lzma, lzip, lz4-frame, zlib, .Z |
| Legacy archives | ar, arj, arc, lzh, lzop, uzip, Xamarin xalz, par2, ELF appended-overlay carve |
| Embedded-linux filesystems | squashfs, cramfs, ext4, romfs, minixfs, jffs2, UBI + UBIFS, yaffs, erofs, NTFS, android-sparse, btrfs-send |
| Disk images / partitions | GPT, MBR, VHD (fixed + dynamic), VHDX, WIM (XPRESS/LZX/LZMS chunk payloads decompressed in-tree), each carved to partitions and walked through FAT12/16/32 to the stored files |
| Vendor firmware | D-Link AES, EnGenius XOR, Autel table, QNAP PC1, plus CRC-verified Netgear/Xiaomi/Tesla carves |

A recursive carve-everything engine (multi-magic scan, chunk model, depth recursion, entropy gating) drives nested extraction, with universal zip-slip and bomb guards. A few heavy codecs are carved or reported rather than fully decoded: ARJ method 4, ARC methods 5-7, EROFS microlzma and the compact index, StuffIt compressed forks (proprietary, no public spec), and OTP-AES airoha firmware (an information-theoretic wall).
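A zip-slip guard of the kind mentioned above amounts to normalizing each member name and refusing anything that would land outside the output root. The following is a minimal sketch with hypothetical names and an illustrative size budget; it is deliberately conservative and is not disrobe's implementation.

```python
import posixpath


def safe_member_path(member_name: str) -> str:
    """Return a normalized relative path, or raise if it escapes the output root."""
    # Treat backslashes as separators so Windows-style names get the same check.
    name = member_name.replace("\\", "/")
    # Reject absolute paths and drive-letter prefixes (conservative assumption).
    if name.startswith("/") or (len(name) > 1 and name[1] == ":"):
        raise ValueError(f"absolute member path rejected: {member_name!r}")
    normalized = posixpath.normpath(name)
    if normalized == ".." or normalized.startswith("../"):
        raise ValueError(f"path traversal rejected: {member_name!r}")
    return normalized


def within_budget(written: int, incoming: int, max_total: int = 1 << 30) -> bool:
    """Bomb guard: refuse a member that pushes total output past a fixed budget.

    The 1 GiB default is an illustrative value, not a disrobe setting.
    """
    return written + incoming <= max_total


print(safe_member_path("docs/../readme.txt"))
```

Checking the normalized name, rather than searching the raw string for `..`, avoids both false positives on names like `a..b` and misses on mixed separators.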

Anti-analysis defeat

disrobe is a static, deterministic analyzer that never runs the sample. It recognizes the standard anti-static-analysis arsenal and recovers what is statically recoverable.

| Technique | What disrobe does |
| --- | --- |
| Signature defeat | Identification never trusts a single magic byte. A zeroed/flipped magic, renamed UPX0/UPX1 sections, or a corrupt UPX! marker is re-identified from internal self-consistency (PE through e_lfanew, ELF/Mach-O by header offsets that close against the file, ZIP by its end-of-central-directory anchor, DEX by section-offset consistency, classfile by a constant-pool walk, wasm by the LEB section stream). A real UPX executable with a flipped MZ and renamed sections still unpacks byte-identically. |
| String / data encryption | Recovers single-byte XOR stack strings with English-likeness key detection, per-family keyed strings (Mirai/Dridex/Trickbot), JVM string-encryption by emulating the in-class decrypt method, .NET constant decryption (ConfuserEx2, on a real sample), JS string-array rotation, and Python exec/eval/compile payloads through base64/85/16/32/zlib decode chains. Runtime-keyed schemes are flagged as walls. |
| Control-flow obfuscation | Deflattens BlackObfuscator DEX flattening, reverses OLLVM-style control-flow flattening, bogus control flow, and instruction substitution on native, drops obfuscator-planted out-of-range exception entries that poison the JVM CFG, and reconstructs JS control-flow from flattened dispatchers. |
| Anti-disassembly / MBA | The JVM, Dalvik, and CIL decoders tolerate broken StackMapTable, fake exception ranges, and illegal-but-verifiable bytecode. On native, jump-into-the-middle desync, overlapping instructions, and opaque predicates are resolved in-tree, and a mixed-boolean-arithmetic simplifier (also wired through the JS and WebAssembly decoders) collapses MBA expressions back to their algebraic form. |
| Bytecode virtualization | Lua: devirtualizes real IronBrew2 2.7.0 in standard and MAX mode, graded by a real-lua execution differential. .NET: lifts the Eazfuscator and KoiVM VM tiers back to CIL on real samples. Native: disrobe native devirt locates the interpreter, fingerprints each handler's micro-op behaviorally through the in-tree x86 emulator, and lifts to a re-executable IR plus pseudo-code, validated end-to-end on a self-authored Tigress-shape VM. VMProtect/Themida/Enigma front-ends are extended per published RE write-ups, not a running commercial sample, and a per-machine-keyed handler stream is the one residual. |
| Overlay inflation | The PE overlay carve computes the true end of the executable image and isolates any trailing archive (gzip/xz/zstd/bzip2/tar/7z/cab/rar) into its own segment, so padding cannot mask an appended payload. |
| Symbol stripping | Restores ProGuard/R8 names from mapping.txt (overload-correct), recovers Go type and stdlib names from pclntab/moduledata on stripped binaries, demangles Rust/C++/Swift/Itanium, and recovers structure from DWARF. garble name-hashing (HMAC-SHA256 over an absent build seed) is a wall, but structure, types, and control flow recover regardless. |
| Runtime-keyed protection | PyArmor v6-v9 static decryption succeeds when the pyarmor_runtime is supplied; with no runtime, the verdict routes to the dynamic-capture path rather than emitting fabricated plaintext. |
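Single-byte XOR recovery with an English-likeness score, as named in the string-encryption row, can be sketched as a brute force over all 256 keys. The scoring function below is a deliberately crude assumption (share of common letters, zero for any non-printable byte); disrobe's actual detector is not described in this README.

```python
import string

PRINTABLE = set(string.printable.encode())
COMMON = set(b"etaoinshrdlu ")  # frequent English letters plus space


def english_score(data: bytes) -> float:
    """Crude English-likeness: 0 if any byte is non-printable, else common-letter share."""
    if not data or any(b not in PRINTABLE for b in data):
        return 0.0
    return sum(b in COMMON for b in data) / len(data)


def recover_single_byte_xor(blob: bytes):
    """Try every key and keep the one whose plaintext looks most like English."""
    best = max(range(256), key=lambda k: english_score(bytes(b ^ k for b in blob)))
    return best, bytes(b ^ best for b in blob)


SECRET = b"connect to http://example.invalid/gate.php"
blob = bytes(b ^ 0x5A for b in SECRET)  # what an XOR-encoded stack string holds

key, plain = recover_single_byte_xor(blob)
print(hex(key), plain)
```

Short or non-English strings can defeat a score this simple, which is one reason a real tool would pair it with stronger evidence such as the decoder loop found in the code.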

Install

Install a prebuilt binary from the Releases tab, or build from source.

Prebuilt binaries (recommended)

Download from the Releases page. Windows, Linux (glibc + musl), and macOS, each for x86-64 and ARM64, with SHA256SUMS and a cosign signature bundle per archive. Verify, extract, and place disrobe (disrobe.exe on Windows) on your PATH.

sha256sum -c SHA256SUMS
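Where `sha256sum` is not available (for example on a stock Windows host), the same check can be done with the Python standard library. This sketch assumes the conventional `SHA256SUMS` layout of `<hex digest>  <file name>` per line; `verify_sums` is a hypothetical helper, and it does not cover the cosign signature.

```python
import hashlib
from pathlib import Path


def verify_sums(sums_file: Path) -> dict:
    """Check each '<digest>  <name>' line against the file next to SHA256SUMS."""
    results = {}
    for line in sums_file.read_text().splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # a leading '*' marks binary mode in sha256sum output
        target = sums_file.parent / name
        if not target.is_file():
            results[name] = "missing"
            continue
        actual = hashlib.sha256(target.read_bytes()).hexdigest()
        results[name] = "ok" if actual == digest.lower() else "mismatch"
    return results


if __name__ == "__main__":
    sums = Path("SHA256SUMS")
    if sums.is_file():
        for name, status in verify_sums(sums).items():
            print(f"{name}: {status}")
```

Unlike `sha256sum -c`, this reports archives that were not downloaded as `missing` instead of failing, which suits the case where only one platform's archive is present.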

Build from source

Requires Rust 1.95+ stable. That is the only build dependency.

git clone https://github.com/1-3-7/disrobe
cd disrobe
cargo build --release
./target/release/disrobe doctor   # optional: probe ~50 external tools

A release build takes about 4-6 minutes on commodity hardware.

Quick start

disrobe auto suspect.exe --out recovered/            # auto-detect + chain the whole pipeline
disrobe py decompile module.pyc --out recovered/
disrobe pyinstaller extract onefile.exe --out out/
disrobe pyarmor unpack protected.py --out out/       # add --allow-dynamic only on trusted samples
disrobe js deob bundle.min.js --out clean.js
disrobe js unbundle app.bundle.js --out src/
disrobe wasm decompile module.wasm --target rust --out lifted.rs
disrobe jvm decompile app.apk --out src/             # in-house Dalvik decompiler is the default
disrobe dotnet decompile App.dll --out src/          # in-house CIL decompiler is the default
disrobe native unpack packed.exe --out unpacked.bin
disrobe query packed.exe "calls-to recv"            # queryable IR over stripped code
disrobe go recover app --out symbols.json
disrobe lua decompile script.luac --out script.lua
disrobe hermes decompile index.android.bundle --out surface/

disrobe auto fingerprints the input and chains the full pipeline in one call (PE -> UPX -> rust-demangle, APK -> dex -> Java, PyInstaller -> PyArmor -> .pyc decompile). With --capture-stages, stage outputs land in out/01-*/, out/02-*/, ..., out/final/. Run disrobe --help for the full surface, disrobe <pass> --help for any subcommand, disrobe passes to list passes, disrobe catalog to list every supported family with its level, and disrobe install --list for optional external tools.
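The fingerprint-then-peel loop behind chain auto-detection can be shown with stand-in layers. In this sketch gzip and zlib play the role of packer layers because the Python standard library can peel them; the real chains (UPX, PyInstaller, PyArmor, DEX) and disrobe's multi-signal identification are not modeled, and all names here are hypothetical.

```python
import gzip
import zlib


def fingerprint(data: bytes) -> str:
    """Classify the outermost layer from its header bytes."""
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    # zlib header: deflate method in the low nibble, 16-bit header divisible by 31.
    if len(data) >= 2 and data[0] & 0x0F == 8 and int.from_bytes(data[:2], "big") % 31 == 0:
        return "zlib"
    if data[:4] == b"\x00asm":
        return "wasm"
    if data[:2] == b"MZ":
        return "pe"
    return "unknown"


# Layers this toy knows how to remove; anything else ends the chain.
PEELERS = {"gzip": gzip.decompress, "zlib": zlib.decompress}


def auto_chain(data: bytes, max_depth: int = 16):
    """Fingerprint, peel, and repeat until no peeler applies."""
    stages = []
    for _ in range(max_depth):
        kind = fingerprint(data)
        stages.append(kind)
        peel = PEELERS.get(kind)
        if peel is None:
            break
        data = peel(data)
    return stages, data


payload = b"\x00asm\x01\x00\x00\x00"  # a bare WebAssembly header
wrapped = gzip.compress(zlib.compress(payload))
stages, out = auto_chain(wrapped)
print(" -> ".join(stages))
```

The depth cap keeps a self-referential or maliciously nested input from looping forever, the same concern the container engine's bomb guards address.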

disrobe chain auto-detection: PyInstaller, PyArmor, UPX, APK, and DEX unpacking pipelines

Query stripped code

disrobe query is a queryable-IR layer over the in-tree disassembler, so a reverse-engineer can ask structural questions of a stripped binary or a cached .dr envelope without symbols. It is the same IR the --llm sidecar and disrobe capabilities read.

disrobe query sample.bin functions                  # every discovered function + complexity
disrobe query sample.bin "calls-to recv"            # callers of a target
disrobe query sample.bin "xrefs-to aes_key"         # references to a symbol or string
disrobe query sample.bin string-decoders            # decoder-shaped functions (loop + xor/add)
disrobe query sample.bin "complexity-over 40"       # functions above a cyclomatic threshold
disrobe query sample.bin "capability network"       # network / crypto / filesystem / process sites

Every verb emits JSON with --json for scripting. disrobe capabilities sample.bin runs the rule engine over the same IR and tags each match with its MITRE ATT&CK technique and Malware Behavior Catalog ID plus the instruction evidence.
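For readers unfamiliar with the threshold used by complexity-over, cyclomatic complexity of a single function is M = E - N + 2 over its control-flow graph. The sketch below computes it from a plain adjacency map; the graph encoding and function names are assumptions for illustration, not disrobe's IR or its JSON output.

```python
def cyclomatic_complexity(cfg: dict) -> int:
    """M = E - N + 2P for one function's CFG given as {block: [successor blocks]}."""
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(nodes) + 2  # P = 1: one connected function


def complexity_over(functions: dict, threshold: int) -> list:
    """Names of functions whose complexity exceeds the threshold, sorted."""
    return sorted(
        name for name, cfg in functions.items()
        if cyclomatic_complexity(cfg) > threshold
    )


# A single loop: entry -> head; head -> body or exit; body -> head.
LOOP = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}
STRAIGHT = {"a": ["b"], "b": []}

print(cyclomatic_complexity(LOOP))
print(complexity_over({"decode": LOOP, "init": STRAIGHT}, 1))
```

Because the measure depends only on the graph shape, it works on stripped code where no symbol names survive.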

Recon with frisk

disrobe frisk is disrobe's built-in recon engine. Point it at any file, directory, APK, or disrobe-recovered source tree and it surfaces leaked secrets (cloud keys, SaaS/AI tokens, private keys), API endpoints and routes, cloud-storage buckets, Android manifest exposure (deep-link schemes and hosts, exported components, content-provider authorities, dangerous permissions), and IOCs (URLs, domains, IPs, emails, .onion, webhooks), each with its file, line, and column. Because disrobe recovers the real source first, frisk searches truth, not a shell grep, and it is fully encoding-safe.

disrobe frisk app/                                  # walk a directory or recovered source tree
disrobe frisk app.apk                               # APK manifest exposure + secrets + IOCs
disrobe frisk recovered/ --format json              # text, json, or sarif
disrobe frisk recovered/ --format sarif > frisk.sarif
disrobe frisk app/ --pattern rules.txt              # custom rule pack: name=regex per line
disrobe frisk app/ --suppress example.com           # drop findings whose value contains a substring
disrobe frisk app/ --emit-baseline > baseline.json  # snapshot current findings
disrobe frisk app/ --baseline baseline.json         # report only new findings
disrobe frisk app/ --entropy                        # include high-entropy generic-secret findings
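The rule-pack and baseline options above can be pictured with a small scanner. This sketch assumes only what the comments state (one `name=regex` rule per line, findings located by file, line, and column); the 1-based column convention, the tuple layout, and the sample rule names are assumptions, not frisk's real output schema.

```python
import re


def load_rules(text: str) -> dict:
    """Parse 'name=regex' lines; split on the first '=' so the regex may contain '='."""
    rules = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        name, _, pattern = line.partition("=")
        rules[name.strip()] = re.compile(pattern)
    return rules


def scan(path: str, text: str, rules: dict) -> list:
    """Return (path, line, column, rule, value) for every match, 1-based."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, rx in rules.items():
            for m in rx.finditer(line):
                findings.append((path, lineno, m.start() + 1, name, m.group(0)))
    return findings


def new_only(findings: list, baseline: list) -> list:
    """Baseline mode: keep findings that were not in the earlier snapshot."""
    known = set(baseline)
    return [f for f in findings if f not in known]


RULES = r"""internal-token=tok_[0-9a-f]{8}
debug-on=debug\s*=\s*True
"""
SOURCE = 'debug = False\nkey = "tok_deadbeef"\n'

findings = scan("config.py", SOURCE, load_rules(RULES))
print(findings)
```

Running the scanner over recovered source rather than the packed artifact is what makes the line and column meaningful to a reader.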

Recovery

Every number below is produced by a committed test gate or a local measurement harness, never by the tool grading its own output. Sources are cited per value in xtask/data/recovery.json. A value is tagged [CI] when a committed gate reproduces it from this repo, or [local] when it is measured against a sample that license or size keeps out of the tree.

Measured recovery by ecosystem

  • Python [CI]: a 90% per-code-object recompile-equivalence floor on a pinned 200-module CPython 3.14 stdlib corpus (6286 code objects); measured 92.76% (5831 of 6286) on CPython 3.14.5. The legacy 1.0-3.7 band asserts a floor of 152 of 191 verified (166 of 191 local with the full interpreter zoo).
  • JVM classfile [CI]: recompiles at least 93.1% of EdgeCases methods error-free under real javac (floor 122 of 131; 128 of 131 measured on JDK 25).
  • Dalvik [CI]: the real JVM verifier (-Xverify:all) passes 99% of verifiable recovered classes on the committed dex corpus (102 of 103, 0 lifter verify failures); 307 re-hosted bodies verify clean. On gitignored real FOSS apks the lifter self-reports a body for 89-92.5% of methods [local].
  • Ruby YARV [CI]: 100% opcode-equivalence on greeter, 85% on megafile, via the execution-differentials job with a provisioned Ruby.
  • WebAssembly [CI]: all 94 functions across the 30 parseable corpus modules are fully op-covered; 24 of 24 numeric-ABI functions are execution-equivalent under wasmtime (1 also byte-identical in memory), the rest op-coverage-only.
  • Go [CI]: resolves a name for >=85% of the type descriptors extracted from the committed stripped go1.26.3 fixture (528 of 528 on that fixture).
  • Hermes [CI]: a committed hermesc-built v96 sample recovers 8 of 8 functions at 100% op-coverage. Separately [local]: parsed the 122,633-function table of a 66 MiB production bundle with no parse failure.

Breadth ([CI] unit-tested): containers detect 98 formats and write member bytes in-tree for all 98; the native packer pass covers 27 families; the .NET pass detects 20 protectors and reverses ConfuserEx2 constant decryption, the Eazfuscator VM tier at 57 of 57 instructions, and the KoiVM VM tier on a real-KoiVM-tool sample (6 of 6 bodies lifted to CIL, graded against the clean baseline); PyArmor recovers 72 of 72 samples [local]; the Lua devirtualizer reverses real IronBrew2 2.7.0 in standard and MAX mode, graded by a real-lua execution differential [CI]. Regenerate the chart with cargo run -p xtask -- graphs.
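The relationship between a measured count, its reported percentage, and a CI floor can be checked by hand. The sketch below reproduces figures quoted in this section using exact integer arithmetic and truncation, so a score is never rounded upward; the helper names are hypothetical and this is not the code in xtask.

```python
import math
from fractions import Fraction


def percent_floor(passed: int, total: int, places: int = 2) -> float:
    """Percentage truncated (not rounded) to the given number of decimal places."""
    scale = 10 ** places
    return math.floor(Fraction(passed * 100 * scale, total)) / scale


def meets_floor(passed: int, total: int, floor_percent: int) -> bool:
    """Gate on exact integers so float error can never flip the verdict."""
    return passed * 100 >= floor_percent * total


# Python: 5831 of 6286 code objects against a 90% floor.
print(percent_floor(5831, 6286), meets_floor(5831, 6286, 90))
# JVM classfile: the floor of 122 of 131 methods, to one decimal place.
print(percent_floor(122, 131, 1))
# Go: 528 of 528 type names against an 85% floor.
print(meets_floor(528, 528, 85))
```

Truncating 5831/6286 gives 92.76, and 122/131 gives 93.1, matching the values stated above.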

Benchmarks

Reproducible measurement harnesses under benches/. Every number comes from a real measurement (a byte compare, disrobe's own disassembler, or the CI-gated recovery.json), never the tool grading itself.

  • Native unpack: per-packer recovery with no external tool. Recovered .text byte-identity where a disk-aligned reference exists (UPX .text byte-identical), Shannon entropy packed vs unpacked, and disassembler-resolvable intra-code calls before vs after (PECompact goes from 0 to 261). Overlay and flat-dump packers report n/a for byte-identity rather than an inflated figure.
  • Ghidra cleaner input: the packed and rebuilt PEs are measured with disrobe's own iced-x86 decoder, no external tool. A PECompact .text goes from near-random entropy (~8.0, 0 resolvable intra-code calls) to code-like (6.52, 261 resolvable calls). A packed MEW sample exposes no executable section, while the rebuilt image decodes to tens of thousands of instructions. An optional headless Ghidra cross-check script is committed alongside.
  • Decompile quality: the per-ecosystem recompile-equivalence and verifier rates, read verbatim from the CI-gated recovery.json.
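Two of the measurements named above, Shannon entropy and byte-identity, are simple enough to sketch in a few lines. The function names and the choice to penalize length mismatches are assumptions for illustration; the harnesses under benches/ may compute them differently.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: near 8.0 for packed or encrypted data, lower for code."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def byte_identity(recovered: bytes, reference: bytes) -> float:
    """Fraction of offsets where the recovered bytes equal the reference bytes."""
    if not recovered and not reference:
        return 1.0
    same = sum(a == b for a, b in zip(recovered, reference))
    # Dividing by the longer length counts missing or extra bytes as mismatches.
    return same / max(len(recovered), len(reference))


print(shannon_entropy(bytes(range(256))))   # every byte value once: maximal entropy
print(shannon_entropy(b"\x00" * 64))        # a single repeated value: no entropy
print(byte_identity(b"abcd", b"abcx"))      # three of four offsets agree
```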

Comparison

| Ecosystem | Leading tools | Where disrobe differs |
| --- | --- | --- |
| Python | pycdc, pylingual, uncompyle6, decompyle3, pychd | Spans 3.6-3.15 in one engine, correctness checked by recompiling the recovered source and diffing opcodes; deterministic, no LLM, no benchmark contamination. uncompyle6 stops at 3.8, decompyle3 at ~3.9. pychd reaches 3.0-3.14 but is LLM-assisted and self-flagged "likely contaminated", with pass bodies in deterministic mode; disrobe has no model to contaminate and reconstructs the actual bodies. |
| JVM / Android | CFR, Vineflower, Procyon, jadx | In-house Rust decompiler is the default (those become optional --backends). 97.7% method recompile on the construct megafixture and 99.9% on a real 131 MB Shopify APK (62,953 classes, 380,525 methods); adds chain auto-detect, .apkm/.xapk/.aab routing, .dr envelopes, and APK sig verify in one binary. |
| .NET / CIL | ILSpy, dnSpy, de4dot | In-house CIL to C#/F#/VB plus actively maintained in-house obfuscator reversal (de4dot has been unmaintained since 2020); deterministic .dr output. |
| Native | Ghidra, IDA, Binary Ninja | Not a competitor on raw decompilation; the unpack + symbol-recovery + chain-detect layer that feeds them cleaner input. |
| Native packers | per-packer scripts; unipacker; UPX only unpacks UPX | General-purpose unpacker for the tier in one static Rust binary (no Python or unicorn), with an in-house x86 stub emulator and per-fixture byte-recovery scores against real committed originals. |
| JS | webcrack, synchrony, REstringer | Full obfuscator.io + JS-Confuser + Jscrambler + 11 bundlers with source-map reconstruction, behind a deterministic codegen. |
| WASM | wasm-decompile, wasm2c, wasm-tools | The only one lifting to typed Rust/TS/WAT/C with DWARF recovery, proven by a wasmtime execution differential (24 of 24 numeric-ABI functions equivalent). |

For Python, the citable difference is version span, not a head-to-head accuracy number (disrobe does not run competing tools, so any such figure would be invented). The chart below plots each tool's claimed CPython support; the deterministic-versus-LLM distinction is annotated, not scored.

Python decompiler version coverage: disrobe vs uncompyle6, decompyle3, pycdc, pylingual

The full comparison covers every ecosystem and is candid about where a mature dedicated tool is the better choice: FFDec for deep Flash work, Ghidra or IDA for native decompilation, and the AOT-compiled languages where no source body survives.

Use it as a library

disrobe is built to be embedded, not just run from a shell. The CLI is a thin layer over the same crates, so a TUI, an IDE plugin, a web service, or a batch engine can drive the full pass set directly.

  • Rust. Every pass is its own crate (disrobe-pass-py-decompile, disrobe-pass-jvm, disrobe-pass-native, disrobe-pass-dotnet, ...) over the shared disrobe-core and disrobe-ir types; depend on the ones you need. The chain runner in disrobe-binfmt handles detection and recursive unpacking.
  • Python. import disrobe (a pyo3 abi3 module for Python 3.9+, shipping a full .pyi and py.typed, built with maturin from bindings/python): bytes in, concrete typed report objects out, deterministic, and the bindings never touch the filesystem so the caller owns all I/O. The surface spans auto, typed entry points for every major ecosystem, a generic disasm/parse/compile/decompile dispatch, a mutable CodeObject you load from a .dr envelope, edit, and re-serialize, and a register_pass/register_consumer registry for your own stages.
  • Daemon. disrobe serve speaks HTTP, gRPC, and LSP, taking base64 bytes and returning structured JSON, so any language can drive it over a socket. disrobe serve --mcp exposes the same operations as Model Context Protocol tools for AI agents.
use disrobe_ir::payload::DisasmPayload;
use disrobe_pass_native::build_disasm_payload;
use disrobe_pass_py_decompile::{decompile_pyc, NativeDecompile};
use disrobe_query::{Module, QueryResult, run_expr};

fn recover(pyc: &[u8], native: &[u8]) -> anyhow::Result<()> {
    let decompiled: NativeDecompile = decompile_pyc(pyc)?;
    let source: &str = decompiled.source.as_str();
    println!("recovered {} bytes of python source", source.len());

    let payload: DisasmPayload = build_disasm_payload(native)?;
    let module: Module = Module::from_disasm(&payload);
    let result: QueryResult = run_expr(&module, "calls-to recv")?;
    println!("{} callers of recv", result.count());
    Ok(())
}

from typing import Optional

import disrobe
from disrobe import Capabilities, CanonicalSource, ChainReport, CodeObject, Instruction, Symbol

with open("sample.bin", "rb") as f:
    chain: ChainReport = disrobe.auto(f.read())
print(chain.spec, chain.pass_count, chain.terminated)

with open("module.pyc", "rb") as f:
    recovered: CanonicalSource = disrobe.decompile("python-bytecode", f.read())
# Optional[str] rather than `str | None`: module-level annotations are evaluated
# at run time, and the union operator on types needs Python 3.10+.
source: Optional[str] = recovered.source

with open("packed.exe", "rb") as f:
    caps: Capabilities = disrobe.capabilities(f.read())
print(caps.format, caps.match_count)

with open("module.dr", "rb") as f:
    obj: CodeObject = CodeObject.from_dr(f.read())
obj.add_symbol(Symbol(0x401000, "decrypt_config"))
obj.add_instruction(Instruction(0x401000, "xor", ["eax", "eax"]))
patched_dr: bytes = obj.to_dr()

Full function surface and conventions: the Python-bindings reference.

The five-rung IR ladder

Every artifact climbs the same intermediate-representation ladder. This is what lets passes from different ecosystems compose through a shared .dr envelope.

   Raw  -->  Disasm  -->  MIR  -->  HIR  -->  Surface
   bytes     opcodes      mid       high      source

Unpacking and decryption passes operate at Raw; byte-exact recovery lives here. Disassembly produces Disasm. Decompilers do their structural work at MIR and HIR, then render Surface. For Python, the MIR pre-pass reconstructs nested constructs from the 3.11+ exception table before the instruction walk, and the Surface output is recompiled and verified opcode-for-opcode. See the architecture docs for the full model.
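The ladder is an ordered sequence, which a short sketch can make concrete. The `Rung` enum and `climb` helper below are hypothetical names chosen for illustration; they are not the types exported by disrobe-ir.

```python
from enum import IntEnum


class Rung(IntEnum):
    """The five rungs, ordered from least to most abstract (illustrative only)."""
    RAW = 0      # bytes: unpacking and decryption happen here
    DISASM = 1   # opcodes
    MIR = 2      # mid-level structure
    HIR = 3      # high-level structure
    SURFACE = 4  # rendered source


def climb(start: Rung, goal: Rung) -> list:
    """List every rung an artifact passes through from start to goal, inclusive."""
    if goal < start:
        raise ValueError("the ladder is only climbed upward")
    return [Rung(level) for level in range(start, goal + 1)]


print([rung.name for rung in climb(Rung.RAW, Rung.SURFACE)])
print([rung.name for rung in climb(Rung.DISASM, Rung.HIR)])
```

Modeling the rungs as an ordered type is one way to express that a pass declares the rung it consumes and the rung it produces, so passes from different ecosystems can be chained only when those rungs line up.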

The .dr envelope

Every recovered artifact is content-addressed and persisted as a .dr envelope: an rkyv-encoded payload, a postcard metadata sidecar, and a BLAKE3 root over both. Identical input produces a byte-identical envelope, so a cache hit and a fresh run are indistinguishable, chains compose offline, and a result can be diffed, signed, or replayed deterministically. The Python CodeObject.from_dr / to_dr round-trip and the disrobe query reader both operate on the same envelope.
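The content-addressing idea, one root hash over the payload and the metadata, can be sketched as follows. This is an assumption-laden illustration, not the real .dr layout: BLAKE2b from the standard library stands in for BLAKE3 (which is not in `hashlib`), and the length-prefixed framing and the `envelope_root` name are invented for the example.

```python
import hashlib


def envelope_root(payload: bytes, metadata: bytes) -> str:
    """Hash payload and metadata together into one deterministic root digest."""
    hasher = hashlib.blake2b(digest_size=32)  # stand-in for BLAKE3
    for part in (payload, metadata):
        # Length-prefix each part so the boundary between them is unambiguous:
        # (b"ab", b"c") and (b"a", b"bc") must not share a root.
        hasher.update(len(part).to_bytes(8, "little"))
        hasher.update(part)
    return hasher.hexdigest()


first = envelope_root(b"payload-bytes", b"metadata-bytes")
second = envelope_root(b"payload-bytes", b"metadata-bytes")
print(first == second)  # identical input yields an identical root
print(envelope_root(b"ab", b"c") == envelope_root(b"a", b"bc"))
```

Because the root depends only on the bytes, it can double as a cache key: a cached envelope and a freshly computed one are interchangeable whenever their roots match.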

Common flags

| Flag | Effect |
| --- | --- |
| --json / --ndjson / --sarif | Structured output (SARIF 2.1.0 for GitHub code scanning) |
| --llm | Emit the structured metadata sidecar (18 categories, 4 packs) for LLM consumers |
| --backend <tool> | Select an optional external decompiler instead of the in-house default |
| --dry-run | Report what would happen, write nothing |
| --no-cache | Bypass the .dr envelope cache (output is identical either way) |
| --seed <N> | RNG seed for any non-deterministic backend |
| --i-have-authorization | Gate flag for grey-zone commercial protectors and the decryption-keys metadata |
| DISROBE_DEBUG=<area> | Env var that streams every offset, size, candidate, and classification a pass walked to stderr (nuitka, all, or a comma-list); DISROBE_DEBUG_FORMAT=json for one JSON object per event, color auto-detected on a TTY. Secret-shaped strings are auto-redacted |
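For scripting against `--sarif` output, a consumer only needs the standard SARIF 2.1.0 shape: a top-level `runs` array whose entries each carry a `results` array. The sketch below summarizes such a document; the sample rule IDs and messages are made up, and nothing here reflects the specific rules disrobe emits.

```python
import json
from collections import Counter


def summarize_sarif(text: str) -> Counter:
    """Count results per (ruleId, level) across every run in a SARIF document."""
    document = json.loads(text)
    counts = Counter()
    for run in document.get("runs", []):
        for result in run.get("results", []):
            rule = result.get("ruleId", "<no-rule>")
            # SARIF treats a missing level as "warning".
            level = result.get("level", "warning")
            counts[(rule, level)] += 1
    return counts


# Hypothetical sample; the rule IDs are placeholders, not disrobe rule names.
SAMPLE = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-scanner"}},
        "results": [
            {"ruleId": "packed-artifact", "level": "error", "message": {"text": "a"}},
            {"ruleId": "packed-artifact", "level": "error", "message": {"text": "b"}},
            {"ruleId": "leaked-secret", "message": {"text": "c"}},
        ],
    }],
})

for (rule, level), count in sorted(summarize_sarif(SAMPLE).items()):
    print(rule, level, count)
```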

Safety posture

By default disrobe does not execute the sample; every default path is pure static analysis. The pickle suite is symbolic and never unpickles. The only code-execution paths, the PyArmor v6/v7 dynamic hook and the BCC native lift, sit behind explicit --allow-dynamic and --allow-bcc flags with a watchdog; run those inside a sandbox. The parsing surface is hardened against malformed input. See Forensics and malware-safety posture and the full threat model.
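To illustrate what inspecting a pickle without unpickling means, the sketch below walks the opcode stream with the standard library's `pickletools.genops`, which parses a pickle but never imports or calls anything. It is a far simpler check than a symbolic analysis and is not disrobe's pickle suite; the `RISKY_OPCODES` set and the `risky_opcodes` helper are assumptions for illustration.

```python
import pickle
import pickletools

# Opcodes that can import or call arbitrary objects when a pickle is loaded.
RISKY_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
    "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def risky_opcodes(data: bytes) -> list:
    """Return (offset, opcode name) pairs for risky opcodes, without loading the pickle."""
    found = []
    for opcode, _arg, position in pickletools.genops(data):
        if opcode.name in RISKY_OPCODES:
            found.append((position, opcode.name))
    return found


class Demo:
    """Serializes as a call to print(); pickling it runs nothing."""
    def __reduce__(self):
        return (print, ("hello",))


benign = pickle.dumps([1, 2, 3])
suspicious = pickle.dumps(Demo())

print(risky_opcodes(benign))
print([name for _offset, name in risky_opcodes(suspicious)])
```

A plain list serializes with no import or call opcodes, while an object with a custom `__reduce__` produces a global lookup followed by `REDUCE`, which is the pattern a scanner looks for.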

Limits

Recovery is bounded by what the compiler left behind, and disrobe reports those bounds rather than rounding them away.

  • Bytecode-to-source is structurally faithful but never byte-identical: .class, .dex, and CIL erase local names, generics, comments, and exact formatting.
  • Nuitka, Nim, Zig, and Crystal lower to native code, so recovered bodies are skeleton-to-partial (Nuitka) or demangling-and-symbols-only (the rest); a release ARM64 AOT Flutter .so erases bodies while a Dart kernel snapshot yields byte-exact recovery.
  • Native VM devirtualization for VMProtect, Themida, Enigma, and comparable commercial virtualizing protectors is out of scope: those are detect + carve. The generic disrobe native devirt lifter is validated on a self-authored Tigress-shape VM; a handler stream assembled at run time from a per-machine key is the information-theoretic wall.
  • Runtime-key families need the key: PyArmor v3-v5, ionCube, SourceGuardian, Zend Guard modern, ILProtector, and MaxToCode derive their key in a native loader or live process that is absent from the artifact. Where the key, the source body, or the payload was never written into the file, disrobe says so and points at what would recover it.

Documentation

The full docs site lives at 1-3-7.github.io/disrobe: architecture, the IR ladder, the chain runner, per-language guides, the Python-bindings reference, the complete CLI reference, and the safety posture. The book source is under docs/; release history is in CHANGELOG.md.

Integrations

  • GitHub Action: a composite action (uses: 1-3-7/disrobe@v0.10.0) that downloads the release binary, scans a path or glob, and uploads SARIF to code scanning.
  • pre-commit hook: a pre-commit.com hook that scans staged files and blocks a commit when a packed or obfuscated artifact is detected.
  • MCP server: a Model Context Protocol server (disrobe serve --mcp) exposing detect/decompile/IOC/behavior/strings as tools, bytes in and structured JSON out, for AI agents.
  • Editor plugins: generated scaffolds for VS Code, IDA Pro, and Ghidra under editors/. Install with bash editors/install.sh <vscode|ida|ghidra> (Linux/macOS) or .\editors\install.ps1 <vscode|ida|ghidra> (Windows).

FAQ

What is disrobe? A universal decompiler, deobfuscator, and unpacker in a single Rust binary. It recovers source and behavior from compiled, frozen, packed, and obfuscated software across 20+ ecosystems, deterministically and offline.

How do I unpack PyArmor? disrobe pyarmor unpack protected.py --out out/. The in-house unpacker handles v6 through v9-pro and recovered 72 of 72 samples in a local corpus. Add --allow-dynamic only on trusted samples, inside a sandbox.

How do I extract a PyInstaller exe? disrobe pyinstaller extract onefile.exe --out out/. The extractor covers 2.x through 6.20+, carves the embedded .pyc modules, then disrobe py decompile recovers source from each. Nuitka, cx_Freeze, py2exe, PyOxidizer, shiv, pex, and Briefcase are also supported.

How do I decompile a .pyc file? disrobe py decompile module.pyc --out recovered/. The in-house decompiler spans CPython 3.6 through 3.15 with recompile-verified output, plus a legacy 1.0-3.7 band, fully deterministic with no LLM.

How do I deobfuscate JavaScript? disrobe js deob bundle.min.js --out clean.js reverses the obfuscator.io pipeline, JS-Confuser, Jscrambler, and esoteric encoders; disrobe js unbundle un-webpacks 11 bundlers with scope-aware renaming and source-map reconstruction.

How do I decompile WebAssembly? disrobe wasm decompile module.wasm --target rust --out lifted.rs lifts to typed Rust, TypeScript, WAT, or C with DWARF recovery, covering GC, the component model, threads, SIMD, tail-call, and memory64.

How do I decompile a .NET DLL? disrobe dotnet decompile App.dll --out src/. The in-house decompiler lifts CIL to C#/F#/VB and detects 20 protectors; it reverses ConfuserEx2 constant decryption on a real sample, devirtualizes the Eazfuscator VM tier against an in-repo EazVM virtualizer of our own, and devirtualizes the KoiVM VM tier on a sample produced by the real KoiVM tool. ILSpy, dnSpy, and de4dot are optional --backend fallbacks.

How do I decompile an APK or DEX file? disrobe jvm decompile app.apk --out src/. Recovered bodies are validated by the real JVM verifier: 99% of verifiable classes pass on the committed dex corpus (102 of 103). It replays ProGuard/R8 mappings, verifies APK signatures (v1-v4), and peels Zelix, Allatori, Stringer, DashO, and DexGuard. The same engine decompiles plain .class bytecode (classfile 1.0.2 through 25).

How do I unpack UPX? disrobe native unpack packed.exe --out unpacked.bin. The NRV2B/D/E and LZMA decoders recover the executable code byte-identical: on the committed hello fixture .text and .pdata are bit-exact and the whole loaded image is ~96%, with the residual confined to loader-rebuilt relocations and IAT. disrobe also unpacks kkrunchy, FSG, MEW, NSPack, Petite, MPRESS, ASPack, PECompact, and Yoda's Crypter, and detects-and-carves the virtualized tier (VMProtect, Themida, Enigma).

How do I query a stripped binary? disrobe query sample.bin "calls-to recv" (or functions, xrefs-to, string-decoders, complexity-over N, capability network). It runs a queryable-IR layer over the in-tree disassembler with no symbols required, and disrobe capabilities maps matched behavior to MITRE ATT&CK and Malware Behavior Catalog IDs.

How do I hunt for secrets and IOCs? disrobe frisk app.apk (or a directory, or a recovered source tree). It reports leaked secrets, API endpoints, cloud buckets, Android manifest exposure, and IOCs with file, line, and column, in text, --format json, or --format sarif. Custom rules come from --pattern rules.txt; --emit-baseline then --baseline reports only new findings. Because disrobe recovers the source first, frisk scans the real code, not minified or packed bytes.
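The baseline workflow amounts to a set difference over finding fingerprints. A minimal sketch, assuming each finding is a JSON-like dict with `rule`, `file`, `line`, and `column` keys; those key names and the `new_findings` helper are hypothetical and do not describe frisk's actual output schema.

```python
def fingerprint(finding: dict) -> tuple:
    """Identify a finding by rule and location (hypothetical field names)."""
    return (finding["rule"], finding["file"], finding["line"], finding["column"])


def new_findings(current: list, baseline: list) -> list:
    """Keep only findings whose fingerprint does not appear in the baseline."""
    known = {fingerprint(finding) for finding in baseline}
    return [finding for finding in current if fingerprint(finding) not in known]


baseline = [
    {"rule": "aws-key", "file": "a.py", "line": 3, "column": 9},
]
current = [
    {"rule": "aws-key", "file": "a.py", "line": 3, "column": 9},
    {"rule": "api-endpoint", "file": "b.py", "line": 12, "column": 1},
]

for finding in new_findings(current, baseline):
    print(finding["rule"], finding["file"], finding["line"])
```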

Does disrobe run an LLM or upload my files? No. Every pass is deterministic and fully offline with zero telemetry. The browser playground runs the same passes compiled to WebAssembly, entirely client-side; nothing is uploaded.

Is disrobe a good CTF tool? Yes. One static binary auto-detects and chains the whole pipeline, covers Python, JS, WASM, .NET, JVM, native packers, Lua, Go, and Ruby, and emits JSON/SARIF for scripting, with no JVM, Python, or Docker runtime to install.

Legal

Decompilation for security research, interoperability, and recovery of your own source is permitted in most jurisdictions (17 U.S.C. § 1201(f), Directive 2009/24/EC Art. 6, CDPA 1988 ss. 50B-50BA, and equivalents in CA/AU/JP). The full posture with statutory citations and a takedown channel is in LEGAL.md.

Important

Grey-zone commercial protectors are gated behind the explicit --i-have-authorization flag and never run otherwise. Use is your responsibility per the statutory framing above.

Contributing

Contributions are welcome; see the contributing guide. For security issues, open a private advisory rather than a public issue. See SECURITY.md.

License

Elastic License 2.0. Companies and security researchers may use, copy, modify, and distribute disrobe for free; attribution is required, so keep the author, copyright, and licensing notices intact. You may not provide disrobe to third parties as a hosted or managed service, and you may not remove or obscure any licensing, copyright, or other notices. The "disrobe" name and marks are reserved and granted no rights by the license. See LICENSE and NOTICE.

About

Decompile, deobfuscate, and unpack almost anything: a universal, deterministic, single-binary reverse-engineering toolkit in Rust for Python, JVM/Android, .NET, WebAssembly, JS, Go, and native packers (UPX/PyArmor/PyInstaller/Nuitka) plus 20+ more. Built for malware analysis, CTFs, and security research.
