SKILL.md
$27
Symptom in IDA/Ghidra
Likely Obfuscation
Start With
Flat CFG, single giant switch
Control flow flattening
Symbolic execution to recover CFG
Only mov instructions
movfuscator
demovfuscation / trace-based lifting
pushad/pushfd → VM entry
VM protector
Handler table extraction
XOR loop before code execution
SMC / string encryption
Dynamic analysis, breakpoint after decode
Impossible conditions (opaque predicates)
Junk code insertion
Pattern-based removal
All strings unreadable
String encryption
Hook decryption routine, or emulate
No imports in IAT
Import hiding
Trace GetProcAddress / hash resolution
1. JUNK CODE & OPAQUE PREDICATES
1.1 Junk Code Insertion
Dead code that never affects program output, added to increase analysis time.
Identification:
- Instructions that write to registers/memory never read afterward
- Function calls whose return values are discarded and have no side effects
- Loops with invariant bounds that compute unused results
Removal strategy:
- Compute def-use chains (IDA/Ghidra data flow analysis)
- Mark instructions with no downstream use as dead
- Verify removal doesn't change program behavior (trace comparison)
1.2 Opaque Predicates
Conditional branches where the condition is always true or always false, but this is non-obvious.
Type
Example
Always Evaluates To
Arithmetic
x² ≥ 0
True
Number theory
x*(x+1) % 2 == 0
True (product of consecutive ints)
Pointer-based
ptr == ptr after aliasing
True
Hash-based
CRC32(constant) == known_value
True
Deobfuscation:
- Abstract interpretation: prove the condition is constant
- Symbolic execution: Z3 proves
∀x: predicate(x) = True
- Pattern matching: recognize known opaque predicate families
- Dynamic: trace and observe the branch is never taken / always taken
import z3
x = z3.BitVec('x', 32)
s = z3.Solver()
s.add(x * (x + 1) % 2 != 0)
print(s.check()) # unsat → always true
2. SELF-MODIFYING CODE (SMC)
Runtime code patching: encrypted code is decrypted just before execution.
2.1 XOR Decryption Loop (Most Common)
lea esi, [encrypted_code]
mov ecx, code_length
mov al, xor_key
decrypt_loop:
xor byte [esi], al
inc esi
loop decrypt_loop
jmp encrypted_code ; now decrypted
2.2 Analysis Strategy
1. Identify the decryption routine (look for XOR/ADD/SUB in loops writing to .text)
2. Set breakpoint AFTER the loop completes
3. At breakpoint: dump the decrypted memory region
4. Re-analyze the dumped code in IDA/Ghidra
5. For multi-layer: repeat for each decryption stage
2.3 Automated Unpacking via Emulation
from unicorn import *
from unicorn.x86_const import *
mu = Uc(UC_ARCH_X86, UC_MODE_32)
mu.mem_map(0x400000, 0x10000)
mu.mem_write(0x400000, binary_code)
mu.emu_start(decrypt_entry, decrypt_end)
decrypted = mu.mem_read(code_start, code_length)
3. CONTROL FLOW FLATTENING (CFF)
3.1 Structure
Original sequential blocks are transformed into a dispatcher loop:
Original: A → B → C → D
Flattened: ┌──────────────────┐
│ dispatcher │
│ switch(state) │◄─────┐
├──────────────────┤ │
│ case 1: block A │──────┤
│ case 2: block B │──────┤
│ case 3: block C │──────┤
│ case 4: block D │──────┘
└──────────────────┘
Each block sets state = next_state before jumping back to the dispatcher.
3.2 Recovery Techniques
Technique
Tool
Effectiveness
Symbolic execution
angr, Triton, miasm
High — traces all state transitions
Trace-based recovery
Pin/DynamoRIO trace → reconstruct CFG
Medium — covers executed paths only
Pattern matching
Custom IDA/Ghidra script
Medium — works for known flatteners
D-810 (IDA plugin)
IDA Pro
High — specifically designed for CFF
3.3 Symbolic Deflattening (angr approach)
import angr, claripy
proj = angr.Project('./obfuscated')
cfg = proj.analyses.CFGFast()
# Find dispatcher block (highest in-degree basic block)
dispatcher = max(cfg.graph.nodes(), key=lambda n: cfg.graph.in_degree(n))
# For each case block, symbolically determine successor
for block in case_blocks:
state = proj.factory.blank_state(addr=block.addr)
# ... solve state variable to find real successor
4. MOVFUSCATOR
4.1 Concept
All computation reduced to mov instructions only (Turing-complete via memory-mapped computation tables). Created by Christopher Domas.
4.2 Identification
- Function contains only
movinstructions (no add, sub, xor, jmp, call)
- Large lookup tables in data section
- Memory-mapped flag registers
4.3 Demovfuscation
Approach
Description
demovfuscator (tool)
Static analysis, recovers original operations from mov patterns
Trace + taint analysis
Run with Pin/DynamoRIO, taint inputs, observe computation
Symbolic execution
Treat entire function as constraint system
5. VM PROTECTION (VMProtect / Themida / Code Virtualizer)
5.1 VM Architecture
Protected code → bytecode compiler → custom bytecode
Runtime: VM entry (pushad/pushfd) → fetch → decode → execute → VM exit (popad/popfd)
5.2 VM Entry Point Identification
; Typical VMProtect entry
pushad ; save all registers
pushfd ; save flags
mov ebp, esp ; VM stack frame
sub esp, VM_LOCALS_SIZE ; allocate VM context
mov esi, bytecode_addr ; bytecode instruction pointer
jmp vm_dispatcher ; enter VM loop
5.3 Handler Table Extraction
1. Find dispatcher (large switch or indirect jump via table)
2. Each case/entry = one VM handler (implements one VM opcode)
3. Map handler addresses to operations by analyzing each handler:
- Handler reads operand from bytecode stream (esi)
- Performs operation on VM registers/stack
- Advances bytecode pointer
- Returns to dispatcher
5.4 Devirtualization Approaches
Method
Description
Tool
Manual handler mapping
Reverse each handler, build ISA spec
IDA + scripting
Trace recording
Record all handler executions, reconstruct program
REVEN, Pin
Symbolic lifting
Symbolically execute handlers, lift to IR
Triton, miasm
Pattern matching
Match handler patterns to known VM families
Custom scripts
5.5 VMProtect Specifics
- Uses opaque predicates in dispatcher
- Handler mutation: same opcode, different handler code per build
- Multiple VM layers (VM inside VM)
- Integrates anti-debug and integrity checks
6. STRING ENCRYPTION
6.1 Common Patterns
Pattern
Example
Recovery
XOR loop
for (i=0; i<len; i++) s[i] ^= key;
Hook or emulate XOR function
Stack strings
mov [esp+0], 'H'; mov [esp+1], 'e'; ...
IDA FLIRT / Ghidra script to reassemble
RC4 encrypted
Encrypted blob + RC4 key in binary
Extract key, decrypt offline
AES encrypted
Encrypted blob + AES key derived at runtime
Hook after decryption
Custom encoding
Base64 + XOR + reverse
Trace the decode function, replicate
6.2 Automated String Decryption
# Ghidra script: find XOR decryption calls, emulate them
from ghidra.program.model.symbol import SourceType
decrypt_func = getFunction("decrypt_string")
refs = getReferencesTo(decrypt_func.getEntryPoint())
for ref in refs:
call_addr = ref.getFromAddress()
# extract arguments (encrypted buffer ptr, key, length)
# emulate decryption, add comment with plaintext
7. IMPORT HIDING
7.1 GetProcAddress + Hash Lookup
FARPROC resolve(DWORD hash) {
// Walk PEB → LDR → InMemoryOrderModuleList
// For each DLL, walk export table
// Hash each export name, compare with target hash
// Return matching function pointer
}
7.2 Recovery
- Identify the hash algorithm (common: CRC32, djb2, ROR13+ADD)
- Compute hashes for all known API names
- Build hash → API name lookup table
- Annotate resolved calls in IDA/Ghidra
7.3 Common Hash Algorithms
Name
Algorithm
Used By
ROR13
hash = (hash >> 13 | hash << 19) + char
Metasploit shellcode
djb2
hash = hash * 33 + char
Various malware
CRC32
Standard CRC32 of function name
Sophisticated packers
FNV-1a
hash = (hash ^ char) * 0x01000193
Modern malware
8. ANTI-DISASSEMBLY TRICKS
8.1 Techniques
Trick
Mechanism
Fix
Overlapping instructions
jmp $+2; db 0xE8 (fake call prefix)
Manual re-analysis from correct offset
Misaligned jumps
Jump into middle of multi-byte instruction
Force IDA to re-analyze at target
Conditional jump pair
jz $+5; jnz $+3 (always jumps, confuses linear disasm)
Convert to unconditional jmp
Return address manipulation
push addr; ret instead of jmp addr
Recognize push+ret as jump
Exception-based flow
Trigger exception, real code in handler
Analyze exception handler chain
Call + add [esp]
call $+5; add [esp], N; ret (computed jump)
Calculate actual target
8.2 IDA Fixes
Right-click → Undefine (U)
Right-click → Code (C) at correct offset
Edit → Patch → Assemble (for permanent fix)
9. DECISION TREE
Obfuscated binary — how to approach?
│
├─ Can you run it?
│ ├─ Yes → Dynamic analysis first
│ │ ├─ Set BP on interesting APIs (file, network, crypto)
│ │ ├─ Trace execution to understand real behavior
│ │ └─ Dump decrypted code/strings at runtime
│ │
│ └─ No (embedded/firmware/exotic arch) → Static only
│ └─ Identify obfuscation type from patterns below
│
├─ What does the code look like?
│ │
│ ├─ Giant flat switch/dispatcher loop?
│ │ ├─ State variable drives control flow → CFF
│ │ │ └─ Use D-810 or symbolic deflattening
│ │ └─ Bytecode fetch-decode-execute → VM protection
│ │ └─ Extract handlers, build disassembler
│ │
│ ├─ Only mov instructions?
│ │ └─ movfuscator → demovfuscator tool
│ │
│ ├─ XOR/ADD loop writing to .text section?
│ │ └─ SMC → breakpoint after decode, dump
│ │
│ ├─ Impossible conditions in branches?
│ │ └─ Opaque predicates → Z3 proving or pattern removal
│ │
│ ├─ Disassembly looks wrong / functions overlap?
│ │ └─ Anti-disassembly → manual re-analysis at correct offsets
│ │
│ ├─ No readable strings?
│ │ └─ String encryption → hook decrypt function or emulate
│ │
│ ├─ No imports in IAT?
│ │ └─ Import hiding → identify hash, build lookup table
│ │
│ └─ pushad/pushfd → complex code → popad/popfd?
│ └─ VM protector entry/exit → full VM analysis
│
└─ What tool to use?
├─ Known protector (VMProtect/Themida) → specific deprotection guide
├─ Custom obfuscation → combine: IDA scripting + Triton + manual
├─ CTF challenge → angr symbolic execution often fastest
└─ Malware analysis → dynamic (debugger + API monitor) first
10. TOOLBOX
Tool
Purpose
Best For
IDA Pro + Hex-Rays
Disassembly, decompilation, scripting
All-around analysis
Ghidra
Free alternative with scripting (Java/Python)
Budget-friendly RE
D-810 (IDA plugin)
Automated CFF deflattening
OLLVM-style obfuscation
miasm
IR-based analysis framework
Symbolic deobfuscation
Triton
Dynamic symbolic execution
Opaque predicate solving, CFF
REVEN
Full-system trace recording and replay
VM protector analysis
demovfuscator
movfuscator reversal
mov-only binaries
x64dbg + plugins
Dynamic analysis with scripting
Windows RE
Unicorn Engine
CPU emulation
SMC unpacking, shellcode
Capstone
Disassembly library
Custom tooling
IDA FLIRT
Function signature matching
Identify library code in stripped binaries
Binary Ninja
Alternative disassembler with MLIL/HLIL
Automated analysis