Debugging your first x86 program
Debugging compiled programs is an important part of learning x86 assembly language. A debugger lets you step through an assembly program one instruction at a time and watch how each instruction changes the registers and memory.
GDB is one of the most popular debuggers available for debugging Linux-based executables. GDB is also extensively used in exploit development and reverse engineering. This article focuses on understanding how to use GDB to step through the instructions of a given x86 assembly program.
See the previous article in the series, How to build a program and execute an application entirely built in x86 assembly.
The target executable
Before we use GDB, we need something to debug. We will prepare a simple binary written in x86 assembly and then run GDB against it to understand how GDB can be used to debug binaries. Following is the program we will use.
global _start
_start:
mov eax, 8      ; EAX = 8
mov eax, 0xa    ; EAX = 0xa (decimal 10), replacing the 8
mov ebx, eax    ; EBX = EAX
mov ecx, [esp]  ; ECX = value stored at the address held in ESP
We created a file named mov.nasm. It starts with a directive called global, which tells the linker where the entry point of this program is. Here we specify that the entry point is _start. The first instruction, MOV EAX, 8, moves the value 8 into the register EAX.
The next instruction moves 0xa, the hex value for decimal 10, into the EAX register, overwriting the 8. Next, the MOV EBX, EAX instruction copies the value of one register into another register.
Lastly, the MOV ECX, [ESP] instruction reads memory. The square brackets mean that the value in ESP is treated as an address, and the value stored at that address is moved into ECX.
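The three kinds of MOV used in the program can be modelled in a few lines of Python. This is only a toy sketch, not an emulator: the registers dictionary and memory dictionary are hypothetical stand-ins, and the ESP value and the dword stored there are sample values assumed to match the debugging session shown later in the article.

```python
# Toy model of the four MOV instructions in mov.nasm (not a real emulator).
# The ESP value and the memory contents are assumed sample values.
registers = {"eax": 0, "ebx": 0, "ecx": 0, "esp": 0xFFFFD220}
memory = {0xFFFFD220: 0x00000001}  # dword stored at the address held in ESP

registers["eax"] = 8                         # mov eax, 8     (immediate -> register)
registers["eax"] = 0xA                       # mov eax, 0xa   (overwrites the 8)
registers["ebx"] = registers["eax"]          # mov ebx, eax   (register -> register)
registers["ecx"] = memory[registers["esp"]]  # mov ecx, [esp] (memory -> register)

for name in ("eax", "ebx", "ecx"):
    print(f"{name} = {registers[name]:#x}")
```

Note that the last line looks up memory using the value of ESP as the key, which is what the square brackets express in the assembly source.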
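Now let's use nasm to assemble this program. Let's type the following command.
nasm -f elf32 -o mov.o mov.nasm
The format is elf32 and the output file is mov.o. Now let's link it using ld. Because the debugging session in this article runs on a 64-bit host, the -m elf_i386 option is included so that ld produces a 32-bit executable. This can be done using the following command.
ld -m elf_i386 -o mov mov.o
mov is the final binary. This is the file we will debug.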
Debugging using GDB and GEF
To examine the registers and the values that are being moved into them, let's open this program in gdb using the following command.
gdb ./mov
It looks as follows.
GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
GEF for linux ready, type `gef' to start, `gef config' to configure
78 commands loaded for GDB 9.1 using Python engine 3.8
[*] 2 commands could not be loaded, run `gef missing` to know why.
Reading symbols from ./mov...
(No debugging symbols found in ./mov)
gef➤
We see a GEF prompt instead of a plain GDB prompt because GEF has been installed for gdb in this case. GEF is an extension that automates a variety of commonly used GDB commands and displays the results in a structured, GUI-like layout inside the terminal, which makes debugging much easier.
Setting up a breakpoint
Now let us set a breakpoint at the entry point of this program. Type the following.
gef➤ break _start
Breakpoint 1 at 0x8049000
gef➤
_start is the entry point of this program, and we are using the break command to set a breakpoint there. Next, type run. The program starts and pauses at the entry point because of the breakpoint we set.
gef➤ run
Starting program: /home/dev/x86/mov
Breakpoint 1, 0x08049000 in _start ()
[ Legend: Modified register | Code | Heap | Stack | String ]
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$eax : 0x0
$ebx : 0x0
$ecx : 0x0
$edx : 0x0
$esp : 0xffffd220 → 0x00000001
$ebp : 0x0
$esi : 0x0
$edi : 0x0
$eip : 0x08049000 → <_start+0> mov eax, 0x8
$eflags: [zero carry parity adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]
$cs: 0x0023 $ss: 0x002b $ds: 0x002b $es: 0x002b $fs: 0x0000 $gs: 0x0000
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0xffffd220│+0x0000: 0x00000001 ← $esp
0xffffd224│+0x0004: 0xffffd3d3 → "/home/dev/x86/mov"
0xffffd228│+0x0008: 0x00000000
0xffffd22c│+0x000c: 0xffffd3e5 → "SHELL=/bin/bash"
0xffffd230│+0x0010: 0xffffd3f5 → "SESSION_MANAGER=local/x86-64:@/tmp/.ICE-unix/1721,[...]"
0xffffd234│+0x0014: 0xffffd447 → "QT_ACCESSIBILITY=1"
0xffffd238│+0x0018: 0xffffd45a → "COLORTERM=truecolor"
0xffffd23c│+0x001c: 0xffffd46e → "XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg"
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:32 ────
0x8048ffa add BYTE PTR [eax], al
0x8048ffc add BYTE PTR [eax], al
0x8048ffe add BYTE PTR [eax], al
→ 0x8049000 <_start+0> mov eax, 0x8
0x8049005 <_start+5> mov eax, 0xa
0x804900a <_start+10> mov ebx, eax
0x804900c <_start+12> mov ecx, DWORD PTR [esp]
0x804900f add BYTE PTR [eax], al
0x8049011 add BYTE PTR [eax], al
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "mov", stopped 0x8049000 in _start (), reason: BREAKPOINT
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x8049000 → _start()
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
gef➤
The breakpoint was hit and program execution is paused. There are a few things to observe in the preceding output.
- Code - This is the area where we can see which instruction is going to be executed next.
- Stack - This area shows the stack. It is good to know where the stack is in the output, as the stack is used when dealing with subroutines.
- Registers - This section shows all the registers and the values currently stored in them.
When debugging programs, we predominantly use the registers and code sections. We may also use the stack occasionally.
The EIP register
Now, let's take a look at the EIP register.
$eip : 0x08049000 → <_start+0> mov eax, 0x8
The value in the EIP register is 0x08049000. Let us also take a look at the address of the first instruction, which is about to be executed.
→ 0x8049000 <_start+0> mov eax, 0x8
0x8049005 <_start+5> mov eax, 0xa
0x804900a <_start+10> mov ebx, eax
0x804900c <_start+12> mov ecx, DWORD PTR [esp]
0x804900f add BYTE PTR [eax], al
0x8049011 add BYTE PTR [eax], al
The address of this instruction is the same as the value in the EIP register. In other words, EIP always holds the address of the next instruction to be executed. Before executing this instruction, let's check the current value of EAX.
$eax : 0x0
$ebx : 0x0
$ecx : 0x0
$edx : 0x0
$esp : 0xffffd220 → 0x00000001
$ebp : 0x0
$esi : 0x0
$edi : 0x0
$eip : 0x08049000 → <_start+0> mov eax, 0x8
Currently, EAX is zero. Once the first instruction in our program executes, EAX should contain 8. Let's type si and hit enter. The si command performs a single step, which means it executes exactly one instruction. Let us see what happened to the EAX register.
$eax : 0x8
$ebx : 0x0
$ecx : 0x0
$edx : 0x0
$esp : 0xffffd220 → 0x00000001
$ebp : 0x0
$esi : 0x0
$edi : 0x0
$eip : 0x08049005 → <_start+5> mov eax, 0xa
As we can observe, the value 0x8 has been moved into the register EAX, and EIP now holds 0x08049005, the address of the next instruction.
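The way EIP moved from 0x08049000 to 0x08049005 after one si can be sketched as follows. This is a toy model that only covers straight-line code: the program table and the single_step helper are hypothetical names, and the instruction lengths are inferred from the gaps between the addresses in the GEF code listing.

```python
# Toy single-stepper for straight-line code (no jumps or calls).
# Addresses come from the GEF listing; lengths are the gaps between them.
program = {
    0x8049000: ("mov eax, 0x8", 5),
    0x8049005: ("mov eax, 0xa", 5),
    0x804900A: ("mov ebx, eax", 2),
    0x804900C: ("mov ecx, DWORD PTR [esp]", 3),
}


def single_step(eip):
    """Return the address of the instruction that follows the one at eip."""
    _text, length = program[eip]
    return eip + length


eip = 0x8049000          # paused at the breakpoint on _start
trace = [eip]
while eip in program:    # step until we leave the four known instructions
    eip = single_step(eip)
    trace.append(eip)

print([hex(address) for address in trace])
```

Each step adds the length of the instruction that was just executed, so the addresses in the trace line up with the ones shown in the code section.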
Setting up breakpoints at a specific address
We could type si again to execute the next instruction, but let's look at another feature first.
We set one breakpoint earlier at the entry point of this program. Another way to set a breakpoint is to use the address of an instruction. For instance, let's say we want a breakpoint at the address 0x804900c, so that execution pauses when the program reaches this address. Following is the code section that contains this address.
0x8049005 <_start+5> mov eax, 0xa
0x804900a <_start+10> mov ebx, eax
0x804900c <_start+12> mov ecx, DWORD PTR [esp]
0x804900f add BYTE PTR [eax], al
0x8049011 add BYTE PTR [eax], al
Following is the way to set a breakpoint at a specific address.
gef➤ break *0x804900c
Breakpoint 2 at 0x804900c
gef➤
We type an asterisk followed by the address. The asterisk tells GDB to treat what follows as an address rather than a symbol name. As we can observe in the preceding excerpt, breakpoint 2 is now set. Now let us continue executing this program by typing c or continue and observe what happens.
0x804900a <_start+10> mov ebx, eax
→ 0x804900c <_start+12> mov ecx, DWORD PTR [esp]
0x804900f add BYTE PTR [eax], al
0x8049011 add BYTE PTR [eax], al
0x8049013 add BYTE PTR [eax], al
0x8049015 add BYTE PTR [eax], al
0x8049017 add BYTE PTR [eax], al
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "mov", stopped 0x804900c in _start (), reason: BREAKPOINT
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x804900c → _start()
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
gef➤
As the reason: BREAKPOINT field in the threads section shows, the program stopped executing because of a breakpoint.
Listing all breakpoints
We can see all the breakpoints using the GDB command info breakpoints.
gef➤ info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x08049000 <_start>
breakpoint already hit 1 time
2 breakpoint keep y 0x0804900c <_start+12>
breakpoint already hit 1 time
gef➤
The output lists the breakpoints that are currently set. There are two, both enabled, and the output also shows how many times each breakpoint has been hit. In our case, each breakpoint has been hit once.
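The session so far (break, run, si, break at an address, continue) can be replayed with a small Python model that keeps a hit count per breakpoint, much like the info breakpoints output. This is a sketch under stated assumptions and not how GDB is implemented: set_breakpoint, run_until_breakpoint and hit_counts are hypothetical names, and the instruction lengths are inferred from the addresses in the GEF listing.

```python
# Toy model of GDB breakpoints (a sketch, not how GDB is implemented).
# Addresses and instruction lengths are read off the GEF code listing.
lengths = {0x8049000: 5, 0x8049005: 5, 0x804900A: 2, 0x804900C: 3}
hit_counts = {}  # breakpoint address -> number of times it has been hit


def set_breakpoint(address):
    """Register a breakpoint with a hit count of zero."""
    hit_counts[address] = 0


def run_until_breakpoint(eip):
    """Advance from eip until a breakpoint address is reached.

    Returns the address where execution pauses, or None if execution
    leaves the four known instructions without hitting a breakpoint.
    Simplification: a breakpoint at the starting address fires at once,
    so this sketch never resumes from an address that has a breakpoint.
    """
    while eip in lengths:
        if eip in hit_counts:
            hit_counts[eip] += 1
            return eip
        eip += lengths[eip]
    return None


set_breakpoint(0x8049000)              # break _start
eip = run_until_breakpoint(0x8049000)  # run: pauses at the entry point
eip += lengths[eip]                    # si: execute one instruction
set_breakpoint(0x804900C)              # break *0x804900c
eip = run_until_breakpoint(eip)        # continue: pauses at 0x804900c

print(hex(eip), {hex(a): n for a, n in hit_counts.items()})
```

Both counters end at one, which mirrors the "breakpoint already hit 1 time" lines in the listing above.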
Let's see what happened to the EAX register now.
$eax : 0xa
$ebx : 0xa
$ecx : 0x0
$edx : 0x0
$esp : 0xffffd220 → 0x00000001
$ebp : 0x0
$esi : 0x0
$edi : 0x0
$eip : 0x0804900c → <_start+12> mov ecx, DWORD PTR [esp]
The value 8 has been replaced with the new value 0xa. Additionally, the instruction mov ebx, eax has executed, which leaves the value 0xa in the register EBX. Now let's look at the next instruction.
This instruction moves the value pointed to by the ESP register into ECX. Let us examine the stack to better understand this.
0xffffd220│+0x0000: 0x00000001 ← $esp
0xffffd224│+0x0004: 0xffffd3d3 → "/home/dev/x86/mov"
0xffffd228│+0x0008: 0x00000000
0xffffd22c│+0x000c: 0xffffd3e5 → "SHELL=/bin/bash"
0xffffd230│+0x0010: 0xffffd3f5 → "SESSION_MANAGER=local/x86-64:@/tmp/.ICE-unix/1721,[...]"
0xffffd234│+0x0014: 0xffffd447 → "QT_ACCESSIBILITY=1"
0xffffd238│+0x0018: 0xffffd45a → "COLORTERM=truecolor"
0xffffd23c│+0x001c: 0xffffd46e → "XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg"
What is ESP pointing to? If you look at the stack, ESP points to the value at the top of the stack, 0x00000001. So, this value is going to be moved into ECX when the current instruction executes.
Let's do a single step and see what happens. As mentioned earlier, we can single-step by typing si. If the previous command was si, we can just hit enter instead of typing si again, and GDB repeats the previous command.
Let's observe the value of ECX.
$eax : 0xa
$ebx : 0xa
$ecx : 0x1
$edx : 0x0
$esp : 0xffffd220 → 0x00000001
$ebp : 0x0
$esi : 0x0
$edi : 0x0
$eip : 0x0804900f → add BYTE PTR [eax], al
As expected, the value 0x1, which ESP was pointing to, has been moved into ECX.
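To see what the dword read looks like at the byte level, the top of the stack can be rebuilt in Python and read back with the struct module. This is a rough sketch: stack_bytes and read_dword are hypothetical names, and the three values are copied from the first three rows of the GEF stack view, so another machine will show different addresses. On Linux, the value at the top of the stack when a process starts is the argument count, which is why it is 1 here, followed by a pointer to the program path.

```python
import struct

# Bytes at the top of the stack, rebuilt from the first three GEF stack rows.
# x86 is little-endian, so each 32-bit value is packed with "<I".
ESP = 0xFFFFD220
stack_bytes = struct.pack("<III", 0x00000001, 0xFFFFD3D3, 0x00000000)


def read_dword(address):
    """Read a little-endian 32-bit value, like mov reg, DWORD PTR [address]."""
    (value,) = struct.unpack_from("<I", stack_bytes, address - ESP)
    return value


ecx = read_dword(ESP)             # mov ecx, [esp]
next_slot = read_dword(ESP + 4)   # the stack slot at +0x0004

print(hex(ecx), hex(next_slot))
```

The first read returns the same 0x1 that ended up in ECX, and the second returns the pointer that GEF resolves to the program path.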
Conclusion
As we have seen in this article, GDB is useful for debugging programs written in x86 assembly.
We used a GDB extension called GEF to simplify debugging, and we covered some of the most common GDB tasks that come in handy when debugging ELF executables: setting breakpoints, examining the registers and the stack, following the EIP register and listing the breakpoints that are set.
Next, you'll learn how to use the ObjDump tool with x86.