How to use the ObjDump tool with x86
Having access to an analysis tool when dealing with compiled executables is always useful. In Linux, ObjDump is one such tool, which can be used to extract information from object files.
This article provides an overview of various ObjDump command-line options and their use. We will take a simple Hello World program written in x86 assembly as our target program and run ObjDump against it.
See the previous article in the series, Debugging your first x86 program.
Intro to x86 Disassembly
What is ObjDump?
As mentioned at the beginning of the article, ObjDump is a utility for extracting information from object files. It comes pre-installed with most Linux distributions. The following help output lists the options available when running ObjDump.
Usage: objdump <option(s)> <file(s)>
Display information from object <file(s)>.
At least one of the following switches must be given:
-a, --archive-headers Display archive header information
-f, --file-headers Display the contents of the overall file header
-p, --private-headers Display object format specific file header contents
-P, --private=OPT,OPT... Display object format specific contents
-h, --[section-]headers Display the contents of the section headers
-x, --all-headers Display the contents of all headers
-d, --disassemble Display assembler contents of executable sections
-D, --disassemble-all Display assembler contents of all sections
--disassemble=<sym> Display assembler contents from <sym>
-S, --source Intermix source code with disassembly
--source-comment[=<txt>] Prefix lines of source code with <txt>
-s, --full-contents Display the full contents of all sections requested
-g, --debugging Display debug information in object file
-e, --debugging-tags Display debug information using ctags style
-G, --stabs Display (in raw form) any STABS info in the file
-W[lLiaprmfFsoRtUuTgAckK] or
--dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
=frames-interp,=str,=loc,=Ranges,=pubtypes,
=gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
=addr,=cu_index,=links,=follow-links]
Display DWARF info in the file
--ctf=SECTION Display CTF info from SECTION
-t, --syms Display the contents of the symbol table(s)
-T, --dynamic-syms Display the contents of the dynamic symbol table
-r, --reloc Display the relocation entries in the file
-R, --dynamic-reloc Display the dynamic relocation entries in the file
@<file> Read options from <file>
-v, --version Display this program's version number
-i, --info List object formats and architectures supported
-H, --help Display this information
How to extract assembly code
The ObjDump tool can be used to extract assembly code from an already-built binary. Let us begin by going through the following assembly program to better understand the approach.
helloworld.nasm
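section .text
global _start
_start:
    mov edx, len    ; arg 3: number of bytes to write
    mov ecx, msg    ; arg 2: address of the message
    mov ebx, 1      ; arg 1: file descriptor 1 (stdout)
    mov eax, 4      ; syscall number 4 (sys_write)
    int 0x80        ; invoke the kernel
    mov eax, 1      ; syscall number 1 (sys_exit)
    mov ebx, 0      ; exit status 0
    int 0x80        ; invoke the kernel
section .rodata
    msg db 'Hello, world!', 0xa   ; message followed by a newline
    len equ $ - msg               ; length of msg in bytes
As the preceding program shows, the assembly code is placed in the .text section. We can therefore use ObjDump to extract the .text section from the binary produced by the assembler and linker. It is also possible to extract other sections, such as .rodata. The next few sections discuss how this can be done.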
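To see what `len equ $ - msg` evaluates to, the same arithmetic can be reproduced outside the assembler. The following Python sketch is only an illustration (its variable names are not part of the original program): it builds the bytes that `db 'Hello, world!',0xa` emits and counts them, which mirrors subtracting the address of msg from the current position ($).

```python
# Bytes emitted by: msg db 'Hello, world!',0xa
msg = b"Hello, world!" + bytes([0x0A])

# len equ $ - msg  ->  (address just past msg) - (address of msg)
length = len(msg)

print(length, hex(length))
```

The result, 14 (0xe), is the value that appears as the immediate operand loaded into edx in the disassembly of this program.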
Displaying header contents using ObjDump
To display the header contents of a binary, we can use the -x flag (objdump -x helloworld). The output is shown below.
helloworld: file format elf32-i386
helloworld
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x08049000
Program Header:
LOAD off 0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
filesz 0x00000094 memsz 0x00000094 flags r--
LOAD off 0x00001000 vaddr 0x08049000 paddr 0x08049000 align 2**12
filesz 0x00000022 memsz 0x00000022 flags r-x
LOAD off 0x00002000 vaddr 0x0804a000 paddr 0x0804a000 align 2**12
filesz 0x0000000e memsz 0x0000000e flags r--
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000022  08049000  08049000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       0000000e  0804a000  0804a000  00002000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
08049000 l d .text 00000000 .text
0804a000 l d .rodata 00000000 .rodata
00000000 l df *ABS* 00000000 helloworld.nasm
0804a000 l .rodata 00000000 msg
0000000e l *ABS* 00000000 len
08049000 g .text 00000000 _start
0804b00e g .rodata 00000000 __bss_start
0804b00e g .rodata 00000000 _edata
0804b010 g .rodata 00000000 _end
As mentioned earlier, we used the Hello World program as our target. The preceding output shows the extracted header information: the metadata of the ELF binary (file format, architecture and start address), the program header, the sections available in the binary (.text and .rodata) and the symbol table.
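When ObjDump needs to be driven from a script, the invocation can be wrapped in a small helper. The sketch below is one possible approach, not part of the original article: the function names objdump_command and run_objdump are hypothetical, and it assumes the binary is called helloworld, as in the output above. The same helper accepts the other flags covered in this article, such as -d or -t.

```python
import shutil
import subprocess


def objdump_command(binary, *flags):
    """Build the argument list for an objdump invocation (hypothetical helper)."""
    return ["objdump", *flags, binary]


def run_objdump(binary, *flags):
    """Run objdump and return its stdout, or None if objdump is not installed."""
    if shutil.which("objdump") is None:
        return None
    result = subprocess.run(
        objdump_command(binary, *flags),
        capture_output=True,
        text=True,
        check=False,  # do not raise if the binary is missing or unreadable
    )
    return result.stdout


if __name__ == "__main__":
    # "helloworld" is assumed to be the binary name, as in the output above.
    print(" ".join(objdump_command("helloworld", "-x")))
    output = run_objdump("helloworld", "-x")
    if output:
        print(output)
```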
Displaying assembler contents of executable sections using ObjDump
As discussed earlier, the assembly code of a pre-built binary lives in its .text section. It can be dumped using the -d flag (objdump -d helloworld), as shown in the following excerpt.
Assembler contents from text section:
helloworld: file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
8049000: ba 0e 00 00 00 mov $0xe,%edx
8049005: b9 00 a0 04 08 mov $0x804a000,%ecx
804900a: bb 01 00 00 00 mov $0x1,%ebx
804900f: b8 04 00 00 00 mov $0x4,%eax
8049014: cd 80 int $0x80
8049016: b8 01 00 00 00 mov $0x1,%eax
804901b: bb 00 00 00 00 mov $0x0,%ebx
8049020: cd 80 int $0x80
As we can see in the preceding excerpt, the assembly code is shown in AT&T syntax, which is the default. The output syntax can be changed with the -M flag; for example, -M intel selects Intel syntax, as shown below.
Assembler contents from text section in Intel assembly syntax:
helloworld: file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
8049000: ba 0e 00 00 00 mov edx,0xe
8049005: b9 00 a0 04 08 mov ecx,0x804a000
804900a: bb 01 00 00 00 mov ebx,0x1
804900f: b8 04 00 00 00 mov eax,0x4
8049014: cd 80 int 0x80
8049016: b8 01 00 00 00 mov eax,0x1
804901b: bb 00 00 00 00 mov ebx,0x0
8049020: cd 80 int 0x80
As we can observe, the output is now in Intel syntax. Similarly, if we want to request AT&T syntax explicitly, we can pass -M att, which produces the following output.
Assembler contents from text section in AT&T assembly syntax:
helloworld: file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
8049000: ba 0e 00 00 00 mov $0xe,%edx
8049005: b9 00 a0 04 08 mov $0x804a000,%ecx
804900a: bb 01 00 00 00 mov $0x1,%ebx
804900f: b8 04 00 00 00 mov $0x4,%eax
8049014: cd 80 int $0x80
8049016: b8 01 00 00 00 mov $0x1,%eax
804901b: bb 00 00 00 00 mov $0x0,%ebx
8049020: cd 80 int $0x80
If we want to display the assembler contents of all sections, not only the executable ones, we can use the -D flag.
Assembler contents from all sections in Intel assembly syntax:
helloworld: file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
8049000: ba 0e 00 00 00 mov edx,0xe
8049005: b9 00 a0 04 08 mov ecx,0x804a000
804900a: bb 01 00 00 00 mov ebx,0x1
804900f: b8 04 00 00 00 mov eax,0x4
8049014: cd 80 int 0x80
8049016: b8 01 00 00 00 mov eax,0x1
804901b: bb 00 00 00 00 mov ebx,0x0
8049020: cd 80 int 0x80
Disassembly of section .rodata:
0804a000 <msg>:
804a000: 48 dec eax
804a001: 65 6c gs ins BYTE PTR es:[edi],dx
804a003: 6c ins BYTE PTR es:[edi],dx
804a004: 6f outs dx,DWORD PTR ds:[esi]
804a005: 2c 20 sub al,0x20
804a007: 77 6f ja 804a078 <msg+0x78>
804a009: 72 6c jb 804a077 <msg+0x77>
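804a00b: 64 21 0a and DWORD PTR fs:[edx],ecx
Note that the instructions listed under .rodata are not real code: -D decodes the bytes of the message string as if they were instructions. Similarly, we can extract the assembler contents of all sections in AT&T syntax.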
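The .rodata listing above is easier to interpret once its raw bytes are read as text rather than as opcodes. The following Python sketch, added here purely as an illustration, copies the byte column of that listing and decodes it as ASCII.

```python
# Raw bytes from the .rodata listing above (addresses 0x804a000 to 0x804a00d).
rodata_hex = "48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21 0a"

data = bytes.fromhex(rodata_hex)
text = data.decode("ascii")

print(repr(text))   # the message string, including the trailing newline
print(len(data))    # number of bytes stored in the section
```

The decoded text is the Hello World message, and its length of 14 bytes matches both the 0xe loaded into edx and the size of .rodata reported in the section headers.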
Displaying debug information using ObjDump
We can use the -g flag of ObjDump to display the debug information in a binary. The following excerpt shows the output for a compiled C program named jump.
jump: file format elf64-x86-64
Contents of the .eh_frame section (loaded from jump):
00000000 0000000000000014 00000000 CIE
Version: 1
Augmentation: "zR"
Code alignment factor: 1
Data alignment factor: -8
Return address column: 16
Augmentation data: 1b
DW_CFA_def_cfa: r7 (rsp) ofs 8
DW_CFA_offset: r16 (rip) at cfa-8
DW_CFA_nop
DW_CFA_nop
[REDACTED FOR BREVITY]
Displaying contents of the symbol table using ObjDump
According to the Oracle documentation, "The symbol table contains information to locate and relocate symbolic definitions and references. The assembler creates the symbol table section for the object file. It makes an entry in the symbol table for each symbol that is defined or referenced in the input file and is needed during linking. The symbol table is then used by the link editor during relocation."
ObjDump's -t flag can be used to display the symbol table of an executable (objdump -t helloworld).
helloworld: file format elf32-i386
SYMBOL TABLE:
08049000 l d .text 00000000 .text
0804a000 l d .rodata 00000000 .rodata
00000000 l df *ABS* 00000000 helloworld.nasm
0804a000 l .rodata 00000000 msg
0000000e l *ABS* 00000000 len
08049000 g .text 00000000 _start
0804b00e g .rodata 00000000 __bss_start
0804b00e g .rodata 00000000 _edata
0804b010 g .rodata 00000000 _end
As we can observe in the preceding excerpt, each symbol defined in the program (msg, len and _start) has an entry in the symbol table, alongside the section symbols and the symbols added by the linker.
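The symbol table output is regular enough to be processed by a script. The sketch below is a simplified illustration, not a full parser: the function parse_symbol is hypothetical, and it assumes whitespace-separated fields and symbol names without spaces, which holds for the lines shown above.

```python
# A few lines copied from the symbol table output above.
SYMBOL_LINES = """\
08049000 l d .text 00000000 .text
0804a000 l d .rodata 00000000 .rodata
00000000 l df *ABS* 00000000 helloworld.nasm
0804a000 l .rodata 00000000 msg
0000000e l *ABS* 00000000 len
08049000 g .text 00000000 _start
"""


def parse_symbol(line):
    """Split one symbol-table line into its fields (hypothetical helper).

    Assumes whitespace-separated fields and a symbol name without spaces.
    """
    parts = line.split()
    return {
        "value": int(parts[0], 16),       # address or absolute value
        "flags": "".join(parts[1:-3]),    # e.g. "l", "g", "ld", "ldf"
        "section": parts[-3],             # owning section, or *ABS*
        "size": int(parts[-2], 16),
        "name": parts[-1],
    }


symbols = [parse_symbol(line) for line in SYMBOL_LINES.splitlines() if line.strip()]
by_name = {s["name"]: s for s in symbols}

print(hex(by_name["_start"]["value"]))
print(hex(by_name["msg"]["value"]))
print(by_name["len"]["value"])
```

Looking symbols up by name this way shows that _start sits at the start address reported in the file header, msg points at the beginning of .rodata, and len is an absolute value rather than an address.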
See the next article in the series, How to diagnose and locate segmentation faults in x86 assembly.
Sources
- Symbol tables, Oracle
- Assembly Language for x86 Processors, Kip Irvine
- Modern X86 Assembly Language Programming, Daniel Kusswurm
- Linux Assembly Language Programming, Bob Neveln