
Debugging TLS callbacks

January 15, 2014 by SecRat

TLS (thread local storage) callbacks are subroutines that are executed before the entry point. An entry in the PE header's data directory describes where the TLS callbacks are located. Malware uses TLS callbacks to evade debuggers: when a sample that relies on TLS callbacks is loaded into a debugger, it can finish its work before the debugger stops at the entry point.

Let's start with a simple example of a TLS callback in C:


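[c]
/*****************************************
 TLS Example Program
 Compile With MSVC (32-bit)
********************************************/

#include <windows.h>

/* MessageBoxA lives in user32.dll. */
#pragma comment(lib, "user32.lib")

/* Force the linker to keep the TLS directory (x86 symbol name; x64 uses _tls_used). */
#pragma comment(linker, "/INCLUDE:__tls_used")

void NTAPI TlsCallBac(PVOID h, DWORD dwReason, PVOID pv);

/* Place a pointer to the callback in the CRT's TLS callback array. */
#pragma data_seg(".CRT$XLB")
PIMAGE_TLS_CALLBACK p_thread_callback = TlsCallBac;
#pragma data_seg()

/* Invoked by the loader before the entry point runs. */
void NTAPI TlsCallBac(PVOID h, DWORD dwReason, PVOID pv)
{
    MessageBoxA(NULL, "In TLS", "In TLS", MB_OK);
    return;
}

int main(int argc, char **argv)
{
    MessageBoxA(NULL, "In Main", "In Main", MB_OK);
    return 0;
}
[/c]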

After running this program, the "In TLS" message box pops up before "In Main". This proves that TLS callbacks are executed before the entry point.

Following is the dumpbin output of the exe compiled using the above code:

[python]
FILE HEADER VALUES
             14C machine (x86)
               4 number of sections
        52C01E9D time date stamp Sun Dec 29 18:37:41 2013
               0 file pointer to symbol table
               0 number of symbols
              E0 size of optional header
             103 characteristics
                   Relocations stripped
                   Executable
                   32 bit word machine

OPTIONAL HEADER VALUES
             10B magic # (PE32)
            8.00 linker version
            7000 size of code
            5000 size of initialized data
               0 size of uninitialized data
            1256 entry point (00401256)
            1000 base of code
            8000 base of data
          400000 image base (00400000 to 0040CFFF)
            1000 section alignment
            1000 file alignment
            4.00 operating system version
            0.00 image version
            4.00 subsystem version
               0 Win32 version
            D000 size of image
            1000 size of headers
               0 checksum
               3 subsystem (Windows CUI)
               0 DLL characteristics
          100000 size of stack reserve
            1000 size of stack commit
          100000 size of heap reserve
            1000 size of heap commit
               0 loader flags
              10 number of directories
               0 [       0] RVA [size] of Export Directory
            9524 [      3C] RVA [size] of Import Directory
               0 [       0] RVA [size] of Resource Directory
               0 [       0] RVA [size] of Exception Directory
               0 [       0] RVA [size] of Certificates Directory
               0 [       0] RVA [size] of Base Relocation Directory
               0 [       0] RVA [size] of Debug Directory
               0 [       0] RVA [size] of Architecture Directory
               0 [       0] RVA [size] of Global Pointer Directory
            9260 [      18] RVA [size] of Thread Storage Directory
            9218 [      40] RVA [size] of Load Configuration Directory
               0 [       0] RVA [size] of Bound Import Directory
            8000 [      F8] RVA [size] of Import Address Table Directory
               0 [       0] RVA [size] of Delay Import Directory
               0 [       0] RVA [size] of COM Descriptor Directory
               0 [       0] RVA [size] of Reserved Directory
[/python]

The Thread Storage Directory entry is populated: RVA 9260, size 18 (hex).
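To make the lookup concrete, here is a minimal sketch of how a tool could read that entry from a PE32 file held in memory. The helper name `find_tls_directory` is hypothetical and the code only handles 32-bit (PE32) images; the offsets follow the standard PE32 optional header layout, with the TLS entry at data directory index 9.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLS_DIRECTORY_INDEX 9u /* same value as IMAGE_DIRECTORY_ENTRY_TLS */

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: fetch the TLS data directory entry of a PE32 file.
 * Returns 0 on success, -1 if the buffer is not a usable PE32 image. */
int find_tls_directory(const uint8_t *img, size_t len,
                       uint32_t *rva, uint32_t *size)
{
    assert(img != NULL && rva != NULL && size != NULL);

    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return -1;

    uint32_t pe = rd32(img + 0x3C);          /* e_lfanew */
    if (pe > len || len - pe < 24 + 96)      /* signature + file header + fixed fields */
        return -1;
    if (memcmp(img + pe, "PE\0\0", 4) != 0)
        return -1;

    const uint8_t *opt = img + pe + 24;      /* start of the optional header */
    if (opt[0] != 0x0B || opt[1] != 0x01)    /* magic 0x10B = PE32 */
        return -1;

    uint32_t ndirs = rd32(opt + 92);         /* NumberOfRvaAndSizes */
    if (ndirs <= TLS_DIRECTORY_INDEX)
        return -1;

    size_t entry = (size_t)pe + 24 + 96 + 8u * TLS_DIRECTORY_INDEX;
    if (entry + 8 > len)
        return -1;

    *rva  = rd32(img + entry);
    *size = rd32(img + entry + 4);
    return 0;
}
```

For the executable above, such a helper would be expected to report the same RVA and size that dumpbin prints for the Thread Storage Directory.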

The TLS directory is defined in MSDN (http://msdn.microsoft.com/en-us/magazine/cc301808.aspx) as:

[python]
typedef struct _IMAGE_TLS_DIRECTORY {
    UINT32 StartAddressOfRawData;
    UINT32 EndAddressOfRawData;
    PUINT32 AddressOfIndex;
    PIMAGE_TLS_CALLBACK *AddressOfCallBacks;
    UINT32 SizeOfZeroFill;
    UINT32 Characteristics;
} IMAGE_TLS_DIRECTORY, *PIMAGE_TLS_DIRECTORY;
[/python]
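As a rough illustration of how this structure can be decoded outside of Windows, the sketch below reads the six 4-byte fields (0x18 bytes in total, matching the directory size reported earlier) from raw bytes. The type and function names are hypothetical. It assumes a 32-bit image, where AddressOfCallBacks is a virtual address pointing to a zero-terminated array of callback addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable mirror of the 32-bit TLS directory: six 4-byte fields. */
typedef struct {
    uint32_t start_address_of_raw_data;
    uint32_t end_address_of_raw_data;
    uint32_t address_of_index;
    uint32_t address_of_callbacks; /* VA of a zero-terminated array of callback VAs */
    uint32_t size_of_zero_fill;
    uint32_t characteristics;
} tls_directory32;

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 24 raw bytes found at the TLS directory RVA. */
void parse_tls_directory32(const uint8_t *raw, tls_directory32 *out)
{
    assert(raw != NULL && out != NULL);
    out->start_address_of_raw_data = rd32(raw + 0);
    out->end_address_of_raw_data   = rd32(raw + 4);
    out->address_of_index          = rd32(raw + 8);
    out->address_of_callbacks      = rd32(raw + 12);
    out->size_of_zero_fill         = rd32(raw + 16);
    out->characteristics           = rd32(raw + 20);
}

/* Count the entries of a callback array: 32-bit VAs ended by a zero entry.
 * max_entries bounds the scan so a missing terminator cannot overrun. */
size_t count_tls_callbacks(const uint8_t *array, size_t max_entries)
{
    assert(array != NULL);
    size_t n = 0;
    while (n < max_entries && rd32(array + 4 * n) != 0)
        n++;
    return n;
}
```

To use it on a file on disk, the virtual address in `address_of_callbacks` first has to be turned into a file offset (subtract the image base, then map the RVA through the section table), which is left out here.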

Now let's look at a sample that employs TLS callbacks.

PEiD reports that it has been packed with the Nullsoft packer.

Note: The first-layer packer is irrelevant to this analysis. It essentially creates a new process and injects the unpacked image into it.

The unpacked image is a valid MZ image.

If you look at the MZ image, you will notice a weird thing about the address of the entry point:

[python]
00000108 00000000 DD 00000000 ; AddressOfEntryPoint = 0
0000010C 00100000 DD 00001000 ; BaseOfCode = 1000
00000110 00800000 DD 00008000 ; BaseOfData = 8000
00000114 00004000 DD 00400000 ; ImageBase = 400000
[/python]

As you can see, the address of the entry point is 0, yet the TLS table entry is populated:

[python]
000001A0 60920000 DD 00009260 ; TLS Table address = 9260
000001A4 18000000 DD 00000018 ; TLS Table size = 18 (24.)
[/python]
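This combination (no entry point, but a TLS table) is easy to check for automatically. The following is only a sketch of such a triage check: the enum and function names are made up for illustration, and the function expects a pointer to the start of a PE32 optional header (at least 176 bytes), using the same field offsets as the dumps above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    ENTRY_NORMAL,      /* no TLS directory */
    ENTRY_TLS_PRESENT, /* TLS directory and a regular entry point */
    ENTRY_TLS_ONLY     /* TLS directory but AddressOfEntryPoint == 0 */
} entry_kind;

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical triage helper. opt points at a PE32 optional header. */
entry_kind classify_entry(const uint8_t *opt)
{
    assert(opt != NULL);

    uint32_t entry_rva = rd32(opt + 16);          /* AddressOfEntryPoint */
    uint32_t tls_rva   = rd32(opt + 96 + 8 * 9);  /* TLS directory RVA */
    uint32_t tls_size  = rd32(opt + 96 + 8 * 9 + 4);

    if (tls_rva == 0 || tls_size == 0)
        return ENTRY_NORMAL;
    return entry_rva == 0 ? ENTRY_TLS_ONLY : ENTRY_TLS_PRESENT;
}
```

A result of `ENTRY_TLS_ONLY` is a hint that all of the interesting code runs from the callbacks, as in this sample. It is not proof of malicious intent.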

Here is the dump of the TLS table:

[python]
E4 96 00 00 F2 96 00 00 FE 96 00 00 0E 97 00 00
24 97 00 00 40 97 00 00 5A 97 00 00 72 97 00 00
8C 97 00 00 A2 97 00 00 B2 97 00 00 CC 97 00 00
DE 97 00 00 EC 97 00 00 FE 97 00 00 16 98 00 00
24 98 00 00 30 98 00 00 3E 98 00 00 48 98 00 00
[/python]

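This gives us the location of the TLS entry point. There are two ways to catch TLS callbacks:

1: Change the OllyDbg settings so that it breaks at the system breakpoint.

2: Set a hardware breakpoint at 0x7C9011A4, the CALL instruction that invokes the callback. This address depends on the ntdll build of the analysis machine.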

We will use the second method, which is preferable. After loading the TLS application, the debugger stops here:

[python]
7C901194  8BEC       MOV EBP,ESP
7C901196  56         PUSH ESI
7C901197  57         PUSH EDI
7C901198  53         PUSH EBX
7C901199  8BF4       MOV ESI,ESP
7C90119B  FF75 14    PUSH DWORD PTR SS:[EBP+14]
7C90119E  FF75 10    PUSH DWORD PTR SS:[EBP+10]
7C9011A1  FF75 0C    PUSH DWORD PTR SS:[EBP+C]
7C9011A4  FF55 08    CALL DWORD PTR SS:[EBP+8] ; TLS Callback
7C9011A7  8BE6       MOV ESP,ESI
7C9011A9  5B         POP EBX
7C9011AA  5F         POP EDI
7C9011AB  5E         POP ESI
7C9011AC  5D         POP EBP
7C9011AD  C2 1000    RETN 10
[/python]
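Read as C, the stub above is a thin trampoline: it receives four stack arguments (hence RETN 10), pushes the last three again, and calls the first one. The sketch below is one way to picture it. The function and type names are hypothetical, the real routine uses the stdcall convention, and the ESI/ESP bookkeeping is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same parameter order as the TlsCallBac prototype in the C example. */
typedef void (*tls_callback_fn)(void *handle, uint32_t reason, void *reserved);

/* Hypothetical C rendering of the stub:
 *   [EBP+8]  -> routine   (the CALL target)
 *   [EBP+C]  -> handle    (pushed last, so first argument)
 *   [EBP+10] -> reason
 *   [EBP+14] -> reserved  (pushed first, so last argument) */
void call_init_routine(tls_callback_fn routine, void *handle,
                       uint32_t reason, void *reserved)
{
    assert(routine != NULL);
    routine(handle, reason, reserved);
}

/* Sample callback that only records the reason code it was given. */
static uint32_t last_reason;

static void record_reason(void *handle, uint32_t reason, void *reserved)
{
    (void)handle;
    (void)reserved;
    last_reason = reason;
}
```

This is why a breakpoint on the CALL instruction is enough: at that point the value at [EBP+8] is the address of the callback about to run.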

Stepping into the call leads us to the code below. To fix the PE header, we need to set the entry point of the application to the exact location of the TLS callback and zero out the TLS table entry:

[python]
00401350 |. 56             PUSH ESI
00401351 |. 56             PUSH ESI
00401352 |. 56             PUSH ESI
00401353 |. 56             PUSH ESI
00401354 |. 56             PUSH ESI
00401355 |. C700 16000000  MOV DWORD PTR DS:[EAX],16
0040135B |. E8 57170000    CALL me.00402AB7
00401360 |. 83C4 14        ADD ESP,14
00401363 |. 6A 16          PUSH 16
00401365 |. 58             POP EAX
00401366 |. 5E             POP ESI
00401367 |. C3             RETN
00401368 |> 3935 D4AC4000  CMP DWORD PTR DS:[40ACD4],ESI
0040136E |.^74 DB          JE SHORT me.0040134B
00401370 |. 8B0D E0AC4000  MOV ECX,DWORD PTR DS:[40ACE0]
00401376 |. 8908           MOV DWORD PTR DS:[EAX],ECX
00401378 |. 33C0           XOR EAX,EAX
0040137A |. 5E             POP ESI
0040137B  . C3             RETN
[/python]

Change the entry point to the callback at 0x00401350. AddressOfEntryPoint is stored as an RVA, so with ImageBase = 400000 the value to write is 1350:

[python]
000000FC 00700000 DD 00007000 ; SizeOfCode = 7000 (28672.)
00000100 00500000 DD 00005000 ; SizeOfInitializedData = 5000 (20480.)
00000104 00000000 DD 00000000 ; SizeOfUninitializedData = 0
00000108 50130000 DD 00001350 ; AddressOfEntryPoint = 1350
0000010C 00100000 DD 00001000 ; BaseOfCode = 1000
00000110 00800000 DD 00008000 ; BaseOfData = 8000
00000114 00004000 DD 00400000 ; ImageBase = 400000
[/python]

After this step, the TLS callbacks are no longer called, and you can start debugging the application from the entry point.
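The same header fix can be scripted instead of done by hand in a hex editor. The sketch below is one possible way to do it for a PE32 image held in a buffer; `redirect_entry_to_callback` is a hypothetical name. Note that AddressOfEntryPoint holds an RVA, so the helper subtracts ImageBase from the callback's virtual address before writing it, and then clears the TLS data directory entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value in little-endian byte order. */
static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: point the entry point at a TLS callback and
 * clear the TLS directory entry. Returns 0 on success, -1 on failure. */
int redirect_entry_to_callback(uint8_t *img, size_t len, uint32_t callback_va)
{
    assert(img != NULL);

    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return -1;

    uint32_t pe = rd32(img + 0x3C);                  /* e_lfanew */
    if (pe > len || len - pe < 24 + 96 + 8 * 10)     /* up to and including the TLS entry */
        return -1;
    if (memcmp(img + pe, "PE\0\0", 4) != 0)
        return -1;

    uint8_t *opt = img + pe + 24;                    /* optional header */
    if (opt[0] != 0x0B || opt[1] != 0x01)            /* PE32 only */
        return -1;

    uint32_t image_base = rd32(opt + 28);            /* ImageBase */
    if (callback_va < image_base)
        return -1;

    wr32(opt + 16, callback_va - image_base);        /* AddressOfEntryPoint (RVA) */
    memset(opt + 96 + 8 * 9, 0, 8);                  /* TLS directory RVA and size */
    return 0;
}
```

Called with the callback address 0x00401350 on an image whose base is 0x00400000, this writes 0x1350 into AddressOfEntryPoint. It is best applied to a copy of the sample, and only when the callback really contains the code you want to start debugging from.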

SecRat

SecRat works at a start-up. He's interested in Windows Driver Programming.