Reverse engineering obfuscated assemblies [updated 2019]
In previous articles on .NET reverse engineering, we covered almost every aspect of reversing .NET assemblies. We explained how this kind of binary is compiled and executed, how we can decompile it and apply patches, as well as the concept of round-trip engineering and how to bypass strong name signatures.
In this paper I'll introduce you to code obfuscation and show how we can deal with obfuscated assemblies when reverse engineering.
As previously said (for more information please check the Sources section), every high-level .NET language is translated to the same low-level language, the Common Intermediate Language (CIL). With the help of some reverse engineering tools, we can easily translate this low-level language back to any high-level one. This should worry any software vendor, because at this point we are not talking about protecting software from being tampered with or reverse engineered; we are talking about protecting the whole source code of our work.
What is obfuscation?
Obfuscation can be approached in many different ways, so we will take a general view here. If you are not willing to share your C# or VB.NET source code, obfuscating it is no longer a luxury. Obfuscating code means making it unreadable, or at least very hard to read.
An assembly's source code can be obfuscated in many different ways. The most common, and perhaps the earliest, technique is renaming each and every identifier within a project. This task, referred to here as refactoring, is not easy at all: every module that references a particular function, method or variable, even a very distant one, must be modified to use the new identifier. Thus, for each "refactored" entity, the whole source code must be analyzed and modified accordingly.
Example of clear VB.NET code
[vbnet]
Public Sub checkLicence()
    ' Without a licence file, switch the form to unregistered mode.
    If Not File.Exists((Application.StartupPath & "lic.dat")) Then
        Interaction.MsgBox("license file missing. Cannot save file.", MsgBoxStyle.Critical, "License not found")
        Me.isRegistered = False
        Me.LblStat.ForeColor = Color.Red
        Me.LblStat.Text = "Unregistered Crack Me"
        Me.btnEnableMe.Enabled = False
        Me.btnSaveAs.Enabled = False
    Else
        ' Licence file found: mark the application as registered.
        Me.LblStat.ForeColor = Color.Green
        Me.LblStat.Text = "File saved!"
        Me.isRegistered = True
    End If
End Sub
[/vbnet]
Refactored code
[vbnet]
Public Sub yrwhN7kpasz22jd0i89nGf()
    If Not File.Exists((Application.StartupPath & "lic.dat")) Then
        Interaction.MsgBox("license file missing. Cannot save file.", MsgBoxStyle.Critical, "License not found")
        Me.h7Nk9bsjHamO9g59vbYTkA = False
        Me.NalJAvU5hkaP07v49gAk.ForeColor = Color.Red
        Me.NalJAvU5hkaP07v49gAk.Text = "Unregistered Crack Me"
        Me.8yXnAaga8h9gaXv5z26g.Enabled = False
        Me.nBjLAz83QgDh7n9xTha.Enabled = False
    Else
        Me.NalJAvU5hkaP07v49gAk.ForeColor = Color.Green
        Me.NalJAvU5hkaP07v49gAk.Text = "File saved!"
        Me.h7Nk9bsjHamO9g59vbYTkA = True
    End If
End Sub
[/vbnet]
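As a rough illustration of why renaming has to be applied consistently across the whole code base, here is a minimal Python sketch that rewrites a source snippet with hash-derived names. The helper names (`obfuscated_name`, `rename_identifiers`), the salt and the naming scheme are my own assumptions, not what any particular obfuscator does; real tools rename symbols in the assembly metadata rather than in source text.

```python
import hashlib
import re
from typing import Iterable

def obfuscated_name(identifier: str, salt: str = "demo-salt") -> str:
    """Derive a stable, meaningless name from an identifier (hypothetical scheme)."""
    digest = hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()
    # Prefix with a letter so the result is still a valid identifier.
    return "c" + digest[:20]

def rename_identifiers(source: str, identifiers: Iterable[str]) -> str:
    """Replace every whole-word occurrence of each identifier with its new name."""
    mapping = {name: obfuscated_name(name) for name in identifiers}
    alternatives = "|".join(re.escape(name) for name in mapping)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    # The same mapping is used everywhere, so declarations and uses stay in sync.
    return pattern.sub(lambda match: mapping[match.group(1)], source)

source = '''Public Sub checkLicence()
Me.isRegistered = False
Me.LblStat.Text = "Unregistered Crack Me"
End Sub'''

print(rename_identifiers(source, ["checkLicence", "isRegistered", "LblStat"]))
```

Note that this naive text replacement would also rewrite an identifier that happens to appear inside a string literal, which is one reason real tools do not work on raw text.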
Unprintable characters are also used so that identifiers are not even readable. Some techniques also encrypt strings, embedding the encryption key and the decryption function within the code. Using existing tools, we can combine a wide variety of techniques to produce heavily obfuscated code. In this case, the code above may look like this:
[vbnet]
Public Sub #~3a()
    If Not File.Exists((Application.StartupPath & "¡ÂÓÒÈÏàä")) Then
        Interaction.MsgBox("¬×ÄÇÍÙÆÐÊÕÈØäÒÏÉß ÈÏàäØÖÈ¡ÂÓÒ", MsgBoxStyle.Critical, " ÙÆÈ ÒÏÉß¡ÂÓÒ Ïàä ")
        Me.#r~ = False
        Me.#~1.ForeColor = Color.Red
        Me.#~1.Text = "¡ÂÓÒ Ïàä ß ÈÏàäØÖ"
        Me.#~F.Enabled = False
        Me.#~1q.Enabled = False
    Else
        Me.#~1.ForeColor = Color.Green
        Me.#~1.Text = "ØäÒÂÓàä ßØÖÈ"
        Me.#r~ = True
    End If
End Sub
[/vbnet]
The way code is obfuscated may differ from one coder to another and from one tool to another, as seen in the previous samples. Renaming classes, methods, parameters, namespaces, fields, etc. makes deducing the purpose or the result of a given piece of code very difficult. In some cases we can also take advantage of overloading to give the same name to many methods.
Encrypting strings and decrypting them at runtime makes them unreadable at code level. To make life even more painful for the reverse engineer, control flow alteration can transform short and structured for, while, if … statements into a lot of goto statements, which results in a big mess of code. Anyway, the purpose of this article is not to show how to protect (obfuscate) code, but how to deal with an obfuscated .NET application when reverse engineering.
There is no generic method to deal with an obfuscated assembly. Sure, we can de-obfuscate assemblies protected with some commercial tools, which leads to clear and understandable code, but if an assembly is manually obfuscated, or obfuscated with some kind of private obfuscator, there is theoretically no way to recover its clear code.
For this reason, I created a little Reverse Me and obfuscated it with a well-known commercial obfuscator to explain how we can reverse engineer an obfuscated assembly. Then we will see how easy it is to de-obfuscate assemblies that are protected with many commonly used obfuscators.
The options used to produce the obfuscated version of our target are: a symbol renaming scheme that uses hashes to produce names like "c263cas5636dbcf53e3", encrypting strings, encrypting constant values and arrays, encrypting method bodies, protecting against ILDASM, protecting, encrypting and compressing resources, etc. As you can see, it is really hard to do all of these things manually. And guess what: once in front of the resulting mess, I wondered whether I would be able to reverse engineer this little Reverse Me without de-obfuscating it!
Our target is a simple application that asks for a password at startup; if the correct password is provided, it shows the main application form; otherwise it closes without saying a thing:
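Before moving on, here is a rough idea of what control flow alteration does to a simple loop. Python has no goto, so in this sketch a numbered `state` variable stands in for the jump labels; the function names and state numbers are arbitrary choices of mine, and real obfuscators apply this kind of transformation at IL level.

```python
def sum_to_n(n: int) -> int:
    """Structured original: a plain loop that is easy to read."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_to_n_flattened(n: int) -> int:
    """Same logic rewritten as a dispatcher over numbered states ("goto" targets)."""
    state = 3
    total = i = 0
    while True:
        if state == 3:       # entry point: initialise the variables
            total, i = 0, 1
            state = 7
        elif state == 7:     # loop test: continue or leave
            state = 1 if i <= n else 9
        elif state == 1:     # loop body
            total += i
            i += 1
            state = 7
        elif state == 9:     # exit
            return total

print(sum_to_n(10), sum_to_n_flattened(10))
```

Both functions compute the same result, but in the second one the order of the blocks no longer tells the reader anything about the order of execution.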
Figure 1 Authentication Form
Well, let's load our obfuscated reverse me in Reflector and see what we get:
All we can see is that there are two forms called Form1 and Form2; all the rest is an unintelligible sequence of numbers and letters, which is a sure sign of obfuscation. In normal situations, method names and control names would be shown as they were typed by the coder, but instead we get this:
At this point we can start looking into method bodies:
But dealing with such a mess is always about "guessing" how we, as developers, would do things. Running the application to get a good understanding of its behavior is very important. Our target is a simple two-form application: the first form asks us for a password to gain access to the second. Things could be much more complicated with a real application, but I'll show you the basics of doing it.
Basically, we want to get rid of the authentication process (Figure 1), and this should make us aware of some interesting points. The first is that the application MUST contain something similar to "this.something.text", a construct that retrieves the password typed by the user. The second is the fact that the reverse me closes if the password typed in is not correct. So, we should wonder how "closing" an application is done in a .NET programming language.
Usually we talk about "Application.Exit()", "Environment.Exit()" or just the evil way, "End". One more thing about our target is the "important" fact that the application should show another form when given the correct password. How is this generally done when programming in a .NET language? This depends on how the programmer imagined it, but it's good to keep in mind usual methods like "SomeFormName.Show()", for example. All these kinds of things should be kept in mind when reviewing the obfuscated code of a target.
Now let's get back to our target and put this into practice. Everything has to be done manually. The most efficient way is to decompile the whole solution (the reverse me) into source code files, then search for the constructs that may interest us using any text editor. To accomplish this, we will use Reflector as explained below:
Right-click on the assembly, and then select "Export Source Code …"
In the window shown, choose an output directory then click on "Start":
Depending on which language you chose, you should get a ready-to-use Visual Studio project, as seen here:
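Once the project is exported, the hunt for the constructs listed earlier (reading a text box, closing the application, showing a form) can also be scripted instead of done by eye in a text editor. The following Python sketch is one possible way to do it; the regular expressions and the `scan_sources` helper are my own assumptions and will need adjusting to the decompiler's output style.

```python
import re
from pathlib import Path

# Constructs worth hunting for; these patterns are guesses, not an exhaustive list.
PATTERNS = {
    "reads a text box": re.compile(r"\.Text\b", re.IGNORECASE),
    "closes the application": re.compile(
        r"\b(Application|Environment)\.Exit\s*\(|^\s*End\s*$", re.IGNORECASE),
    "shows a form": re.compile(r"\.Show(Dialog)?\s*\(", re.IGNORECASE),
}

def scan_sources(root, extensions=(".vb", ".cs")):
    """Yield (file name, line number, label, line) for every matching line under root."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            for label, pattern in PATTERNS.items():
                if pattern.search(line):
                    yield path.name, number, label, line.strip()
```

Calling `list(scan_sources("path/to/exported/project"))` (the path is a placeholder) returns the candidate lines, and methods that collect several different labels are the natural place to start reading.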
You can explore the whole solution, but this will just make things harder for us. The folder shown in the picture above contains the code files of our two main forms. After exploring the first (Form1.vb in my case) I found nothing interesting, so I moved to the second file, Form2.vb:
Without even searching manually, the first things that caught my eye were some interesting constructs and instructions, not to mention the presence of "If…End If" and "Select Case" statements. Now let's get back to Reflector to explore this method:
Private Sub c22b9b9f9db35767fd5cec1339dc58624(ByVal c654d57623e59119a528090eeb2dd6380 As Object, ByVal c9354cf71bf985f587bbc167cd0ed7723 As EventArgs)
Even if the method body is obfuscated, we can guess what it may do:
How did I know that the last explained line shows Form1? Click on cebbfccbdfdbd2.c7e7a4074b4c8c3890c682eeea4ca5def.Show and see that this is a property declared as Form1:
If we pay enough attention to this obfuscated code, we can get a lot of information from it. The application declares "box2" as a TextBox, assigns its value to the variable "text", then uses this value to make a Boolean comparison. The rest is explained in the picture above.
The idea is to force our reverse me to accept any typed password and show us the second form instead of closing the whole program. In other words, we want to patch our reverse me. Let's switch to IL view to figure out how we can deal with this mess and properly modify our target. We will use Reflector and the Reflexil add-in. I'm not going to discuss how to install it, so please refer to "Demystifying dot NET reverse engineering – PART 3: Advanced Byte Patching" for details.
The most important parts of the code are marked. For more explanation about CIL, IL instructions, their functions and their actual byte representation, please refer to "Demystifying dot NET reverse engineering – PART 2: Introducing Byte Patching."
L_0036: bne.un.s L_0066: This instruction transfers control to a target instruction (short form) when two unsigned integer values or unordered float values are not equal. This basically means that if the password typed is not correct, execution jumps to the line labeled L_0066, which then runs System.Environment.Exit(). Looking closer, I realized I had spoken too soon: this is the single most important instruction in all of this code!
Anyway, we are not supposed to know the correct password, so let us think about how we can bypass the correct/incorrect password check. Instead of closing the application when the password is incorrect, we can reverse the test and force the program to show us the second form instead. The instruction that does this is beq.s, which "transfers control to a target instruction (short form) if two values are equal." With this change, the jump to the exit code is taken only when the two values are equal, so any incorrect password falls through to the code that shows the form.
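At byte level this change is tiny. In the CIL instruction set (ECMA-335), bne.un.s is encoded as the single byte 0x33 and beq.s as 0x2E, each followed by a one-byte relative target. The Python sketch below flips the opcode in a made-up method body; the `flip_branch` helper and the padding bytes are purely illustrative and do not come from the real target.

```python
# Single-byte CIL opcodes for the two short-form branches (ECMA-335).
BNE_UN_S = 0x33
BEQ_S = 0x2E

def flip_branch(body: bytes, offset: int) -> bytes:
    """Return a copy of a method body with bne.un.s at `offset` replaced by beq.s."""
    if body[offset] != BNE_UN_S:
        raise ValueError("expected bne.un.s (0x33) at offset 0x%04x" % offset)
    patched = bytearray(body)
    patched[offset] = BEQ_S  # only the opcode changes; the 1-byte target stays as is
    return bytes(patched)

# Hypothetical method body: 0x36 bytes of nop (0x00), then the branch at L_0036.
# The operand is relative to the next instruction: 0x66 - 0x38 = 0x2E.
body = bytes([0x00] * 0x36 + [BNE_UN_S, 0x66 - 0x38])
patched = flip_branch(body, 0x36)
print(body[0x36:0x38].hex(), "->", patched[0x36:0x38].hex())  # prints "332e -> 2e2e"
```

The branch target is untouched, which is why the patch does not shift any other instruction in the method.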
To simply change the condition from false to true, we will use Reflexil as shown below:
Right click on the instruction then select Edit:
Change the old instruction to the new one then click Update. Now save our modification to test the patched version of our reverse me:
This produces a patched version of our target:
Now run this modified version of our reverse me. If you supply an incorrect password (or even leave it blank), it will show you the second form as expected:
Hopefully you learned something from this paper. Before ending, you should know that Reflexil can find which obfuscator I used and produce a clean assembly which we can reverse easily; just try "Obfuscator Search…":
And this will work fine with many obfuscators…
Sources
- Reverse ME
- .NET reverse engineering - Infosec Resources
- Reflexil