Disassembler Mechanized Part 2: Generating C# and MSIL code
Introduction
In the previous article, we covered the essential configuration: importing the external DLLs into the solution and installing the NuGet packages. As stated earlier, building a custom disassembler involves several layers of development, and we have already covered designing the user interface, retrieving assembly origin information, and decompiling assembly members. We now continue by explaining how to obtain the disassembled code in C# and MSIL.
The .NET CLR supports several programming languages, such as Visual C#, Visual Basic .NET, F# and managed C++. Components written in, for example, VB.NET or C++ can easily be reused from code written in another language, such as C#. Code in all of these high-level languages is compiled to a common Intermediate Language (IL), which runs on the Common Language Runtime (CLR). There are several reasons for disassembling code, ranging from interoperability to recovering lost source code or finding security vulnerabilities. Disassembling helps to audit the implementation of security-sensitive features such as authentication, authorization, and encryption. Disassembling .NET clients for security purposes also helps to ensure that the software performs the expected tasks without hidden features such as spyware or adware.
UI Design recap
How we design the user interface is secondary; our main goal is to implement the disassembling features. Before moving forward, however, we need to settle which controls the user interface contains, because particular form controls must respond to events. For instance, the treeview control, which displays all the modules and members of the assembly, shows the corresponding MSIL or C# source when a method is selected. Although this software uses numerous form controls, this paper covers only the ones we will encounter during coding.
Control
Control Name
Event
tabPage1= C# code (Decompiled)
tabPage2= IL code (Decompiled)
tabPage3= MsgBox Injector
tabPage4= exe Injector
Getting Started
The moment the user loads a .NET assembly, the treeview control is activated and lists the entire contents of the assembly in terms of modules and methods. The functionality proposed in this paper is to show the corresponding source code of the assembly in either C# or IL. The treeview control streamlines this job: when we select a particular method of the assembly, the equivalent source code (C#, MSIL) appears in the rich text box located in the tab control. Hence, we create an AfterSelect event handler for the treeview control and place the following code in it:
[c]
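private void tvMembers_AfterSelect(object sender, TreeViewEventArgs e)
{
    try
    {
        // Refresh both views for the newly selected node.
        populateCsharpCode();
        populateILCode();
    }
    catch
    {
        // Any failure (for example, a selected node that is not a method) ends up here.
        MessageBox.Show("Expand the Namespace");
        return;
    }
}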
We have placed two methods inside tvMembers_AfterSelect(): populateCsharpCode(), which displays the C# source code, and populateILCode(), which displays the MSIL code.
C# Code Disassembling
In this section, we describe how to produce C# source code for the method selected in the treeview control. We have seen source code being regenerated before in popular disassemblers such as ILSpy, ILPeek and Reflector. We are in fact implementing the same functionality in our software.
The very first line of populateCsharpCode() reads the assembly whose path is in the text box control into an implicitly typed (var) variable. Using this variable, we then enumerate the types residing in the assembly's main module in a loop, constructed as:
[c]
var assembly = AssemblyDefinition.ReadAssembly(txtURL.Text);
IEnumerator enumerator = assembly.MainModule.Types.GetEnumerator();
while (enumerator.MoveNext())
{
    … ..
}
Inside the loop, we define a TypeDefinition object, which holds the current type of the module and is then used to explore the methods inside that type:
[c]
TypeDefinition td = (TypeDefinition)enumerator.Current;
IEnumerator enumerator2 = td.Methods.GetEnumerator();
while (enumerator2.MoveNext())
{..}
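The two nested enumerator loops above can also be written with foreach. The following is a minimal sketch, assuming the Mono.Cecil package is referenced; the class name MemberLister and its ListMethods() method are hypothetical and not part of the article's project. It uses GetTypes() rather than Types so that nested types are visited as well.

```csharp
using System;
using System.Collections.Generic;
using Mono.Cecil;

// Hypothetical helper, for illustration only.
public static class MemberLister
{
    // Returns "TypeFullName::MethodName" for every method in the main module.
    public static List<string> ListMethods(string assemblyPath)
    {
        var names = new List<string>();
        AssemblyDefinition assembly = AssemblyDefinition.ReadAssembly(assemblyPath);

        // GetTypes() walks top-level and nested types; Types lists top-level types only.
        foreach (TypeDefinition type in assembly.MainModule.GetTypes())
        {
            foreach (MethodDefinition method in type.Methods)
            {
                names.Add(type.FullName + "::" + method.Name);
            }
        }
        return names;
    }
}
```

Each returned entry could be used to populate a treeview node, with the type as the parent and the method as the child.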
Now we obtain a reference to the current method of the type in a MethodDefinition object, and declare an AstBuilder reference; AstBuilder is the class that performs the actual decompilation.
[c]
MethodDefinition method_definition = (MethodDefinition)enumerator2.Current;
AstBuilder ast_Builder = null;
We go through the types in the assembly again using the foreach construct, and create an AstBuilder for the current type so that the selected method can then be passed to it and decompiled into C# source code:
[c]
foreach (var typeInAssembly in assembly.MainModule.Types)
{
ast_Builder = new AstBuilder(
new ICSharpCode.Decompiler.DecompilerContext (assembly.MainModule) { CurrentType = typeInAssembly });
In this implementation, only methods are shown in the contents portion. Hence, we also have to check whether the selected node is a method rather than some other member of the assembly:
[c]
foreach (var method in typeInAssembly.Methods)
if (method.Name == tvMembers.SelectedNode.Text)
{
….
}
}
}
Finally, inside the if block, we first clear the rich text box control and pass the selected method to the AddMethod() method of the AstBuilder class. Then we generate the output through a StringWriter object and show it in the rich text box control:
[c]
rtbCsharpCode.Clear();
ast_Builder.AddMethod(method);
StringWriter output = new StringWriter();
ast_Builder.GenerateCode(new PlainTextOutput(output));
string result = output.ToString();
rtbCsharpCode.AppendText(result);
output.Dispose();
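Note that the fragment above reaches output.Dispose() only when GenerateCode() returns normally. A minimal sketch of an exception-safe variant wraps the writer in a using block; the OutputCapture class and its Capture() method are hypothetical names introduced here for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class OutputCapture
{
    // Runs the supplied generator against a StringWriter and returns the text.
    // The using block disposes the writer even if the generator throws.
    public static string Capture(Action<TextWriter> generate)
    {
        using (var writer = new StringWriter())
        {
            generate(writer);
            return writer.ToString();
        }
    }
}
```

Under these assumptions, the decompilation call would become OutputCapture.Capture(w => ast_Builder.GenerateCode(new PlainTextOutput(w))), and the returned string would be appended to rtbCsharpCode.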
Having gone through the code line by line, we can now put together the complete C# decompilation code:
[c]
private void populateCsharpCode()
{
    AstBuilder ast_Builder = null;
    var assembly = AssemblyDefinition.ReadAssembly(txtURL.Text);
    IEnumerator enumerator = assembly.MainModule.Types.GetEnumerator();
    while (enumerator.MoveNext())
    {
        TypeDefinition td = (TypeDefinition)enumerator.Current;
        IEnumerator enumerator2 = td.Methods.GetEnumerator();
        while (enumerator2.MoveNext())
        {
            MethodDefinition method_definition = (MethodDefinition)enumerator2.Current;
            foreach (var typeInAssembly in assembly.MainModule.Types)
            {
                // One AstBuilder per type, with that type as the decompilation context.
                ast_Builder = new AstBuilder(new ICSharpCode.Decompiler.DecompilerContext(assembly.MainModule) { CurrentType = typeInAssembly });
                foreach (var method in typeInAssembly.Methods)
                    if (method.Name == tvMembers.SelectedNode.Text)
                    {
                        // Decompile the selected method and show its C# source.
                        rtbCsharpCode.Clear();
                        ast_Builder.AddMethod(method);
                        StringWriter output = new StringWriter();
                        ast_Builder.GenerateCode(new PlainTextOutput(output));
                        string result = output.ToString();
                        rtbCsharpCode.AppendText(result);
                        output.Dispose();
                    }
            }
        }
    }
}
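The listing above repeats the inner foreach once for every method of every type, so the same method may be decompiled many times. The following is a minimal single-pass sketch, assuming the AstBuilder-based version of ICSharpCode.Decompiler that this article uses, together with Mono.Cecil. The MethodDecompiler class, its DecompileMethod() method, and the choice to match on the type name as well as the method name are assumptions made for illustration, not the author's implementation.

```csharp
using System;
using System.IO;
using ICSharpCode.Decompiler;
using ICSharpCode.Decompiler.Ast;
using Mono.Cecil;

// Hypothetical helper, for illustration only.
public static class MethodDecompiler
{
    // Returns the C# source of the first method matching typeName and methodName,
    // or null when no such method exists in the main module.
    public static string DecompileMethod(string assemblyPath, string typeName, string methodName)
    {
        AssemblyDefinition assembly = AssemblyDefinition.ReadAssembly(assemblyPath);
        foreach (TypeDefinition type in assembly.MainModule.Types)
        {
            if (type.Name != typeName) continue;
            foreach (MethodDefinition method in type.Methods)
            {
                if (method.Name != methodName) continue;

                // Build the decompiler only once a match is found.
                var context = new DecompilerContext(assembly.MainModule) { CurrentType = type };
                var builder = new AstBuilder(context);
                builder.AddMethod(method);
                using (var output = new StringWriter())
                {
                    builder.GenerateCode(new PlainTextOutput(output));
                    return output.ToString();
                }
            }
        }
        return null;
    }
}
```

In the form, the result could be assigned to rtbCsharpCode.Text, passing txtURL.Text, tvMembers.SelectedNode.Parent.Text and tvMembers.SelectedNode.Text as arguments. Only the first overload with a given name is returned in this sketch.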
IL Code Disassembling
[c]
….
if (method_definition.Name == tvMembers.SelectedNode.Text && !method_definition.IsSetter && !method_definition.IsGetter)
{
    rtbILCode.Clear();
    ILProcessor cilProcess = method_definition.Body.GetILProcessor();
    foreach (Instruction ins in cilProcess.Body.Instructions)
    {
        rtbILCode.AppendText(ins + Environment.NewLine);
    }
}
……….
The C# decompilation shown earlier was fairly involved compared with producing IL code. In this section, we produce the MSIL code of the method selected in the current assembly module. The process is almost the same as in the previous section, but this time we do not need to call any AstBuilder method to disassemble the code. Instead, a couple of Mono.Cecil classes, such as ILProcessor and Instruction, are sufficient to produce the IL code of the selected method. In the foreach loop shown above, we simply enumerate all of the method's IL instructions and append them to the rich text box control. The following listing presents the complete IL disassembling code:
[c]
private void populateILCode()
{
    var assembly = AssemblyDefinition.ReadAssembly(txtURL.Text);
    IEnumerator enumerator = assembly.MainModule.Types.GetEnumerator();
    while (enumerator.MoveNext())
    {
        TypeDefinition td = (TypeDefinition)enumerator.Current;
        // Only look inside the type whose node is the parent of the selected method.
        if (td.Name == tvMembers.SelectedNode.Parent.Text)
        {
            IEnumerator enumerator2 = td.Methods.GetEnumerator();
            while (enumerator2.MoveNext())
            {
                MethodDefinition method_definition = (MethodDefinition)enumerator2.Current;
                // HasBody skips abstract and extern methods, which have no IL to list.
                if (method_definition.Name == tvMembers.SelectedNode.Text && !method_definition.IsSetter && !method_definition.IsGetter && method_definition.HasBody)
                {
                    rtbILCode.Clear();
                    ILProcessor cilProcess = method_definition.Body.GetILProcessor();
                    foreach (Instruction ins in cilProcess.Body.Instructions)
                    {
                        rtbILCode.AppendText(ins + Environment.NewLine);
                    }
                }
            }
        }
    }
}
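The same IL listing can be produced outside the form, which makes it easier to test. The following is a minimal sketch, assuming the Mono.Cecil package is referenced; the IlDumper class and its DumpMethod() method are hypothetical names. It reads the instructions directly from the method body and skips methods without a body.

```csharp
using System;
using System.Collections.Generic;
using Mono.Cecil;
using Mono.Cecil.Cil;

// Hypothetical helper, for illustration only.
public static class IlDumper
{
    // Returns one text line per IL instruction of the first matching method.
    // The list is empty when the method is missing or has no body.
    public static List<string> DumpMethod(string assemblyPath, string typeName, string methodName)
    {
        var lines = new List<string>();
        AssemblyDefinition assembly = AssemblyDefinition.ReadAssembly(assemblyPath);
        foreach (TypeDefinition type in assembly.MainModule.GetTypes())
        {
            if (type.Name != typeName) continue;
            foreach (MethodDefinition method in type.Methods)
            {
                // Abstract and extern methods carry no IL, so HasBody is false for them.
                if (method.Name != methodName || !method.HasBody) continue;

                foreach (Instruction ins in method.Body.Instructions)
                {
                    lines.Add(ins.ToString());
                }
                return lines; // stop at the first match
            }
        }
        return lines;
    }
}
```

In the form, the returned lines could be joined with Environment.NewLine and assigned to rtbILCode.Text.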
Testing
It is important to test both implementations described above. We first demonstrate C# source generation. To do so, we need an .exe or DLL file whose source code we will regenerate with this software. The following DumySoftware.exe application is a typical login authentication mechanism that blocks our way unless we enter the correct user name and password.
Hence, we open this application's .exe file in the Spyware Injector & Decompiler software. It displays the contents of the .exe file along with its origin information. The moment we expand the main module of this assembly in the treeview control and select any method, we find its C# source code in the tab control.
We can also view the MSIL code, just as we would with the ILDASM.exe utility. The process of MSIL disassembling is similar to C# decompilation: we first select the method in the treeview control and then switch to the IL code tab.
Final Note
This is the second part of "Disassembler Mechanized", which adds further features to the custom disassembler. The goal of this paper was to show how to build a disassembler that produces code from a .NET assembly in both C# and IL. We have walked through obtaining C# code step by step and in detail, along with generating MSIL code. In the next article, we shall present custom .exe and code injection tactics in the form of both a message box and spyware.