{smartassembly} software for code obfuscation

{smartassembly} is a tool for ensuring that the source code of your commercial .NET application isn't visible to anyone with .NET Reflector. Matteo, who writes for us about encryption in .NET, asked if he could write a review of {smartassembly} for Simple-Talk. Because we like the product too, and Red Gate Software had recently taken over the product, we were happy to agree.

Introduction

.NET developers are generally aware that, when a .NET program is compiled, the output is no longer a native-code executable but a file, called an assembly, that uses CIL (Common Intermediate Language) to store the instructions that are processed when the assembly is executed inside the .NET Framework environment.

Without listing all the advantages of this type of environment, it is worth noting that a .NET assembly can be translated back into something very close to its original source by a process called reverse engineering. Tools such as .NET Reflector allow users to explore a .NET assembly, revealing the code written by the assembly’s developer. This gives useful information about how the assembly really works.

However, this brings its own problems since it has implications for the

  • protection of the intellectual property of the assembly’s creator and owner
  • secure storage of private data inside the assembly
  • potential insertion of malicious code by attackers, who could then easily distribute a modified and dangerous version of the assembly.

To avoid those problems, the assembly can be obfuscated. This article is about {smartassembly}, a software tool that permits the obfuscation of .NET assemblies. 

What is obfuscation  

We’ll introduce obfuscation techniques by mentioning some academic articles about the subject, to illustrate how important obfuscation is considered to be by computer scientists.

As stated by Kuzurin et al.:

“To obfuscate a program means to bring it into such a form which hampers as much as possible the extraction of some valuable information concerning algorithms, data structures, secrete keys, etc. from the text of a program. Obfuscation can be viewed as a special case of encryption. One minor difference is that plaintext (original program) needs not be efficiently extractable from cryptogram (obfuscated program).”

Notice that:

“… a cryptogram itself must be an executable code equivalent to the original program.”

The objective of obfuscation is therefore the “remix” of software code (and, with .NET Framework technology, metadata) in order to attempt to prevent an attacker from understanding how the code was developed without preventing the software from running.

Obfuscation works by transforming the original code into different code that preserves its functionality. Different techniques are used to achieve this goal. As observed by Dalla Preda et al.:

“The obfuscating transformations can be classified according to the kind of information they target. Layout transformations remove source code formatting and scramble identifiers. Control transformations affect the control flow of the program, while data transformations operate on the data structures used in the program.”

Examples of these types of transformations can be found here.

{smartassembly} obfuscation software

{smartassembly} is a well-known obfuscation tool for the .NET platform.

The principal features of {smartassembly} are:

  • Assembly security features (code obfuscation, anti-disassembler and anti-decompiler options, strong-name signing)
  • Reorganization of assemblies (dependency merging, compression and embedding)
  • Code-optimization features (pruning, memory management, automatic sealing of classes)

It is a Windows application with a very intuitive, descriptive and appealing user interface.

925-image002-630x472.jpg

In this article we will present the obfuscation capabilities of {smartassembly}. Information about its other features can be found at the Red Gate website.

Obfuscation

To explain how {smartassembly} works, we wrote a simple console application. In it, we put a method, a property and a class-scoped variable. The program outputs the message “This is a {smartassembly} demo application” in a console window and then waits for a key-press.

By browsing the compiled application, rather than the source file, with .NET Reflector, we obtained the following view, which reveals the original source:
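The article does not reproduce the source of the demo program, so the following is only a minimal sketch of what such a console application might look like. The property name ConsoleMessage is the one that appears in the Reflector output discussed later in the article; the namespace, the field name and the method name PrintMessage are assumptions made purely for illustration.

```csharp
using System;

namespace SmartAssemblyDemo // hypothetical namespace
{
    public class Program
    {
        // Class-scoped variable holding the text to display (hypothetical name).
        private static string message = "This is a {smartassembly} demo application";

        // Property; the compiler emits it as get_ConsoleMessage / set_ConsoleMessage.
        public static string ConsoleMessage
        {
            get { return message; }
            set { message = value; }
        }

        // Method that writes the message to the console (hypothetical name).
        public static void PrintMessage()
        {
            Console.WriteLine(ConsoleMessage);
        }

        public static void Main()
        {
            PrintMessage();
            Console.ReadKey(); // wait for a key-press before exiting
        }
    }
}
```

Even a program this small contains every kind of member that a renaming obfuscator has to deal with: a type, a field, a property and a method.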

925-image004-630x478.jpg

Now we’ll use {smartassembly} to obfuscate the program. We’ll start by choosing the Obfuscation feature that we want from the list of available features. 

The first such feature is called, by {smartassembly}, Obfuscation.

This will automatically rename the classes and methods of the assembly so as to prevent anyone from understanding the source when it is decompiled. {smartassembly} allows us to choose between printable ASCII characters and non-printable Unicode characters for the new names. For simplicity, we’ll choose printable ASCII characters. Notice that we can select which portion of the code to obfuscate: either the whole assembly or just part of it. After obfuscation, the code appears in .NET Reflector as in the following picture:

925-image006-630x478.jpg

All the classes and methods were renamed as we expected.

You’ll see from the illustration that it is now more difficult to understand the program’s code even though, with a small amount of work, it is still possible. But, as real applications have many thousands of lines of code, this would be very hard.

Notice that the property ConsoleMessage keeps its name. This is because a property is pure metadata: it simply links a name to its accessor methods. The “real” code of the property lives in its ‘get’ and ‘set’ accessors. In this example, the code for ConsoleMessage is contained in the get_ConsoleMessage and set_ConsoleMessage methods defined in the compiled assembly. This means that property declarations can be removed completely, instead of just being obfuscated. The application will still work as long as you don’t access the properties through Reflection. This type of removal can be done with the Pruning feature of {smartassembly}.

Finally, if you compare the left column of .NET Reflector with the previous one, you will see that {smartassembly} added two attributes to the assembly. The first (DoNotDistributeAttribute) states that the assembly cannot be distributed. This is because we were using a trial version of the program. The second (PoweredByAttribute) “signs” the assembly as having been obfuscated by {smartassembly}. At first sight this attribute seemed to be only a marketing ploy by the {smartassembly} producer. However, we were able to get information about it from the {smartassembly} development team. This is what they said to us:
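The relationship between a property and its accessor methods can be checked with Reflection. The sketch below uses a hypothetical Demo class with a ConsoleMessage property (not the article's actual demo code) and asks the runtime for the names of the two methods that back it.

```csharp
using System;
using System.Reflection;

// Hypothetical class used only to illustrate the point.
public class Demo
{
    public string ConsoleMessage { get; set; }
}

public static class PropertyAccessors
{
    // Returns the names of the get and set methods behind a public property.
    public static string[] AccessorNames(Type type, string propertyName)
    {
        PropertyInfo property = type.GetProperty(propertyName);
        return new[] { property.GetGetMethod().Name, property.GetSetMethod().Name };
    }

    public static void Main()
    {
        foreach (string name in AccessorNames(typeof(Demo), "ConsoleMessage"))
        {
            Console.WriteLine(name); // get_ConsoleMessage, then set_ConsoleMessage
        }
    }
}
```

Code such as typeof(Demo).GetProperty("ConsoleMessage") is exactly the kind of Reflection-based lookup that stops working once the property declaration has been pruned or renamed.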

…  it was added 2 years ago by request of Lutz Roeder  (the .NET Reflector creator). If you browse an assembly protected by {smartassembly} with .NET Reflector, the tool will crash when trying to understand the source code (the protection will also crash other decompilers).

When .NET Reflector crashes, it requests permission to send an email with information on the exception to Lutz Roeder, the developer of the tool.

The problem is, as many applications are protected by {smartassembly} and this tool is very popular, he received a lot of emails regarding false errors.

So, he asked us to add a specific custom attribute in order to be able to do the following:

  • When the tool crashes while trying to understand the source code, it looks for the “{smartassembly}.PoweredByAttribute” attribute.

  • If the attribute is not found, it proposes to send an exception report.

  • If the attribute is found, it shows the “This item is obfuscated and cannot be translated.” message.

And now, {smartassembly} itself uses this attribute internally to detect if an assembly was previously protected or if the information from a dependency must be added to the exception report.
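The decision described by the development team can be sketched in a few lines of Reflection code. This is not .NET Reflector's actual implementation, which the article does not show; the class and method names are assumptions, and the check simply matches the attribute's type name, PoweredByAttribute.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class ObfuscationMarker
{
    // Simple type name of the marker attribute mentioned in the article.
    public const string MarkerName = "PoweredByAttribute";

    // True if the assembly carries an attribute whose type is named PoweredByAttribute.
    public static bool HasMarker(Assembly assembly)
    {
        return assembly.GetCustomAttributes(false)
                       .Any(attribute => attribute.GetType().Name == MarkerName);
    }

    // Hypothetical handler: decides what to tell the user after a failed decompilation.
    public static string OnDecompileFailure(Assembly assembly)
    {
        return HasMarker(assembly)
            ? "This item is obfuscated and cannot be translated."
            : "Offer to send an exception report.";
    }

    public static void Main()
    {
        // This assembly has no marker attribute, so the report branch is taken.
        Console.WriteLine(OnDecompileFailure(typeof(ObfuscationMarker).Assembly));
    }
}
```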

Reference Dynamic Proxy

If you look at the previous image, you will see that the Console.WriteLine() and Console.ReadKey() calls were not obfuscated. Those methods are defined in the .NET Framework base classes, so they cannot be renamed by {smartassembly}. However, it is still possible to obfuscate calls to the .NET Framework base classes (and, in general, calls to external assemblies) by using the Reference Dynamic Proxy feature of {smartassembly}. This feature creates a run-time proxy to the referenced assembly. The assembly calls the proxy, which in turn calls the external assembly, and the calls to the proxy are obfuscated. Because the proxy exists only at run-time, its calls cannot be recovered simply by inspecting the assembly on disk.
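The general idea of routing an external call through something that is built only at run-time can be illustrated with a delegate created through Reflection. This is a simplified sketch of the concept, not the mechanism {smartassembly} actually generates; the RuntimeProxy class, the Bind helper and the field name are all assumptions.

```csharp
using System;
using System.Reflection;

public static class RuntimeProxy
{
    // Builds a delegate to a static method at run-time instead of calling it directly.
    public static T Bind<T>(Type owner, string methodName, params Type[] parameterTypes)
        where T : class
    {
        MethodInfo target = owner.GetMethod(methodName, parameterTypes);
        return (T)(object)Delegate.CreateDelegate(typeof(T), target);
    }

    // A meaningless name stands in for an obfuscated proxy member.
    private static readonly Action<string> a =
        Bind<Action<string>>(typeof(Console), "WriteLine", typeof(string));

    public static void Main()
    {
        // The call site mentions only "a", not Console.WriteLine.
        a("This is a {smartassembly} demo application");
    }
}
```

In this sketch the target name still appears as a plain string inside Bind, so it hides much less than a real obfuscator would; it only shows how a call site can be decoupled from the method it eventually reaches.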

After building the assembly with this feature enabled, we obtain something like this:

925-image008-630x446.jpg

The two calls to the Console class were obfuscated as we would expect.

String Encoding

From the previous output, we see that the text to be printed on the console appears inside the assembly in its original form. Suppose that we have strings that we want to keep secret. We can use the String Encoding feature of {smartassembly}. The encoding can be done in a simple way, by a fixed substitution of characters, or it can use the structure of the assembly itself to determine how the assembly’s literal strings are encoded and decoded. This second option greatly improves the security of the assembly, because any modification to the assembly’s structure or content makes it impossible to decode the strings to their original form.

After building the new assembly, the .NET Reflector output is as follows:
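To give a feel for the simpler of the two options, here is a minimal sketch of a fixed-substitution encoding: every byte of the string is XOR-ed with a constant key and the result is stored as Base64. The algorithm, the key value and the class name are assumptions chosen for illustration; the article does not describe the scheme {smartassembly} really uses, and a fixed XOR like this offers very little real protection.

```csharp
using System;
using System.Text;

public static class StringEncoding
{
    // Encodes a string by XOR-ing each UTF-8 byte with a fixed key.
    public static string Encode(string plain, byte key)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(plain);
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] ^= key;
        }
        return Convert.ToBase64String(bytes);
    }

    // Reverses Encode; XOR with the same key restores the original bytes.
    public static string Decode(string encoded, byte key)
    {
        byte[] bytes = Convert.FromBase64String(encoded);
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] ^= key;
        }
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        const byte key = 0x5A; // hypothetical fixed key
        string stored = Encode("This is a {smartassembly} demo application", key);
        Console.WriteLine(stored);              // what a decompiler would see
        Console.WriteLine(Decode(stored, key)); // what the program prints at run-time
    }
}
```

The stronger option described above would replace the constant key with a value derived from the assembly itself, so that a tampered assembly could no longer decode its own strings.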

925-image010-630x478.jpg

We can see that the original string has disappeared, replaced by a hexadecimal value. In the left column, we also see that an additional resource file has been added to the assembly. This indicates that {smartassembly} “moved” the static string into an assembly resource and encrypted it using the Resources Compression and Encryption features included in the product. The first time a managed resource is needed at runtime, {smartassembly} automatically decompresses and decrypts it. After that, the performance is the same as with non-compressed resources.

Control Flow Obfuscation

If the protection achieved so far is not enough, we can use the Control Flow Obfuscation feature of {smartassembly}. After building the assembly, we obtain something like this:

925-image012-630x478.jpg

In this case, .NET Reflector is not able to translate the CIL code back into C# code. We have obtained an additional level of obfuscation.
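Control transformations are easier to picture with a source-level analogy. The sketch below shows one common technique, control-flow flattening, in which a simple loop is rewritten as a state machine driven by a switch statement. It is only an illustration of the general idea: {smartassembly} works on the CIL rather than on C# source, and the article does not say which transformations it applies. The class and method names are assumptions.

```csharp
using System;

public static class ControlFlowDemo
{
    // Straightforward version: sums the integers from 1 to n.
    public static int Sum(int n)
    {
        int total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
        }
        return total;
    }

    // Flattened version: the same logic, but the loop structure is hidden
    // behind a dispatcher that jumps between numbered states.
    public static int SumFlattened(int n)
    {
        int state = 0, i = 0, total = 0;
        while (true)
        {
            switch (state)
            {
                case 0: i = 1; state = 2; break;              // initialization
                case 1: total += i; i++; state = 2; break;    // loop body
                case 2: state = (i <= n) ? 1 : 3; break;      // loop condition
                case 3: return total;                         // exit
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(Sum(10));          // 55
        Console.WriteLine(SumFlattened(10)); // 55
    }
}
```

Both methods compute the same result, but the second gives a reader far fewer structural clues about what the code is doing.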

Add Incorrect Metadata  

The last level of security provided by {smartassembly} consists of adding incorrect metadata to the assembly. As we know, .NET Framework assemblies are self-describing: alongside the CIL, each assembly stores metadata that describes its types and members, together with a manifest that describes the assembly itself. This metadata can be retrieved from assemblies with Reflection. Decompilers rely on this self-describing metadata to perform reverse engineering. By preventing decompilers from using the metadata, we can prevent the assembly’s decompilation.

We can add incorrect metadata to our assembly by choosing the ‘Other Protection’ tab of {smartassembly}. Once the assembly is built, .NET Reflector detects incorrect metadata and cannot browse the assembly structure even though the program still works.

The result looks like this:

925-image014-630x480.jpg

Conclusion

In this article we’ve touched on all the obfuscation features provided by {smartassembly} software. We have seen how obfuscation technology can be very useful for the protection of intellectual property, and for software security. The practice of obfuscating one’s own work is therefore worth considering.

However, you should bear in mind that obfuscated assemblies can, potentially, incur a performance overhead due to the transformations applied. This is partially mitigated by the fact that, when classes and methods are renamed, the new names are generally shorter than the original ones, so the assembly could end up being smaller. This is even more evident when pruning is used.

Finally, the transformed assembly needs to be re-tested to catch any exceptions that the transformations might have introduced. This will lead to extra cost due to the additional work that must be done. Fortunately, {smartassembly} can generate the pdb file (which, as we know, contains the information for the debugger), and this can be used inside the Visual Studio IDE to simplify the final tests.