The Wayback Machine - https://web.archive.org/web/20080420165109/http://www.devx.com:80/microsoftISV/Article/11351
Obfuscation: Cloaking your Code from Prying Eyes
Prevent customers from stealing your algorithms, and crackers from changing your code, with the obfuscator in VS.NET 2003. Let's take a look at some of the ingenious techniques it uses to mask your program's intent.

Semi-compiled languages such as Java and the Microsoft Intermediate Language (MSIL) are particularly easy to disassemble or reverse engineer. Unlike native code, the intermediate byte codes contain complete variable names, such that disassembly generates almost the exact source code of the original program. The only notable absence is the comments from the original source code. Everything else is there.
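To see how much naming information survives compilation, here is a minimal sketch in Java, whose bytecode the article groups with MSIL. It uses reflection, which reads the same class metadata a disassembler would. The `Payroll` class and its members are hypothetical stand-ins for proprietary code.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class standing in for proprietary code.
class Payroll {
    private double bonusRate = 0.05;

    private double calcBonus(double salary) {
        return salary * bonusRate;
    }
}

class NameLeakDemo {
    // Collects the field and method names still present in the compiled class.
    static List<String> declaredNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!f.isSynthetic()) names.add(f.getName());
        }
        for (Method m : type.getDeclaredMethods()) {
            if (!m.isSynthetic()) names.add(m.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Even private member names are recoverable from the compiled class.
        System.out.println(declaredNames(Payroll.class));
    }
}
```

Private members are no better hidden than public ones here, which is why renaming, rather than access modifiers, is the defense discussed below.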

For ISVs and other commercial developers who want to protect their intellectual property, this ease of disassembly poses a significant and well-known problem: Algorithms can be reconstructed and studied, and program code can be reconstituted and customized. (Even in-house, noncommercial applications are vulnerable to source-code access made possible by disassembly. For example, database passwords, whether stored on their own or embedded in SQL statements, are now easily accessible to users. Likewise, sites that use outside Web hosts are at risk if they upload their ASP.NET code, because staff at the hosting facility can reconstruct all the programs should they wish to.)

Moreover, the tools that hackers or even curious users might need to reverse engineer code are widely available. Microsoft offers its own MSIL disassembler, called ILDASM, at no cost. The Anakrino tool is an open-source disassembler for .NET (go to http://www.saurik.com/net/exemplar/), and various other companies offer equivalent tools on a commercial basis.

Protecting Your Code
The most effective way to protect your code from these forms of reverse engineering and snooping is to obfuscate it. ("Obfuscate" means "...to make opaque (so) as to be difficult to perceive or understand," according to the American Heritage Dictionary, 3rd Ed.) Tools today perform this trick by various means that primarily focus on making the variable names meaningless, encrypting strings and literals, and inserting misleading directives that render disassembled code uncompilable.

The upcoming release of Visual Studio (called VS.NET 2003 and code-named Everett) sports an integrated obfuscating tool that Microsoft suggests running as a final pass on .NET intermediate code. The obfuscator is the so-called "lite" version of a more robust obfuscating utility, Dotfuscator, sold by Preemptive Solutions, a Cleveland-based company that got its start obfuscating Java code. Dotfuscator uses a remarkable variety of techniques to make disassembly futile or, at least, very difficult.

Overload induction is Preemptive Solutions' name for its patented technique of changing variable names in the intermediate code. (Obfuscators never touch source code, nor even need to reference it.) It takes advantage of the fact that the same identifier name can be used for classes and methods with different signatures. And within different namespaces, variables can use the same name without colliding. Dotfuscator exploits these lexical features to rename as many items as possible to the letter 'A.' The company claims that on some code 33% of references can be renamed to A and another 10% to B. This transformation makes disassembled code extremely hard to understand. Consider the following example:

Disassembled code without obfuscation:


private void CalcPayroll(SpecialList employeeGroup) {
    while (employeeGroup.HasMore()) {
        employee = employeeGroup.GetNext(true);
        employee.updateSalary();
        DistributeCheck(employee);
    }
}

Same code with obfuscation:

private void a(a b) {
    while (b.a()) {
        a = b.a(true);
        a.a();
        a(a);
    }
}

It is clear that both snippets perform the same logic. However, it is extraordinarily difficult to determine what the second snippet is doing, much less which fields and methods exactly are being accessed.
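The renaming idea can be sketched as a greedy pass over method signatures: each method takes the first short name not already used with the same parameter list, so methods with different signatures end up sharing the name `a`. This is a simplified illustration in Java, not Preemptive Solutions' patented algorithm. The `OverloadRenamer` class and the signature strings are assumptions made for the example, and it treats all four methods as members of one class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class OverloadRenamer {
    // Maps each original signature, e.g. "HasMore()", to its new one-letter name.
    // Sketch limitation: more than 26 methods with identical parameter lists
    // would exhaust the candidate names and be left unmapped.
    static Map<String, String> rename(List<String> signatures) {
        Map<String, String> result = new LinkedHashMap<>();
        Set<String> taken = new HashSet<>();
        for (String signature : signatures) {
            // Keep only the parameter list, e.g. "(boolean)".
            String params = signature.substring(signature.indexOf('('));
            for (char c = 'a'; c <= 'z'; c++) {
                // add() returns false when this name+params pair is already in use.
                if (taken.add(c + params)) {
                    result.put(signature, String.valueOf(c));
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> methods = Arrays.asList(
                "HasMore()", "GetNext(boolean)", "UpdateSalary()", "DistributeCheck(Employee)");
        // HasMore, GetNext and DistributeCheck all become "a";
        // UpdateSalary collides with HasMore's empty parameter list and becomes "b".
        System.out.println(rename(methods));
    }
}
```

Because the new names are chosen only from the signatures, the pass needs no access to source code, which matches the article's point that obfuscators work on the intermediate code alone.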

This renaming feature can be configured so that if you're building a DLL, let us say, the APIs are untouched. Interestingly, this feature alone visibly shrinks code simply by the reduction of so many variable names to just one character.

String encryption gets around a security problem that exists even in native code: String literals are easy to extract from binaries. For example, running the UNIX strings utility on any binary will generate a list of all ASCII literals in the file. In its most benign form, this list reveals only copyright information and whose libraries are included in the executable. However, if the program accesses databases, strings will reveal all the SQL commands. And if passwords are buried in the module, they are revealed as well.
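What the strings utility does can be approximated in a few lines: scan the file's bytes and report every run of printable ASCII above a minimum length. The sketch below is in Java. The `StringsScan` class, the four-character threshold, and the fake binary image are illustrative assumptions, not the actual utility's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class StringsScan {
    // Returns every run of at least minLength printable ASCII bytes.
    static List<String> extract(byte[] data, int minLength) {
        List<String> found = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        for (byte b : data) {
            if (b >= 0x20 && b <= 0x7E) {
                run.append((char) b);
            } else {
                // A non-printable byte ends the current run.
                if (run.length() >= minLength) found.add(run.toString());
                run.setLength(0);
            }
        }
        if (run.length() >= minLength) found.add(run.toString());
        return found;
    }

    public static void main(String[] args) {
        // Made-up "binary": control bytes surrounding two embedded literals.
        byte[] image = "\1\2SELECT * FROM payroll\0\7pw=s3cret\0\3ab\0"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(extract(image, 4));
    }
}
```

In this made-up image, both the SQL statement and the password come straight out, while the two-character run is ignored as noise.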

With intermediate code, there are additional dangers. By examining the references to a given string, a cracker can figure out where password-protected code begins, and then can patch the file to jump there. To solve the problem of literals as human-readable text, most obfuscators encrypt strings. A small runtime penalty is incurred when the string is accessed, due to the decryption overhead. Interestingly, native code is at a disadvantage here because to achieve the same effect, developers must encrypt and decrypt each string manually, whereas an obfuscator performs this operation automatically.
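The encrypt-at-build, decrypt-at-use pattern can be sketched with a toy XOR cipher in Java. This is only an illustration of the mechanism: the article does not describe Dotfuscator's actual scheme, the `StringCloak` class and its key are hypothetical, and a fixed XOR key offers no real cryptographic protection.

```java
import java.nio.charset.StandardCharsets;

class StringCloak {
    // Hypothetical key; a real tool would generate and conceal its own.
    private static final byte[] KEY = {0x5A, 0x13, 0x7C, 0x21};

    // XOR is its own inverse, so one routine both hides and reveals bytes.
    static byte[] xor(byte[] input) {
        byte[] out = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = (byte) (input[i] ^ KEY[i % KEY.length]);
        }
        return out;
    }

    // What the obfuscator would do once, at build time.
    static byte[] hide(String plain) {
        return xor(plain.getBytes(StandardCharsets.UTF_8));
    }

    // What the obfuscated program does each time it needs the literal.
    static String reveal(byte[] hidden) {
        return new String(xor(hidden), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = hide("SELECT salary FROM payroll");
        // Only the scrambled bytes would appear in the shipped binary.
        System.out.println(reveal(stored));
    }
}
```

The call to `reveal` on every access is the small runtime penalty the article mentions.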

Control-flow obfuscation is a technique designed to mislead disassemblers. It inserts goto statements that still perform the original sequence of instructions, but in a roundabout way that makes the logic flow hard to follow. Here is an example.

Disassembled intermediate code before control-flow obfuscation:

// Code Snippet copyright 2000, Microsoft Corp, from WordCount.cs
// sample app
public int CompareTo(Object o) {
  int n = occurrences - ((WordOccurrence)o).occurrences;
  if (n == 0) {
    n = String.Compare(word, ((WordOccurrence)o).word);
  }
  return (n);
}

Same code after control-flow obfuscation:

public virtual int a(object A_0) {
  int local0;
  int local1;

  local0 = this.a - (c) A_0.a;
  if (local0 != 0)
    goto i0;
  goto i1;
  while (true) {
    return local1;
    i0: local1 = local0;
  }
  i1: local0 = System.String.Compare(this.b, (c) A_0.b);
  goto i0;
}

As can be seen, the original test is inverted and followed by a pair of goto statements. At the goto destination, the original statement (in obfuscated form) is executed, then another goto statement returns control to the original branch in the logic flow. Notice the misleading while() loop, which never repeats and exists only to disguise the return statement. In this small snippet, after close comparison with the original, it's possible to figure out what's real and what's not. However, on a large routine without the benefit of the source code, these misdirecting interpositions make reconstruction hugely time-consuming. The idea here is to make the recovery of the original coding intent so demanding that hackers will move on to other, perhaps simpler, challenges.

This particular technique adds small amounts of code to the binaries and so creates some overhead for the obfuscated portions. If this is a problem, apply it only to the routines that need this extra level of protection.
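The same comparison can be scrambled in Java as a runnable sketch. Java has no goto statement, so this version swaps in a related technique: a switch-based dispatcher in which each case plays the role of a goto target. It is an illustration of the general idea rather than Dotfuscator's actual output, and the `WordOccurrence` class here is a hypothetical reconstruction of the sample's type.

```java
class WordOccurrence {
    final String word;
    final int occurrences;

    WordOccurrence(String word, int occurrences) {
        this.word = word;
        this.occurrences = occurrences;
    }

    // Straightforward version, mirroring the original CompareTo.
    int compareClear(WordOccurrence o) {
        int n = occurrences - o.occurrences;
        if (n == 0) {
            n = word.compareTo(o.word);
        }
        return n;
    }

    // Same logic routed through a dispatcher that hides the real order of steps.
    int compareScrambled(WordOccurrence o) {
        int local0 = 0;
        int local1 = 0;
        int next = 2; // execution actually starts at case 2
        while (true) {
            switch (next) {
                case 0: // plays the role of label i0
                    local1 = local0;
                    next = 3;
                    break;
                case 1: // plays the role of label i1
                    local0 = word.compareTo(o.word);
                    next = 0;
                    break;
                case 2: // entry block with the inverted test
                    local0 = occurrences - o.occurrences;
                    next = (local0 != 0) ? 0 : 1;
                    break;
                default: // reached only after case 0 has run
                    return local1;
            }
        }
    }

    public static void main(String[] args) {
        WordOccurrence a = new WordOccurrence("apple", 3);
        WordOccurrence b = new WordOccurrence("pear", 3);
        System.out.println(a.compareClear(b) == a.compareScrambled(b));
    }
}
```

Both methods return the same value for any pair of inputs, but a reader of the second must trace the `next` variable by hand to recover the order of operations.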

Getting your own obfuscator for .NET
As indicated previously, the upcoming VS.NET 2003 environment contains an obfuscator. It applies only the overload induction transform. For developers who are not using VS.NET, but still want access to this tool, it can be downloaded from Preemptive Solutions. To get the full complement of techniques described here, the complete professional version is available as a paid commercial product for $1495, with discounted pricing for two or more copies. Several other obfuscators for .NET MSIL are listed here.

Additional Resource

  • An interesting survey of all kinds of code-obfuscation techniques.

Andrew Binstock is the principal analyst at Pacific Data Works LLC and a frequent contributor to this site. Previously he was the director of PricewaterhouseCoopers' Global Technology Forecasts. His book Practical Algorithms for Developers, co-written with John Rex, is in its 12th printing at Addison-Wesley and in use at more than 30 university computer-science programs.

    Copyright 2008 Jupitermedia Corporation All Rights Reserved.