When I had the opportunity to spent some time during Red Gate’s recent “down tools” week on a project of my choice, the obvious project was an F# add-in for Reflector . To be honest, this was a bit of a misnomer as the amount of time in the designated week for coding was really less than three days, so it was always unlikely that very much progress would be made in such a small amount of time (and that certainly proved to be the case), but I did learn some things from the experiment.
Like lots of problems, one useful technique is to take examples, get them to work, and then generalise to get something that works across the board. Unfortunately, I didn’t have enough time to do the last stage.
The obvious first step is to take a few function definitions, starting with the obvious hello world, moving on to a non-recursive function and finishing with the ubiquitous recursive Fibonacci function.
let rec printMessage message =
let foo x =
(x + 1)
let rec fib x =
if (x >= 2) then (fib (x – 1) + fib (x – 2)) else 1
The major problem in decompiling these simple functions is that Reflector has an in-memory object model that is designed to support object-oriented languages. In particular it has a return statement that allows function bodies to finish early. I used some of the in-built functionality to take the IL and produce an in-memory object model for the language, but then needed to write a transformer to push the return statements to the top of the tree to make it easy to render the code into a functional language. This tree transform works in some scenarios, but not in others where we simply regenerate code that looks more like CPS style.
The next thing to get working was library level bindings of values where these values are calculated at runtime.
let x = [1 ; 2 ; 3 ; 4]
let y = List.map (fun x -> foo x) x
The way that this is translated into a set of classes for the underlying platform means that the code needs to follow references around, from the property exposing the calculated value to the class in which the code for generating the value is embedded.
One of the strongest selling points of functional languages is the algebraic datatypes, which allow definitions via standard mathematical-style inductive definitions across the union cases.
type Foo =
| Something of int
type ‘a Foo2 =
| Something2 of ‘a
Such a definition is compiled into a number of classes for the cases of the union, which all inherit from a class representing the type itself. It wasn’t too hard to get such a de-compilation happening in the cases I tried.
What did I learn from this?
Firstly, that there are various bits of functionality inside Reflector that it would be useful for us to allow add-in writers to access. In particular, there are various implementations of the Visitor pattern which implement algorithms such as calculating the number of references for particular variables, and which perform various substitutions which could be more generally useful to add-in writers. I hope to do something about this at some point in the future.
Secondly, when you transform a functional language into something that runs on top of an object-based platform, you lose some fidelity in the representation. The F# compiler leaves attributes in place so that tools can tell which classes represent classes from the source program and which are there for purposes of the implementation, allowing the decompiler to regenerate these constructs again. However, decompilation technology is a long way from being able to take unannotated IL and transform it into a program in a different language. For a simple function definition, like Fibonacci, I could write a simple static function and have it come out in F# as the same function, but it would be practically impossible to take a mass of class definitions and have a decompiler translate it automatically into an F# algebraic data type.
What have we got out of this?
Some data on the feasibility of implementing an F# decompiler inside Reflector, though it’s hard at the moment to say how long this would take to do. The work we did is included the 6.5 EAP for Reflector that you can get from the EAP forum.
All things considered though, it was a useful way to gain more familiarity with the process of writing an add-in and understand difficulties other add-in authors might experience.