Subterranean IL: Callvirt and virtual methods

Next up is a look at the details of callvirt and what happens when you call a virtual method. However, in order to do that, we first need to understand some of the low-level CLR data structures, and what exactly an object instance is.

Object instances

Object instances are actually surprisingly small and lightweight things – they consist of a sync block index used for locking on the object, a pointer to an internal MethodTable structure corresponding to the type that this object is an instance of, and the instance fields. That’s it. Everything else – method implementations, interfaces, supertype information – is part of the MethodTable structure for the type (this is the structure pointed to by Type.TypeHandle). And, most importantly, the MethodTable structure stores all the method and interface information used for dynamic method dispatch through callvirt.
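One way to see that this per-type information lives outside the instance is to compare type handles from C#. This is only a minimal sketch: Widget and TypeHandleDemo are hypothetical names, and it assumes that the IntPtr returned by RuntimeTypeHandle.Value identifies the per-type structure, as described above.

```csharp
using System;

// Hypothetical type used only for this illustration.
class Widget { public int Value; }

static class TypeHandleDemo
{
    // True when both objects report the same type handle,
    // i.e. they refer to the same per-type structure.
    public static bool ShareTypeHandle(object a, object b) =>
        a.GetType().TypeHandle.Value == b.GetType().TypeHandle.Value;

    static void Main()
    {
        var first = new Widget { Value = 1 };
        var second = new Widget { Value = 2 };

        // Two instances with different field values, one shared type handle.
        Console.WriteLine(ShareTypeHandle(first, second));      // True
        // Instances of different types report different handles.
        Console.WriteLine(ShareTypeHandle(first, "a string"));  // False
    }
}
```

The instances differ only in their field data; everything type-related is reached through the shared handle.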

MethodTable

Within each MethodTable structure is a list of all the methods on the type, including inherited methods, in inheritance order. The first entries are always the four virtual methods on System.Object (ToString, Equals, GetHashCode and Finalize), followed by the virtual methods on the next supertype, and so on, finishing off with virtual and non-virtual methods declared on the type itself. Each entry in the table points to its IL implementation, which can either be its own overriding implementation or the same implementation as its supertype. For example, the following class definitions:

will result in MethodTable method lists looking something like this:

This means that to callvirt ClassA.Method1 on an object, the CLR simply has to follow the method table pointer in the object instance, go to the fifth method table entry (the first one after the four System.Object methods), and execute the method pointed to by that entry. This will execute the corresponding implementation for the run-time type of the object.
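The dispatch described above can be observed from C#. The following is a sketch under stated assumptions: only the names ClassA and Method1 come from the text, while the class bodies, ClassB, Method2 and DispatchDemo are hypothetical stand-ins rather than the original definitions.

```csharp
using System;

// Hypothetical class bodies; only the names ClassA and Method1 appear in the text.
class ClassA
{
    public virtual string Method1() => "ClassA.Method1";
    public virtual string Method2() => "ClassA.Method2";
}

class ClassB : ClassA
{
    // The override takes over the entry that ClassA introduced for Method1.
    public override string Method1() => "ClassB.Method1";
    // Method2 is not overridden, so ClassB shares ClassA's implementation.
}

static class DispatchDemo
{
    // The call site only knows the static type ClassA; the implementation
    // that runs is chosen from the run-time type of the instance.
    public static string CallMethod1(ClassA instance) => instance.Method1();
    public static string CallMethod2(ClassA instance) => instance.Method2();

    static void Main()
    {
        Console.WriteLine(CallMethod1(new ClassA())); // ClassA.Method1
        Console.WriteLine(CallMethod1(new ClassB())); // ClassB.Method1
        Console.WriteLine(CallMethod2(new ClassB())); // ClassA.Method2
    }
}
```

The same call site produces different results purely because each instance points at a different method table.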

Calling interface methods

When you implement an interface on a class, the implementing methods are all declared as virtual sealed, and therefore will have entries in the MethodTable vtable. Mapping an interface method onto its implementation requires a secondary structure called an IVMap for each type. This has an entry for each interface known by the system, and each entry points at the implementing methods in the type’s vtable. Needless to say, the details are quite complicated, change in each version of the CLR, and involve quite a bit of optimization, so I won’t go into them here. For more details you can read Sasha’s blog post about how the 2.0 CLR does it.
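Reflection gives a rough view of both points. In this sketch IShape, Square and InterfaceImplDemo are hypothetical names; it assumes the C# compiler's usual behaviour of marking an implicit interface implementation as virtual and final in metadata. Type.GetInterfaceMap is only the reflection-level view of the interface-to-method mapping, not the CLR's internal structure.

```csharp
using System;
using System.Reflection;

// Hypothetical interface and implementation for illustration.
interface IShape { double Area(); }

class Square : IShape
{
    public double Side = 2.0;

    // No 'virtual' keyword in the C# source.
    public double Area() => Side * Side;
}

static class InterfaceImplDemo
{
    public static MethodInfo AreaMethod() => typeof(Square).GetMethod("Area");

    // Name of the method on Square that implements IShape's first method.
    public static string ImplementationOfFirstInterfaceMethod()
    {
        InterfaceMapping map = typeof(Square).GetInterfaceMap(typeof(IShape));
        return map.TargetMethods[0].DeclaringType.Name + "." + map.TargetMethods[0].Name;
    }

    static void Main()
    {
        MethodInfo m = AreaMethod();
        // The implementing method is virtual (it needs a vtable entry)
        // and final (it cannot be overridden further).
        Console.WriteLine($"IsVirtual: {m.IsVirtual}, IsFinal: {m.IsFinal}");
        Console.WriteLine(ImplementationOfFirstInterfaceMethod()); // Square.Area
    }
}
```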

Whence callvirt?

It follows that in order to call a method virtually, the this pointer on the stack when you call the method has to be of type O (heap object reference), as that is the only stack entry type that is guaranteed to have a MethodTable pointer through which this dynamic dispatch mechanism can work. Managed pointers (&) that call can use don’t have the necessary information to perform the dynamic dispatch.

There is another difference between call and callvirt: callvirt always checks whether the this pointer is a null reference, and throws NullReferenceException if it is, before calling the method, whereas call won’t throw a NullReferenceException until the this pointer (loaded with ldarg.0) is actually dereferenced within the method. The reason for this should be clear – if this is null, there’s no MethodTable pointer to use to work out the correct run-time method to call!
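The up-front null check is easy to observe from C#. This sketch assumes the compiler emits callvirt for the call below, which is its usual choice for ordinary instance calls on reference types, even non-virtual ones. Greeter and NullCheckDemo are hypothetical names.

```csharp
using System;

// Hypothetical type for illustration.
class Greeter
{
    // Non-virtual, and the body never touches 'this'.
    public string Hello() => "hello";
}

static class NullCheckDemo
{
    // Returns null through a method so the receiver is not a literal at the call site.
    static Greeter NoGreeter() => null;

    public static bool ThrowsOnNullReceiver()
    {
        Greeter g = NoGreeter();
        try
        {
            // Assumed to compile to callvirt, which checks the receiver
            // for null before the method body runs.
            g.Hello();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    static void Main() => Console.WriteLine(ThrowsOnNullReceiver()); // True
}
```

Because Hello never uses this, the exception can only come from the check performed at the call site rather than from anything inside the method.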

Value types and callvirt

An unboxed value type is stored as the concatenation of all its fields, with no sync block index or MethodTable pointer. This isn’t a problem when using call as the CLR knows statically which method it should call, so it can simply jump directly to the method implementation. However, you can’t call a method on a value type using callvirt as there’s no MethodTable pointer it can use to figure out the correct method implementation. This is where value type boxing comes in.

To help demonstrate this, I’m going to use the types from my previous blog post, but with an additional interface:

If you recall, calling a method on a value type requires a this pointer of type &. What happens if we simply substitute callvirt for the call to IncrementableStruct::Increment?

As expected, PEVerify doesn’t like it:

What about if we call the Increment method through the IIncrementable interface instead?

Again, PEVerify fails it:

This is all as expected.

Using callvirt to call a method on a value type through an interface requires the box instruction. This takes a value type from the stack, generates an object instance complete with sync block index and MethodTable pointer, and copies the value type bytes into that object. However, the box instruction requires the actual value type to be on the stack, rather than the address of the value type, like so (annotated with the stack state):
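The copy made by box is also visible from C#. The following is a minimal sketch: the names IIncrementable, IncrementableStruct and Increment come from the text, but their definitions here (including the Value field) are assumptions, since the originals are in the previous post. BoxingDemo is a hypothetical helper.

```csharp
using System;

// Assumed definitions; only the type and method names come from the text.
interface IIncrementable { void Increment(); }

struct IncrementableStruct : IIncrementable
{
    public int Value;
    public void Increment() => Value++;
}

static class BoxingDemo
{
    public static (int original, int boxedCopy) IncrementThroughInterface()
    {
        var s = new IncrementableStruct();   // Value == 0
        IIncrementable boxed = s;            // box: copies s into a new heap object
        boxed.Increment();                   // interface call on the boxed copy
        // The original struct is untouched; only the boxed copy was incremented.
        return (s.Value, ((IncrementableStruct)boxed).Value);
    }

    static void Main()
    {
        var (original, boxedCopy) = IncrementThroughInterface();
        Console.WriteLine($"original: {original}, boxed copy: {boxedCopy}"); // original: 0, boxed copy: 1
    }
}
```

The interface call succeeds because the boxed object carries a MethodTable pointer, but it operates on a copy, which is why the original value is unchanged.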

So?

As we’ve seen, due to how dynamic method dispatch works on .NET, callvirt only works when the this pointer is of type O. Calling an interface method on a value type requires the value type to be boxed first, which provides all the necessary information for dynamic dispatch to work. As my next post will detail, this complicates things for generic methods, where the generic type could be a reference or value type at runtime.

Next time: finally, generic methods!