Next up is a look at the details of callvirt and what happens when you call a virtual method. However, in order to do that, we first need to understand some of the low-level CLR data structures, and what exactly an object instance is.
Object instances
Object instances are actually surprisingly small and lightweight things. They consist of a sync block index used for locking on the object, a pointer to an internal MethodTable structure corresponding to the type that this object is an instance of, and the instance fields. That’s it. Everything else (method implementations, interfaces, supertype information) is part of the MethodTable structure for the type (this is the structure pointed to by Type.TypeHandle). And, most importantly, within the MethodTable structure is stored all the method and interface information used for dynamic method dispatch through callvirt.
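The split between per-instance data and per-type data can be observed from C#, at least indirectly. The following is a minimal sketch, not a view of the real internal layout: Widget and TypeHandleDemo are names made up for illustration, and the only thing compared is the value exposed by Type.TypeHandle.

```csharp
using System;

// Hypothetical type used only for this illustration.
public class Widget
{
    public int Id;
}

public static class TypeHandleDemo
{
    // True when both objects report the same per-type handle value.
    public static bool SharesTypeHandle(object a, object b)
    {
        return a.GetType().TypeHandle.Value == b.GetType().TypeHandle.Value;
    }

    public static void Main()
    {
        var first = new Widget { Id = 1 };
        var second = new Widget { Id = 2 };

        // Two instances with different field values share one type handle.
        Console.WriteLine(SharesTypeHandle(first, second));     // True

        // An instance of a different type reports a different handle.
        Console.WriteLine(SharesTypeHandle(first, "a string")); // False

        // The handle reached from an instance matches the one on typeof(Widget).
        Console.WriteLine(first.GetType().TypeHandle.Value
                          == typeof(Widget).TypeHandle.Value);  // True
    }
}
```

The instances differ only in their fields; everything type-related is reached through the one shared handle.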
MethodTable
Within each MethodTable structure is a list of all the methods on the type, including inherited methods, in inheritance order. The first entries are always the four virtual methods on System.Object (ToString, Equals, GetHashCode and Finalize), followed by the virtual methods on the next supertype, and so on, finishing off with virtual and non-virtual methods declared on the type itself. Each entry in the table points to the method’s implementation, which can either be the type’s own overriding implementation or the same implementation as its supertype. For example, the following class definitions:
    public class ClassA {
        public virtual void Method1() { /* ... */ }
    }
    public class ClassB : ClassA {
        public virtual void Method2() { /* ... */ }
    }
will result in MethodTable method lists looking something like this:
    |-----------|   |-----------|   |-----------|
    |  Object   |   |  ClassA   |   |  ClassB   |
    |-----------|   |-----------|   |-----------|
    |ToString   |   |ToString   |   |ToString   |
    |Equals     |   |Equals     |   |Equals     |
    |GetHashCode|   |GetHashCode|   |GetHashCode|
    |Finalize   |   |Finalize   |   |Finalize   |
    |-----------|   |Method1    |   |Method1    |
                    |-----------|   |Method2    |
                                    |-----------|
This means that to callvirt ClassA.Method1 on an object the CLR simply has to follow the method table pointer in the object instance, go to the 5th method table entry, and execute the method pointed to by that entry. This will execute the corresponding implementation for the run-time type of the object.
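To make the slot lookup concrete, here is a toy model of it in C#. This is only a sketch of the idea, not how the CLR is implemented: ToyType, ToyObject and ToyDispatch are invented names, the slots are plain delegates, and ClassC is a hypothetical extra type (not one of the classes above) added to show an overriding entry.

```csharp
using System;

// Toy stand-in for a MethodTable: just a name and an ordered list of slots.
public sealed class ToyType
{
    public readonly string Name;
    public readonly Func<string>[] Slots;

    public ToyType(string name, Func<string>[] slots) { Name = name; Slots = slots; }

    // A derived type starts with a copy of its parent's slots, then appends its own.
    public ToyType Derive(string name, params Func<string>[] newSlots)
    {
        var slots = new Func<string>[Slots.Length + newSlots.Length];
        Slots.CopyTo(slots, 0);
        newSlots.CopyTo(slots, Slots.Length);
        return new ToyType(name, slots);
    }
}

// Toy stand-in for an object instance: only the type pointer is modelled.
public sealed class ToyObject
{
    public readonly ToyType Type;
    public ToyObject(ToyType type) { Type = type; }
}

public static class ToyDispatch
{
    // Zero-based index of the 5th entry, after the four System.Object methods.
    public const int Method1Slot = 4;

    public static readonly ToyType ObjectType = new ToyType("Object", new Func<string>[]
    {
        () => "Object.ToString", () => "Object.Equals",
        () => "Object.GetHashCode", () => "Object.Finalize",
    });
    public static readonly ToyType ClassA = ObjectType.Derive("ClassA", () => "ClassA.Method1");
    public static readonly ToyType ClassB = ClassA.Derive("ClassB", () => "ClassB.Method2");
    // Hypothetical subclass that overrides Method1 by replacing the inherited slot.
    public static readonly ToyType ClassC =
        Override(ClassA.Derive("ClassC"), Method1Slot, () => "ClassC.Method1");

    static ToyType Override(ToyType type, int slot, Func<string> impl)
    {
        type.Slots[slot] = impl;
        return type;
    }

    // "callvirt ClassA.Method1": follow the type pointer, index the slot, invoke it.
    public static string CallVirtMethod1(ToyObject obj)
    {
        return obj.Type.Slots[Method1Slot]();
    }

    public static void Main()
    {
        Console.WriteLine(CallVirtMethod1(new ToyObject(ClassA))); // ClassA.Method1
        Console.WriteLine(CallVirtMethod1(new ToyObject(ClassB))); // ClassA.Method1 (inherited)
        Console.WriteLine(CallVirtMethod1(new ToyObject(ClassC))); // ClassC.Method1 (overridden)
    }
}
```

The caller always uses the same slot index; which implementation runs depends only on the table the object points at.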
Calling interface methods
When you implement an interface on a class, the implementing methods are all declared as virtual sealed, and therefore will have entries in the MethodTable vtable. Mapping an interface method onto its implementation requires a secondary structure called an IVMap for each type. This has an entry for each interface known by the system, and each entry points at the implementing methods in the type’s vtable. Needless to say, the details are quite complicated, change in each version of the CLR, and involve quite a bit of optimization, so I won’t go into them here. For more details you can read Sasha’s blog post about how the 2.0 CLR does it.
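The virtual sealed part can be checked with reflection. The sketch below is only an illustration: InterfaceImplDemo and the extra Reset method are made up for the comparison. It reads MethodBase.IsVirtual and MethodBase.IsFinal for a method that implements an interface and for one that doesn’t.

```csharp
using System;
using System.Reflection;

public interface IIncrementable
{
    void Increment(int incrementBy);
}

public class IncrementableClass : IIncrementable
{
    public int Value;

    // Implements the interface method; not marked virtual in the C# source.
    public void Increment(int incrementBy) { Value += incrementBy; }

    // Hypothetical extra method that implements nothing, for comparison.
    public void Reset() { Value = 0; }
}

public static class InterfaceImplDemo
{
    public static bool IsVirtualAndFinal(string methodName)
    {
        MethodInfo method = typeof(IncrementableClass).GetMethod(methodName);
        return method.IsVirtual && method.IsFinal;
    }

    public static void Main()
    {
        // The interface implementation is emitted as a sealed virtual method.
        Console.WriteLine(IsVirtualAndFinal("Increment")); // True

        // An ordinary instance method is neither virtual nor final.
        Console.WriteLine(IsVirtualAndFinal("Reset"));     // False
    }
}
```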
Whence callvirt?
It follows that in order to call a method virtually, the this pointer on the stack when you call the method has to be of type O (heap object reference), as that is the only stack entry type that is guaranteed to have a MethodTable pointer through which this dynamic dispatch mechanism can work. Managed pointers (&), which call can also use as the this pointer, don’t carry the necessary information to perform the dynamic dispatch.
There is another difference between call and callvirt: callvirt always checks if the this pointer is a null reference, and throws NullReferenceException if it is, before calling the method, whereas call won’t throw a NullReferenceException until the this pointer (loaded with ldarg.0) is actually dereferenced within the method. The reason for this should be clear: if this is null, there’s no MethodTable pointer to use to work out the correct run-time method to call!
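The difference in null handling can be seen by emitting both instructions at run time. This is a sketch under some assumptions: Greeter and NullCheckDemo are invented for the example, the two tiny methods are built with DynamicMethod from System.Reflection.Emit, and Greeter.Hello deliberately never touches this.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical class whose instance method never dereferences 'this'.
public class Greeter
{
    public string Hello() { return "hello"; }
}

public static class NullCheckDemo
{
    // Builds a method whose body is: ldarg.0; <call or callvirt> Greeter::Hello(); ret
    public static Func<Greeter, string> Build(OpCode callOpcode)
    {
        MethodInfo hello = typeof(Greeter).GetMethod("Hello");
        var method = new DynamicMethod(
            "CallHello", typeof(string), new[] { typeof(Greeter) },
            typeof(NullCheckDemo).Module);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(callOpcode, hello);
        il.Emit(OpCodes.Ret);

        return (Func<Greeter, string>)method.CreateDelegate(typeof(Func<Greeter, string>));
    }

    // Invokes the generated method with a null 'this' and reports what happened.
    public static string InvokeOnNull(OpCode callOpcode)
    {
        try
        {
            return Build(callOpcode)(null);
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
    }

    public static void Main()
    {
        // call: no up-front check, and Hello never dereferences 'this'.
        Console.WriteLine(InvokeOnNull(OpCodes.Call));     // hello

        // callvirt: the null check happens before the method is entered.
        Console.WriteLine(InvokeOnNull(OpCodes.Callvirt)); // NullReferenceException
    }
}
```

This is also why the C# compiler generally emits callvirt even for non-virtual instance methods on reference types: it gets the null check for free.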
Value types and callvirt
An unboxed value type is stored as the concatenation of all its fields, with no sync block index or MethodTable pointer. This isn’t a problem when using call as the CLR knows statically which method it should call, so it can simply jump directly to the method implementation. However, you can’t call a method on a value type using callvirt as there’s no MethodTable pointer it can use to figure out the correct method implementation. This is where value type boxing comes in.
To help demonstrate this, I’m going to use the types from my previous blog post, but with an additional interface:
    public interface IIncrementable {
        void Increment(int incrementBy);
    }

    public class IncrementableClass : IIncrementable {
        public int Value;

        public void Increment(int incrementBy) {
            Value += incrementBy;
        }
    }

    public struct IncrementableStruct : IIncrementable {
        public int Value;

        public void Increment(int incrementBy) {
            Value += incrementBy;
        }
    }
If you recall, calling a method on a value type requires a this pointer of type &. What happens if we simply substitute callvirt for the call to IncrementableStruct::Increment?
    ldarga 0
    ldc.i4.5
    callvirt instance void IncrementableStruct::Increment(int32)
    ret
As expected, PEVerify doesn’t like it:
    [offset 0x00000005] Callvirt on a value type method.
What about if we call the Increment method through the IIncrementable interface instead?
    ldarga 0
    ldc.i4.5
    callvirt instance void IIncrementable::Increment(int32)
    ret
Again, PEVerify fails it:
    [offset 0x00000005] [found address of value 'IncrementableStruct'] [expected ref 'IIncrementable'] Unexpected type on the stack.
    [offset 0x00000005] Call to base type of valuetype.
This is all as expected.
Using callvirt to call a method on a value type through an interface requires the box instruction. This takes a value type from the stack, generates an object instance complete with sync block index and MethodTable pointer, and copies the value type bytes into that object. However, the box instruction requires the actual value type to be on the stack, rather than the address of the value type, like so (annotated with the stack state):
    .method public static void CallIncrement(
            valuetype IncrementableStruct obj) {
        ldarg.0                     // IncrementableStruct
        box IncrementableStruct     // O[IncrementableStruct]
        ldc.i4.5                    // O[IncrementableStruct],int32
        callvirt instance void IIncrementable::Increment(int32)
        ret
    }
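The same box-then-callvirt sequence is what you get in C# when a struct is converted to an interface type. The sketch below (BoxingDemo is a made-up name) shows the visible consequence of the copy: the call through the interface changes the boxed object, not the original value.

```csharp
using System;

public interface IIncrementable
{
    void Increment(int incrementBy);
}

public struct IncrementableStruct : IIncrementable
{
    public int Value;

    public void Increment(int incrementBy) { Value += incrementBy; }
}

public static class BoxingDemo
{
    // Returns { original.Value, boxed copy's Value } after incrementing via the interface.
    public static int[] IncrementViaInterface(int incrementBy)
    {
        var original = new IncrementableStruct();

        // Converting to the interface type boxes: the fields are copied into a new heap object.
        IIncrementable boxed = original;

        // Dispatched through the boxed object's method table pointer.
        boxed.Increment(incrementBy);

        // Unboxing copies the fields back out of the heap object.
        var unboxed = (IncrementableStruct)boxed;
        return new[] { original.Value, unboxed.Value };
    }

    public static void Main()
    {
        int[] values = IncrementViaInterface(5);
        Console.WriteLine(values[0]); // 0: the original struct is untouched
        Console.WriteLine(values[1]); // 5: only the boxed copy was incremented
    }
}
```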
So?
As we’ve seen, due to how dynamic method dispatch works on .NET, callvirt only works when the this pointer is of type O. To call an interface method on a value type requires the value type to be boxed first, which provides all the necessary information for dynamic dispatch to work. As my next post will detail, this complicates things for generic methods, where the generic type could be a reference or value type at runtime.
Next time: finally, generic methods!