Why enumerator structs are a really bad idea

If you’ve ever poked around the .NET class libraries in Reflector, you probably would have noticed that the generic collection classes all have implementations of their IEnumerator as a struct rather than a class. As you will see, this design decision has some rather unfortunate side effects…

As is generally known in the .NET world, mutable structs are a Very Bad Idea; and there are several other blogs around explaining this (Eric Lippert’s blog post explains the problem quite well). In the BCL, the generic collection enumerators are all structs, but they need to be mutable as they have to keep track of where they are in the collection as they’re enumerating. This bit me quite hard when I was coding a wrapper around a LinkedList<int>.Enumerator. It boils down to this code:

The key line here is the MoveNext method. When I initially coded this, I thought that the call to m_Enumerator.MoveNext() would alter the enumerator state in the m_Enumerator class variable and so the enumeration would proceed in an orderly fashion through the collection. However, when I ran this code it went into an infinite loop – the m_Enumerator.MoveNext() call wasn’t actually changing the state in the m_Enumerator variable at all, and my code was looping forever on the first collection element. It was only after disassembling that method that I found out what was going on

The MoveNext method above results in the following IL:

Here, the important line is 0002 – m_Enumerator is accessed using the ldfld operator, which does the following:

Finds the value of a field in the object whose reference is currently on the evaluation stack.

So, what the MoveNext method is doing is the following:

The enumerator instance being modified by the call to MoveNext is the one stored in the CS$0$0001 variable on the stack, and not the one in the EnumeratorWrapper class instance. Hence why the state of m_Enumerator wasn’t getting updated.

Hmm, ok. Well, why is it doing this? If you have a read of Eric Lippert’s blog post about this issue, you’ll notice he quotes a few sections of the C# spec. In particular, 7.5.4:

…if the field is readonly and the reference occurs outside an instance constructor of the class in which the field is declared, then the result is a value, namely the value of the field I in the object referenced by E.

And my m_Enumerator field is readonly! Indeed, if I remove the readonly from the class variable then the problem goes away, and the code works as expected. The IL confirms this:

Notice on line 0002, instead of the ldfld we had before, we’ve got a ldflda, which does this:

Finds the address of a field in the object whose reference is currently on the evaluation stack.

Instead of loading the value, we’re loading the address of the m_Enumerator field. So now the call to MoveNext modifies the enumerator stored in the class rather than on the stack, and everything works as expected.

Previously, I had thought enumerator structs were an odd but interesting feature of the BCL that I had used in the past to do linked list slices. However, effects like this only underline how dangerous mutable structs are, and I’m at a loss to explain why the enumerators were implemented as structs in the first place. (interestingly, the SortedList<TKey, TValue> enumerator is a struct but is private, which makes it even more odd – the only way it can be accessed is as a boxed IEnumerator!).

I would love to hear people’s theories as to why the enumerators are implemented in such a fashion…

Note to self: never ever ever code a mutable struct.