I came across
ThreadLocal<T> while I was researching
ConcurrentBag. At first glance, it doesn’t make much sense. What are all those extra
Cn classes doing in there? Why is there a
GenericHolder<T,U,V,W> class? What’s going on? Dig a little deeper, though, and it turns out to be a rather ingenious solution to a tricky problem.
Declaring that a field is thread static, that is, that values assigned to and read from the field are specific to the thread doing the access, is quite easy in .NET:
[ThreadStatic]
private static string s_ThreadStaticField;
ThreadStaticAttribute is not a pseudo-custom attribute; it is compiled as a normal attribute, but the CLR has in-built magic, activated by that attribute, to redirect accesses to the field based on the executing thread’s identity.
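To see the redirection in action, here is a minimal sketch. The class and member names are made up for illustration. The calling thread assigns the field, a second thread reads and then overwrites its own copy, and neither sees the other's value.

```csharp
using System;
using System.Threading;

public static class ThreadStaticDemo
{
    // Each thread gets its own copy of this field.
    [ThreadStatic]
    private static string s_name;

    // Returns { value seen by the calling thread, value seen by a worker thread }.
    public static string[] Run()
    {
        s_name = "main";                    // assigns the calling thread's copy only

        string seenByWorker = "unset";
        var worker = new Thread(() =>
        {
            seenByWorker = s_name;          // the worker's copy was never assigned, so null
            s_name = "worker";              // does not touch the calling thread's copy
        });
        worker.Start();
        worker.Join();

        return new[] { s_name, seenByWorker };   // { "main", null }
    }
}
```

Note that a field initializer on a thread-static field only runs for the first thread to touch the type; every other thread starts with the default value.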
ThreadStaticAttribute provides a simple solution when you want to use a single field as thread-static. What if you want to create an arbitrary number of thread-static variables at runtime? Thread-static fields can only be declared, and are fixed, at compile time. Prior to .NET 4, you only had one solution: thread local data slots. This is a lesser-known function of
Thread that has existed since .NET 1.1:
LocalDataStoreSlot threadSlot = Thread.AllocateNamedDataSlot("slot1");
string value = "foo";
Thread.SetData(threadSlot, value);
string gettedValue = (string)Thread.GetData(threadSlot);
Each instance of
LocalDataStoreSlot mediates access to a single slot, and each slot acts like a separate thread-static field.
As you can see, using thread data slots is quite cumbersome. You need to keep track of
LocalDataStoreSlot objects, it’s not obvious how instances of
LocalDataStoreSlot correspond to individual thread-static variables, and it’s not type safe. It’s also relatively slow and complicated; the internal implementation consists of a whole series of classes hanging off a single thread-static field in
Thread itself, using various arrays, lists, and locks for synchronization.
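The lack of type safety is easy to demonstrate. In this small sketch (the class and method names are hypothetical), an int goes into a slot and a careless caller reads it back as a string. The compiler is perfectly happy, and the mistake only shows up at runtime.

```csharp
using System;
using System.Threading;

public static class DataSlotDemo
{
    // Returns true if reading the slot back with the wrong type fails at runtime.
    public static bool CastFailsAtRuntime()
    {
        LocalDataStoreSlot slot = Thread.AllocateDataSlot();   // unnamed slot
        Thread.SetData(slot, 42);                              // stored as a boxed object

        try
        {
            string s = (string)Thread.GetData(slot);           // compiles without complaint
            return false;
        }
        catch (InvalidCastException)
        {
            return true;                                       // the error surfaces here
        }
    }
}
```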
ThreadLocal<T> is far simpler and easier to use.
ThreadLocal provides an abstraction around thread-static fields that can be used just like any other class: it can replace a thread-static field, it can be stored in a
List<ThreadLocal<T>>, and you can create as many instances as you need at runtime. So what does it do? It can’t just have an instance-specific thread-static field, because thread-static fields have to be declared as
static, and so shared between all instances of the declaring type. There’s something else going on here.
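Before looking inside, here is what that flexibility looks like from the outside. This is a rough sketch using only the public ThreadLocal<T> API; the ThreadLocalDemo class and its Run method are invented for the example. It creates several thread-local variables in a loop, something no [ThreadStatic] field can do.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ThreadLocalDemo
{
    // Returns { value seen by the calling thread, value seen by a worker thread }
    // for the last of three thread-local variables created at runtime.
    public static int[] Run()
    {
        var locals = new List<ThreadLocal<int>>();
        try
        {
            for (int i = 0; i < 3; i++)
            {
                int seed = i * 10;
                // The factory runs at most once per thread, on first access to Value.
                locals.Add(new ThreadLocal<int>(() => seed));
            }

            locals[2].Value = 999;          // overrides the value for the calling thread only

            int seenByWorker = -1;
            var worker = new Thread(() => seenByWorker = locals[2].Value);
            worker.Start();
            worker.Join();

            return new[] { locals[2].Value, seenByWorker };   // { 999, 20 }
        }
        finally
        {
            foreach (var local in locals)
            {
                local.Dispose();            // release whatever each instance holds internally
            }
        }
    }
}
```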
The values stored in instances of
ThreadLocal<T> are stored in instantiations of the
GenericHolder<T,U,V,W> class, which contains a single
ThreadStatic field (
s_value) to store the actual value. This class is then instantiated with various combinations of the
Cn types for generic arguments.
In .NET, each separate instantiation of a generic type has its own static state. For example,
GenericHolder<int,C0,C1,C2> has a completely separate
s_value field to
GenericHolder<int,C1,C14,C1>. This feature is (ab)used by
ThreadLocal to emulate instance thread-static fields.
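The trick is easier to believe once you have seen it work. The following is a cut-down sketch, not the framework's code: Holder<T,U> is a hypothetical two-parameter stand-in for GenericHolder<T,U,V,W>, and C0 and C1 are empty marker classes defined just for this example.

```csharp
using System;
using System.Threading;

// Empty marker types; they exist only to produce distinct generic instantiations.
public sealed class C0 { }
public sealed class C1 { }

// Simplified stand-in for GenericHolder: one thread-static field per closed type.
public static class Holder<T, U>
{
    [ThreadStatic]
    public static T s_value;
}

public static class GenericStaticDemo
{
    // Returns { Holder<int,C0> on caller, Holder<int,C1> on caller, Holder<int,C0> on worker }.
    public static int[] Run()
    {
        Holder<int, C0>.s_value = 1;
        Holder<int, C1>.s_value = 2;        // a different closed type, so a different field

        int seenByWorker = -1;
        var worker = new Thread(() => seenByWorker = Holder<int, C0>.s_value);
        worker.Start();
        worker.Join();

        // The two instantiations are independent, and the worker sees default(int).
        return new[] { Holder<int, C0>.s_value, Holder<int, C1>.s_value, seenByWorker };   // { 1, 2, 0 }
    }
}
```

So each closed generic type behaves as an independent thread-static variable, which is exactly the property an instance of ThreadLocal needs.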
Every time an instance of
ThreadLocal is constructed, it is assigned a unique number from the static
s_currentTypeId field using
Interlocked.Increment, in the
FindNextTypeIndex method. The hexadecimal representation of that number then defines the specific
Cn types that instantiate the
GenericHolder class. That instantiation is therefore ‘owned’ by that instance of
ThreadLocal.
This gives each instance of
ThreadLocal its own
ThreadStatic field through a specific unique instantiation of the
GenericHolder class. Although
GenericHolder has four type variables, the first one is always instantiated to the type stored in the
ThreadLocal<T>. This gives three free type variables, each of which can be instantiated to one of 16 types (
C0 to C15). This puts an upper limit of 4096 (16^3) on the number of
ThreadLocal<T> instances that can be created for each value of T. That is, there can be a maximum of 4096 instances of
ThreadLocal<string>, and separately a maximum of 4096 instances for any other type argument.
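The arithmetic can be sketched in a few lines. This illustrates the idea rather than the framework's actual code: the TypeIndexSketch class, its method names, and the most-significant-digit-first ordering are all assumptions made for the example.

```csharp
using System;

public static class TypeIndexSketch
{
    public const int MarkerTypes = 16;      // C0 to C15
    public const int FreeSlots = 3;         // the U, V and W type parameters

    // Number of distinct instantiations per value type: 16 * 16 * 16 = 4096.
    public static int Capacity()
    {
        int capacity = 1;
        for (int i = 0; i < FreeSlots; i++)
        {
            capacity *= MarkerTypes;
        }
        return capacity;
    }

    // Splits an index into three base-16 digits, most significant first (assumed order).
    public static int[] ToDigits(int index)
    {
        if (index < 0 || index >= Capacity())
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }

        var digits = new int[FreeSlots];
        for (int i = FreeSlots - 1; i >= 0; i--)
        {
            digits[i] = index % MarkerTypes;
            index /= MarkerTypes;
        }
        return digits;
    }

    // Names the instantiation an index would select, e.g. for display or debugging.
    public static string Describe(Type valueType, int index)
    {
        int[] d = ToDigits(index);
        return $"GenericHolder<{valueType.Name},C{d[0]},C{d[1]},C{d[2]}>";
    }
}
```

Under these assumptions, index 0x1E1 maps to the digits 1, 14 and 1, so Describe(typeof(int), 0x1E1) gives "GenericHolder<Int32,C1,C14,C1>".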
However, there is an upper limit of 16384 enforced on the total number of
ThreadLocal instances in the AppDomain. This is to stop too much memory being used by thousands of instantiations of
GenericHolder<T,U,V,W>, as once a type is loaded into an AppDomain it cannot be unloaded, and will continue to sit there taking up memory until the AppDomain is unloaded. The total number of
ThreadLocal instances created is tracked internally.
So what happens when either limit is reached? Firstly, to try to stop that from happening, ThreadLocal recycles the
GenericHolder type indexes of
ThreadLocal instances that get disposed using the
s_availableIndices concurrent stack. This allows
GenericHolder instantiations of disposed
ThreadLocal instances to be re-used. But if there aren’t any available instantiations, then
ThreadLocal falls back on a standard thread local slot using
TLSHolder. This makes it very important to dispose of your
ThreadLocal instances if you’ll be using lots of them, so the type instantiations can be recycled.
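The allocate, recycle and fall back behaviour described above might look something like this. It is a simplified sketch, not the framework's code: IndexPool, Acquire and Release are hypothetical names, and a return value of -1 stands in for "use the slow thread local slot instead".

```csharp
using System.Collections.Concurrent;
using System.Threading;

public sealed class IndexPool
{
    private readonly int _capacity;
    private int _current = -1;              // last fresh index handed out
    private readonly ConcurrentStack<int> _available = new ConcurrentStack<int>();

    public IndexPool(int capacity)
    {
        _capacity = capacity;
    }

    // Returns a usable index, or -1 when the caller must fall back to a slower mechanism.
    public int Acquire()
    {
        int recycled;
        if (_available.TryPop(out recycled))
        {
            return recycled;                // reuse an index released by a disposed owner
        }

        if (Volatile.Read(ref _current) < _capacity - 1)
        {
            int next = Interlocked.Increment(ref _current);
            if (next < _capacity)
            {
                return next;                // a fresh index, unique across threads
            }
        }

        return -1;                          // pool exhausted
    }

    // Called when the owner is disposed, so the index can be handed out again.
    public void Release(int index)
    {
        if (index >= 0)
        {
            _available.Push(index);
        }
    }
}
```

With a capacity of 2, three calls to Acquire return 0, 1 and -1. After Release(0), the next Acquire returns 0 again, which is why disposing matters.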
The previous way of creating arbitrary thread-static variables, thread data slots, was slow, clunky, and hard to use. In comparison,
ThreadLocal can be used just like any other type, and each instance appears from the outside to be a non-static thread-static variable. It does this by using the CLR type system to assign each instance of
ThreadLocal its own instantiated type containing a thread-static field, and so delegating a lot of the bookkeeping that thread data slots had to do to the CLR type system itself! That’s a very clever use of the CLR type system.