Object Overhead: The Hidden .NET Memory Allocation Cost

When developing a .NET application, one of the least visible sources of memory consumption is the overhead required by an object simply to exist. In applications that create a lot of small objects, this overhead can be a major or even a dominant factor in the total memory requirements for the application.

The amount of memory required to keep an object in memory varies depending on whether the application is running as a 32-bit or a 64-bit process, and it can be surprisingly high. On a 32-bit system, every object has an 8-byte header, which means that in most cases an object needs at least 3 fields before less than half of its memory is overhead. This isn’t the whole story, though: for an object to exist at all it has to be referenced from somewhere, which brings the memory needed for an object simply to exist up to 12 bytes.

On 64-bit systems, the situation is worse. The object header grows to 16 bytes, and 8 bytes are required for a reference, so every object needs 24 bytes simply to exist. If an application’s memory usage is dominated by many small objects, switching from 32 to 64 bits will make the situation much worse, not better! Out-of-memory conditions might be avoided, but the resource requirements might increase by up to a factor of 3.
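
One way to see this cost in practice is to measure the heap before and after allocating a large batch of tiny objects. The sketch below is purely illustrative (the class name is made up, and the figure it prints will vary with runtime version and bitness); it is a rough probe rather than a precise accounting of the header sizes.

```csharp
using System;

// A tiny class with a single 4-byte field, used only to make the
// per-object overhead visible.
class Tiny { public int Value; }

static class ObjectCostProbe
{
    static void Main()
    {
        const int count = 1000000;
        var keep = new Tiny[count];   // the array itself adds a small, fixed overhead

        long before = GC.GetTotalMemory(forceFullCollection: true);
        for (int i = 0; i < count; i++)
            keep[i] = new Tiny();
        long after = GC.GetTotalMemory(forceFullCollection: true);

        // Approximate bytes per object, including the reference held in the array.
        Console.WriteLine((after - before) / (double)count);
        GC.KeepAlive(keep);
    }
}
```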

This overhead imposes limits on the number of objects that can reasonably be created by a .NET application. The 12-byte cost of simply existing suggests that the maximum number of objects a 32-bit process can create is around 170 million or so (assuming roughly 2 GB of usable address space), but that many objects would be useless, as no data could be associated with them. Adding 4 bytes of data to each object decreases the limit to around 130 million, and 75% of the memory used by the application would then be overhead, which is very inefficient.
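
As a quick sanity check of those figures, here is the arithmetic behind them, with the 2 GB of usable address space stated explicitly as an assumption:

```csharp
using System;

static class LimitEstimate
{
    static void Main()
    {
        const long addressSpace = 2L * 1024 * 1024 * 1024; // assumed ~2 GB usable in a 32-bit process

        Console.WriteLine(addressSpace / 12); // bare objects (8-byte header + 4-byte reference): ~178 million
        Console.WriteLine(addressSpace / 16); // one 4-byte field each: ~134 million
        Console.WriteLine(12.0 / 16.0);       // fraction of each 16-byte allocation that is overhead: 0.75
    }
}
```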

A more practical way of looking at the number of objects that can reasonably be created is to think about the desirable level of efficiency. For a given proportion of memory given over to .NET’s per-object infrastructure, it’s possible to work out how many objects should exist and what their average size should be. To bring the overhead down to around 10%, for example, each object on a 32-bit system must store an average of around 80 bytes of data, for a total size of 88 bytes, roughly 10% of which is the 8-byte header. This suggests a more reasonable limit of 24 million objects. On 64-bit systems, objects need to be around twice as large to achieve the same efficiency.
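
The same kind of arithmetic produces the 10% figure; the short sketch below simply restates that calculation for a 32-bit process, again assuming roughly 2 GB of usable address space:

```csharp
using System;

static class EfficiencyCheck
{
    static void Main()
    {
        const long addressSpace = 2L * 1024 * 1024 * 1024; // assumed ~2 GB usable in a 32-bit process
        const int header = 8;            // per-object header on 32-bit
        const int data = 80;             // average payload per object suggested above
        const int total = header + data; // 88 bytes per object

        Console.WriteLine((double)header / total); // ~0.09, i.e. roughly 10% overhead
        Console.WriteLine(addressSpace / total);   // ~24 million objects
    }
}
```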

This sounds hard to achieve: 80 bytes of data usually means an object with somewhere between 10 and 20 fields, and that many fields is usually considered a bad ‘code smell’, indicating a class that is too complex and in need of refactoring. There’s a simple practical solution, though: design the application so that any bulk data storage it requires is done with arrays of value types – an array’s overhead is fixed no matter how large it is, and value types have no overhead at all as long as they are not boxed. Raw arrays or the classes found in System.Collections.Generic are both suitable for this purpose. The details of how these arrays are accessed can be hidden behind classes that provide an abstract interface to the data they represent, with instances of those classes kept in memory only for the parts of the data that are actually in use at any given point in time, as sketched below.
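
A minimal sketch of this pattern follows. The types and names (PointData, PointAccessor) are purely illustrative: the bulk data lives in a single List of structs, and a small accessor class is created only for the elements currently being worked on, so the per-object overhead is paid by a handful of accessors rather than by every element.

```csharp
using System;
using System.Collections.Generic;

// Bulk storage element: a value type, so a List<PointData> stores the data
// inline in one underlying array with a single object header.
struct PointData
{
    public double X, Y, Z;
}

// A lightweight view over one element of the store. Only create these for
// the parts of the data that are actually in use at any given time.
sealed class PointAccessor
{
    private readonly List<PointData> _store;
    private readonly int _index;

    public PointAccessor(List<PointData> store, int index)
    {
        _store = store;
        _index = index;
    }

    public double X
    {
        get { return _store[_index].X; }
        set { var p = _store[_index]; p.X = value; _store[_index] = p; }
    }
    // Y and Z would follow the same read-modify-write pattern.
}

static class Demo
{
    static void Main()
    {
        // A million points cost one List plus one backing array, instead of
        // a million separate objects each paying 12-24 bytes just to exist.
        var store = new List<PointData>(1000000);
        for (int i = 0; i < 1000000; i++)
            store.Add(new PointData { X = i, Y = 2 * i, Z = 3 * i });

        var view = new PointAccessor(store, 42);
        view.X = 99.5;
        Console.WriteLine(view.X);
    }
}
```

The read-modify-write in the setter is needed because the list’s indexer returns a copy of the struct; writing through that copy alone would not change the stored element.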

Unfortunately, this solution may introduce a new form of inefficiency: heap fragmentation. If arrays are created and destroyed frequently, or resized frequently (which in .NET amounts to the same thing), the pattern of allocations and garbage collections can leave large holes in memory that reduce the size of the largest array that can be allocated. This problem can cause an application to gradually run out of memory even though it has no memory leaks and its memory requirements are not otherwise increasing over time. I covered this issue and some possible ways to work around it in an earlier article.
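
The reason resizing amounts to the same thing as creating and destroying arrays is that a .NET array cannot be grown in place: Array.Resize allocates a new array, copies the elements across, and leaves the old one for the garbage collector. The snippet below illustrates only that point; it does not attempt to demonstrate fragmentation itself.

```csharp
using System;

static class ResizeDemo
{
    static void Main()
    {
        var data = new int[1000];
        var original = data;

        // Array.Resize hands back a brand-new array and abandons the old one.
        Array.Resize(ref data, 2000);

        Console.WriteLine(ReferenceEquals(original, data)); // False: the old array is now garbage
    }
}
```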

All objects created by the CLR are subject to this hidden memory cost, which can result in an application using many times more memory than expected. For bulk in-memory data storage, swarms of small objects can push the cost up to unacceptable levels, especially on 64-bit systems. Reducing the number of objects kept in memory at any one time, perhaps by increasing the number of fields in individual objects or by storing bulk data in large data structures, is an effective way to increase the capacity and efficiency of .NET applications.