C# via Java: Arrays

The one primitive type that hasn’t been covered is the array. An array contains a fixed number of items, and each item is a value of the array’s element type. The array elements are individually indexed starting from zero.

In comparison to the other primitive types, which are all value types, arrays are reference types. That means that a variable of an array type doesn’t hold the array itself where the variable is defined; instead, it holds a reference to the array. In code, array types are specified using square brackets after the type of the array elements. For example, an array of double values is a double[], an array of ushorts is ushort[], and an array of arrays of integers (each element of the top-level array is a reference to another array) is int[][].
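To make the reference semantics concrete, here is a minimal Java sketch. The class name ArrayReferenceDemo and the helper alias are made up for illustration; the point is only that assigning an array variable copies the reference, not the elements.

```java
class ArrayReferenceDemo {
    // Hypothetical helper: copies the reference, not the elements,
    // so both names refer to the same array afterwards.
    static int[] alias(int[] source) {
        int[] other = source;
        other[0] = 99;
        return other;
    }

    public static void main(String[] args) {
        double[] samples = new double[4];      // fixed length, elements start at 0.0
        int[][] rows = { {1, 2}, {3, 4, 5} };  // each element references another int[]

        int[] original = {1, 2, 3};
        int[] same = alias(original);

        System.out.println(samples.length);    // 4
        System.out.println(rows[1].length);    // 3
        System.out.println(original[0]);       // 99, the write went through the alias
        System.out.println(same == original);  // true, one array with two references
    }
}
```

Java has no unsigned short type, so there is no Java counterpart to ushort[] in this sketch.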

Array elements

As arrays are primitives, the runtime needs to provide built-in operations to create arrays of a fixed size, read and write individual array elements, and get the length of an array. The Java and .NET runtimes provide specific instructions to read and write elements of each primitive type:

Operation | Java | .NET
Create a new array | newarray | newarr
Read integer value from int[] | iaload | ldelem.i4
Read integer value from ushort[] | N/A | ldelem.u2
Store double value in double[] | dastore | stelem.r8
Read reference type value from an array of some reference type | aaload | ldelem.ref
Get the length of an array | arraylength | ldlen
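On the Java side, the mapping from source to these instructions can be sketched as follows. The class and method names are invented for this example, and the comments name the bytecode instruction javac emits for each array operation; running javap -c on the compiled class is one way to check this for yourself.

```java
class ArrayBytecodeDemo {
    static int roundTrip(int value) {
        int[] ints = new int[3];          // newarray int
        ints[0] = value;                  // iastore
        return ints[0];                   // iaload
    }

    static double storeDouble(double value) {
        double[] doubles = new double[2]; // newarray double
        doubles[1] = value;               // dastore
        return doubles[1];                // daload
    }

    static String firstName(String[] names) {
        return names[0];                  // aaload
    }

    static int lengthOf(int[] ints) {
        return ints.length;               // arraylength
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(7));
        System.out.println(storeDouble(2.5));
        System.out.println(firstName(new String[] {"first", "second"}));
        System.out.println(lengthOf(new int[5]));
    }
}
```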

What does this mean? Well, every array of a primitive value type is a completely separate type, with its own type information created at runtime. Just as an int cannot be used in place of a ushort, an int[] cannot be used in place of a ushort[] – as different instructions are needed to access arrays of different element types. You can’t use a ldelem.u1 to read an element from an Object[].
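The separate type information is visible from Java code through reflection. The following is a small sketch with a made-up class name and helper; it uses short[] because Java has no ushort.

```java
class ArrayTypeIdentityDemo {
    // Hypothetical helper: true when both objects have exactly the same runtime class.
    static boolean sameRuntimeType(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        int[] ints = new int[1];
        short[] shorts = new short[1];

        System.out.println(sameRuntimeType(ints, shorts));       // false
        System.out.println(sameRuntimeType(ints, new int[10]));  // true, length is not part of the type
        System.out.println(int[].class.getName());               // [I
        System.out.println(short[].class.getName());             // [S
        System.out.println(int[].class.getComponentType());      // int

        // short[] bad = ints;  // would not compile: incompatible types
    }
}
```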

However, the same instructions are used to read and write elements of arrays containing reference types – aaload (Java) or ldelem.ref (.NET) is used to read an element from a String[], Object[], int[][] (loading an element of type int[]), or any other array with a reference element type. As we’re dealing just with primitive types in this post, we’ll set the differences between such arrays aside for now, and consider all arrays of a reference element type to be equivalent.
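A rough Java illustration of this grouping: any array whose elements are references can be handled through Object[], while a primitive array cannot. The class and helper names below are invented, and the sketch leans on Java’s array covariance, which this post doesn’t otherwise go into.

```java
class ReferenceElementArrayDemo {
    // Hypothetical helper: true for any array whose elements are references.
    static boolean holdsReferences(Object array) {
        return array instanceof Object[];
    }

    // Reads through Object[], whatever the runtime element type is
    // (the element load here compiles to aaload).
    static Object firstElement(Object[] array) {
        return array[0];
    }

    public static void main(String[] args) {
        String[] names = {"first"};
        int[][] grid = { {1, 2}, {3} };
        int[] flat = {1, 2, 3};

        System.out.println(holdsReferences(names));               // true
        System.out.println(holdsReferences(grid));                // true, its elements are int[] references
        System.out.println(holdsReferences(flat));                // false, its elements are ints
        System.out.println(firstElement(grid) instanceof int[]);  // true
    }
}
```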

So, a byte[] is a separate type to a short[], which is a separate type to a <ref type>[]. And this is indeed the case in Java. The type descriptors for each type of array are created when the runtime is loaded and inherit directly from Object, and arrays have special built-in operations and syntax in the Java language that map directly onto bytecode instructions. Apart from the marker interfaces Cloneable and Serializable they implement no interfaces, apart from what they inherit from Object (with clone made public) they have no methods, and they are not directly convertible to other types of arrays.
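Both points can be checked with a few lines of Java. This is only a sketch with made-up names: it asks an array class for its superclass, and tests whether an array object is any kind of Iterable or Collection.

```java
import java.util.Collection;

class ArrayHierarchyDemo {
    // Hypothetical helper: true when the given class is an array type
    // whose direct superclass is Object.
    static boolean extendsObjectDirectly(Class<?> arrayType) {
        return arrayType.isArray() && arrayType.getSuperclass() == Object.class;
    }

    // Hypothetical helper: true when the value can be used as a Java collection.
    static boolean isCollectionLike(Object value) {
        return value instanceof Iterable || value instanceof Collection;
    }

    public static void main(String[] args) {
        System.out.println(extendsObjectDirectly(byte[].class));    // true
        System.out.println(extendsObjectDirectly(String[].class));  // true
        System.out.println(isCollectionLike(new short[4]));         // false
        System.out.println(isCollectionLike(new String[4]));        // false
    }
}
```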

In particular, arrays are completely separate to the collection and list interfaces in Java’s class libraries – explicit wrapper methods are needed to convert between the two.
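As an illustration of those wrapper methods, here is a short sketch using the standard library (it assumes Java 8 or later for the stream call). The class and method names are invented; Arrays.asList, List.toArray and Arrays.stream are the real library calls doing the work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ArrayCollectionBridgeDemo {
    // Wraps a reference-type array in a fixed-size List view backed by the array.
    static List<String> asListView(String[] names) {
        return Arrays.asList(names);
    }

    // Copies a List back out into a new array.
    static String[] backToArray(List<String> names) {
        return names.toArray(new String[0]);
    }

    // A primitive array has to be boxed element by element into a new List.
    static List<Integer> boxedCopy(int[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String[] names = {"a", "b"};
        List<String> view = asListView(names);
        view.set(0, "z");                      // writes through to the backing array
        System.out.println(names[0]);          // z

        System.out.println(backToArray(view).length);          // 2
        System.out.println(boxedCopy(new int[] {1, 2, 3}));    // [1, 2, 3]
    }
}
```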

But this is not the case in .NET. In the CLR, all arrays inherit from System.Array (which is declared as abstract). This type defines several methods, and implements IList. So all the array types created by the CLR inherit these methods and interface implementations. This means that an instance of any array can be treated as a collection type, and can be used wherever an IEnumerable, ICollection, or IList is required. If the array element type is a value type, the array performs the necessary operations to box and unbox the elements in the array when the array is accessed using those interfaces.
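Java has nothing built in that does this for primitive arrays, but the idea of boxing on the way out and unboxing on the way in can be sketched by hand. The class below, BoxedIntArrayView, is a hypothetical Java analogue written for this post; it is not how the CLR implements IList on arrays.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical list view over an int[]: elements are boxed when read
// through the List interface and unboxed when written.
class BoxedIntArrayView extends AbstractList<Integer> {
    private final int[] backing;

    BoxedIntArrayView(int[] backing) {
        this.backing = backing;
    }

    @Override
    public Integer get(int index) {
        return backing[index];          // int is autoboxed to Integer
    }

    @Override
    public Integer set(int index, Integer value) {
        int old = backing[index];
        backing[index] = value;         // Integer is unboxed to int
        return old;
    }

    @Override
    public int size() {
        return backing.length;          // fixed size, like the array itself
    }

    public static void main(String[] args) {
        int[] raw = {10, 20, 30};
        List<Integer> view = new BoxedIntArrayView(raw);

        view.set(1, 25);
        System.out.println(raw[1]);             // 25
        System.out.println(view.contains(30));  // true
    }
}
```

The difference is that in .NET no such wrapper object is needed: the array instance itself can be passed wherever an IList is expected.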

SZArrayHelper

But it does not stop there. Starting in .NET 2, array types are also provided with their own implementations of the generic IEnumerable<T>, ICollection<T>, and IList<T> types. And you can find this implementation in mscorlib.dll. Open it up in a decompiler, and navigate to System.SZArrayHelper.

If you have a look around this class, you’ll notice some strange calls of the form UnsafeCast<T[]>(this).

Each such call to UnsafeCast casts the ‘this’ reference to T[], where T is the method’s generic type parameter. But these methods are declared on the type SZArrayHelper, which is not an array.

What’s actually going on is that, at runtime, the IL of the methods in SZArrayHelper is grafted onto each array type as it is created by the CLR. These methods are used to implement the generic IList<T> interface, with T instantiated to the element type of the array.

These interface implementations are applied to each array type, allowing the primitive arrays to be used where a generic collection type defined in the class libraries is expected. As with the primitive types in the previous post, the CLR provides extra functionality to arrays on top of the built-in operations provided by the runtime. This functionality is partly inherited from the System.Array type, and partly patched into array types at runtime. In Java, arrays only have the operations provided by built-in instructions.

In the next post, we’ll continue looking at primitives provided by the runtime, and the differences in the language-level operators provided by C# and Java.