A Visual Lexicon of LINQ

LINQ is best learned from examples, but few LINQ resources supply, along with the code, pictures that illustrate what each LINQ operator in the code is doing. This article is a visual index of all LINQ operators that explains clearly, with code and illustrations, what even the most arcane LINQ operators actually do. To increase your enlightenment, it is accompanied by a reference chart that provides even more detail. Michael Sorens is, with these two articles, determined to persuade you of the power of LINQ.

 

Microsoft’s Language-Integrated Query (LINQ) “introduces standard, easily-learned patterns for querying and updating data” to quote MSDN. Or does it? Something purportedly so easy should not need quite so many reference pages in MSDN. Yes, I’m playing devil’s advocate a bit here, as there is more to LINQ than just writing the queries, depending on what type of LINQ provider you need. When it comes to learning the LINQ patterns themselves, though, learning-by-example is likely the quickest route to mastery. Microsoft’s venerable 101 LINQ Samples page is a good start, though the examples there favor the use of query syntax. If you prefer lambda syntax, LINQ 101 Samples – Lambda Style provides a “translation”. Both of those resources, though helpful, feel somewhat dated. The simply named LINQ Samples, on the other hand, provides a much better user experience for navigating around the set of LINQ operators. Those are all good resources as far as they go but they are all text! When it comes to understanding a LINQ operator quickly, I submit that you can grasp it much faster with a picture.

As an example, consider the following chunk of code. It computes student grade averages for each department then restricts the list to just those students whose grade is higher than their department’s average. Immediately below the code is a visualization of the restriction step—the LINQ where operator in the code. On the left you see the list of all students in a department, each marked with either a green checkmark (indicating a student is included) or a red X (indicating a student is excluded). On the right, that list is pruned down to students with grades exceeding 44.8 in this case. Between them you see light grey connecting lines showing how the students flow from the left to the right.

This visualization is a screen shot of OzCode, a Visual Studio extension that provides some impressive debugging aids. This image, as mentioned, illustrates the where operator. As you might suspect, you can click on the different operators at the top (from, let, from, where, or select) to see the input-to-output data flow for each operator.

This article is a visual index of all LINQ operators based on such OzCode renderings, which give you an immediate, intuitive grasp of a given operator. For most entries, OzCode renderings are used unaltered, but for a small number I have enhanced the illustrations to make it even clearer what a given operator is doing. (Many of the code fragments come from Microsoft’s 101 LINQ Samples page or from the MSDN reference page Enumerable Methods.)

Most examples are shown using lambda syntax because all LINQ operators are available with lambda syntax while only a small number of them exist for query syntax. In fact, take a look at the accompanying wallchart to see at a glance which ones are available in query syntax, along with other key properties of all the LINQ operators, including which operators use deferred execution, how much of a sequence a given operator actually consumes, and more. Click to download the wallchart here:


If you want to leverage all the power of LINQ from your PowerShell code as well, see the third part in this series, High Performance PowerShell with LINQ.

Aggregate

Count

Returns the number of elements in a sequence. When the result is expected to be greater than Int32.MaxValue, use LongCount.

If you specify an optional condition, Count returns the number of elements in a sequence that satisfy that condition.

LongCount

Returns the number of elements in a sequence as an Int64. Use LongCount rather than Count when the result is expected to be greater than Int32.MaxValue. LongCount, like Count, allows an optional condition.

Visualizes identically to Count.
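As a minimal sketch of Count and LongCount, with and without the optional condition (the sample array is made up for illustration, and the snippet assumes a .NET version that supports top-level statements):

```csharp
using System;
using System.Linq;

int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

int total = numbers.Count();                 // every element
int odd = numbers.Count(n => n % 2 == 1);    // only elements satisfying the condition
long totalAsLong = numbers.LongCount();      // same count, returned as an Int64

Console.WriteLine($"{total} {odd} {totalAsLong}"); // prints "10 5 10"
```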

Sum

Computes the sum of a sequence of values.

If you specify an optional transformation function, Sum computes the sum of a sequence of values after applying that transformation on each element. Note the additional column of values as compared to the straight Sum method above. Those values, adjacent to the input sequence values, reveal the transformed values.

Average

Computes the average of a sequence of values. If you specify an optional transformation function, Average computes the average of a sequence of values after applying that transformation on each element.

Visualizes identically to Sum.

Max

Returns the maximum value in a sequence. If you specify an optional transformation function, Max returns the maximum value in a sequence after applying that transformation on each element. (See the Min method for an example with the transformation function.)

Min

Returns the minimum value in a sequence. If you specify an optional transformation function, Min returns the minimum value in a sequence after applying that transformation on each element, as shown here. Note the additional column of values as compared to the Max method. Those values, adjacent to the input sequence values, reveal the transformed values.
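The four numeric aggregators share the same shape, so one short sketch can cover them. The data here is hypothetical; the transformation function in each case is the lambda w => w.Length.

```csharp
using System;
using System.Linq;

int[] numbers = { 4, 8, 15 };
string[] words = { "cherry", "apple", "blueberry" };

int sum = numbers.Sum();                    // 27
int totalChars = words.Sum(w => w.Length);  // 6 + 5 + 9 = 20
double average = numbers.Average();         // 9
int longest = words.Max(w => w.Length);     // 9
int shortest = words.Min(w => w.Length);    // 5

Console.WriteLine($"{sum} {totalChars} {average} {longest} {shortest}");
```

Note that Max and Min with a transformation return the transformed value (a length here), not the element that produced it.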

Aggregate

Applies an accumulator function over a sequence. Note the additional column of values adjacent to the input sequence values, revealing the accumulated value with every element. You specify a two-argument function to perform an arbitrary aggregation of your choice. The first parameter is the accumulated result so far, which starts out as the first element of the sequence (this overload has no separate seed, and it throws an exception if the sequence is empty), and the second parameter is the next sequence element.

If you specify an initial seed, Aggregate applies an accumulator function over a sequence with that initial seed value. While the seed could just be (depending on your needs) some constant integer or constant string, it could also create an object that your accumulator function will call methods against, as shown next. Here a StringBuilder is created that is used in each step of the aggregation. Note the additional column of values adjacent to the input sequence values, revealing the accumulated value with every element.
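A short sketch of both forms of Aggregate may help. The values are invented for illustration; the seeded call uses the overload that also takes a result selector, here to turn the StringBuilder back into a string.

```csharp
using System;
using System.Linq;
using System.Text;

int[] numbers = { 1, 2, 3, 4 };

// No seed: the accumulator starts as the first element (1),
// then the function runs for 2, 3 and 4.
int product = numbers.Aggregate((acc, n) => acc * n);   // 24

string[] words = { "the", "quick", "fox" };

// Seeded: a StringBuilder is created once and used at every step.
string sentence = words.Aggregate(
    new StringBuilder(),
    (sb, w) => sb.Append(w).Append(' '),
    sb => sb.ToString().TrimEnd());                      // "the quick fox"

Console.WriteLine($"{product} / {sentence}");
```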

Conversion

Cast

Casts the elements of an IEnumerable to the specified type, effectively converting IEnumerable to IEnumerable<T>, which then makes the sequence amenable to further LINQ operations. The example shows an ArrayList, which is not a generic type, converted to generic with Cast.

OfType

Filters the elements of an IEnumerable based on a specified type (similar to Where). Note that, despite what the OzCode visualization here leads you to believe by rendering “4.0” as just “4”, OfType does not convert values that are merely implicitly convertible to the target type; an element passes only if it actually is of that type (or of a type derived from it). That’s why 4.0 passes the filter but 3 does not.
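The difference between Cast and OfType can be sketched as follows, using made-up ArrayList contents: Cast expects every element to be of the target type, while OfType silently skips those that are not.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// A non-generic collection: Cast makes it usable with the rest of LINQ.
var fruits = new ArrayList { "apple", "mango", "kiwi" };
IEnumerable<string> typed = fruits.Cast<string>();
List<int> lengths = typed.Select(f => f.Length).ToList();   // 5, 5, 4

// A mixed collection: OfType keeps only elements of the requested type.
var mixed = new ArrayList { 1, "two", 3, 4.0, "five" };
List<double> doubles = mixed.OfType<double>().ToList();     // only 4.0; the ints are not converted
List<int> ints = mixed.OfType<int>().ToList();              // 1, 3

Console.WriteLine($"{doubles.Count} double(s), {ints.Count} int(s)"); // prints "1 double(s), 2 int(s)"
```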

ToArray

Creates an array from an IEnumerable<T>. Note that execution of this LINQ query is deferred through the Select step, but the ToArray method is immediately executed, causing the entire query to execute.

ToList

Creates a List<T> from an IEnumerable<T>. Note that execution of this LINQ query is deferred through the Select step, but the ToList method is immediately executed, causing the entire query to execute.
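One way to observe the deferred versus immediate execution described for ToArray and ToList is to count how often the Select lambda runs. This is only an illustrative sketch with invented data; the counter is a device for the demonstration, not something you would normally put in a query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int calls = 0;
int[] numbers = { 1, 2, 3 };

// Defining the query runs nothing yet.
IEnumerable<int> query = numbers.Select(n => { calls++; return n * 10; });
int callsAfterDefinition = calls;      // 0

List<int> list = query.ToList();       // executes the whole query now
int callsAfterToList = calls;          // 3

int[] array = query.ToArray();         // executes the whole query a second time
int callsAfterToArray = calls;         // 6

Console.WriteLine($"{callsAfterDefinition} {callsAfterToList} {callsAfterToArray}"); // prints "0 3 6"
```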

ToDictionary

Creates a Dictionary<TKey, TValue> from an IEnumerable<T> according to a specified key selector function (g => g.Key in this example). The value of the dictionary entry (TValue) is just the current input element from the sequence unless you specify the optional element selector function, in which case the value is computed with that function. The example here uses g => g.ToList() for the element selector function, generating a List<string> for each dictionary entry.

A Dictionary is a one-to-one map, and is editable after creation. Querying on a non-existent key throws an exception. Contrast this with ToLookup.

ToLookup

Creates a Lookup<TKey, TElement> from an IEnumerable<T> according to a specified key selector function (c => c.Length in this example—its values appear in the second column). If the optional element selector function is also provided, the value of the lookup element (TElement) is computed with that function (not used in this example; see ToDictionary for a sample usage).

A Lookup is a one-to-many map that is not mutable after creation. Querying on a non-existent key returns an empty sequence. Contrast this with ToDictionary.

Note that Lookup<TKey,TElement> is roughly comparable to a Dictionary<TKey,IEnumerable<TElement>>. (Thanks to Marc Gravell for this tip on Stack Overflow.) Compare the results view here with that for ToDictionary to see the differences.
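The contrast between the two maps can be sketched with a hypothetical word list keyed by length. Because dictionary keys must be unique, the words are grouped first, mirroring the g => g.Key and g => g.ToList() selectors mentioned under ToDictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] words = { "apple", "kiwi", "mango", "pear", "fig" };

Dictionary<int, List<string>> dict = words
    .GroupBy(w => w.Length)
    .ToDictionary(g => g.Key, g => g.ToList());

ILookup<int, string> lookup = words.ToLookup(w => w.Length);

List<string> fiveLetters = dict[5];                  // apple, mango
List<string> fourLetters = lookup[4].ToList();       // kiwi, pear

// A missing key: the lookup yields an empty sequence...
bool lookupMissingIsEmpty = !lookup[99].Any();       // true

// ...while the dictionary indexer throws.
bool dictionaryThrew = false;
try { _ = dict[99]; }
catch (KeyNotFoundException) { dictionaryThrew = true; }

Console.WriteLine($"{lookupMissingIsEmpty} {dictionaryThrew}"); // prints "True True"
```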

Elements

First

Returns the first element of a sequence. Throws an exception if the sequence contains no elements. Note that evaluation stops at the first element in the sequence; the remainder of the sequence is not evaluated.

If you specify an optional condition, First returns the first element in a sequence that satisfies that condition. Throws an exception if no elements satisfy the condition. Note that evaluation stops at the first element satisfying the condition in the sequence; the remainder of the sequence is not evaluated.

FirstOrDefault

Returns the first element of a sequence, or a default value if the sequence contains no elements. Note that evaluation stops at the first element in the sequence; the remainder of the sequence is not evaluated.

If you specify an optional condition, FirstOrDefault returns the first element in a sequence that satisfies that condition, or a default value if no element satisfies it. Note that evaluation stops at the first element satisfying the condition in the sequence; the remainder of the sequence is not evaluated.

Last

Returns the last element of a sequence. Throws an exception if the sequence contains no elements. The entire sequence must be evaluated to get to the last element.

If you specify an optional condition, Last returns the last element of a sequence that satisfies that condition. Throws an exception if no element satisfies the condition. The entire sequence must be evaluated to identify the target element, even if it ends up not being the actual last one in the sequence.

LastOrDefault

Returns the last element of a sequence, or a default value if the sequence contains no elements. The entire sequence must be evaluated to get to the last element.

If you specify an optional condition, LastOrDefault returns the last element of a sequence that satisfies that condition, or a default value if no element satisfies it. The entire sequence must be evaluated to identify the target element, even if it ends up not being the actual last one in the sequence.

Visualizes identically to FirstOrDefault.
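The First and Last families behave symmetrically, which a small sketch with invented numbers can show. For an int sequence the "default value" is 0.

```csharp
using System;
using System.Linq;

int[] numbers = { 3, 8, 12, 5, 20 };

int first = numbers.First();                          // 3
int firstBig = numbers.First(n => n > 10);            // 12
int lastBig = numbers.Last(n => n > 10);              // 20
int lastSmall = numbers.Last(n => n < 10);            // 5, not the final element
int noMatch = numbers.FirstOrDefault(n => n > 100);   // 0, i.e. default(int)

// Without "OrDefault", a failed match throws instead.
bool threw = false;
try { numbers.First(n => n > 100); }
catch (InvalidOperationException) { threw = true; }

Console.WriteLine($"{first} {firstBig} {lastBig} {lastSmall} {noMatch} {threw}");
```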

ElementAt

Returns the element at a specified index in a sequence. Throws an exception if the index is out of range.

ElementAtOrDefault

Returns the element at a specified index in a sequence, or a default value if the index is out of range.
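A brief sketch of the two index-based operators, using a made-up array of names. For a reference type such as string, the default value is null.

```csharp
using System;
using System.Linq;

string[] names = { "Ann", "Bob", "Cy" };

string second = names.ElementAt(1);            // "Bob" (indexes are zero-based)
var missing = names.ElementAtOrDefault(10);    // null: index is out of range

bool threw = false;
try { names.ElementAt(10); }
catch (ArgumentOutOfRangeException) { threw = true; }

Console.WriteLine($"{second} {missing == null} {threw}"); // prints "Bob True True"
```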

Single

Returns the only element of a sequence. Throws an exception if the sequence is empty or contains more than one element.

If you specify an optional condition, Single returns the only element in a sequence that satisfies that condition. Throws an exception if either no elements or more than one element satisfy the condition.

SingleOrDefault

Returns the only element of a sequence or a default value if the sequence is empty. Throws an exception if the sequence contains more than one element.

If you specify an optional condition, SingleOrDefault returns the only element in a sequence that satisfies that condition, or a default value if no element satisfies it. Throws an exception if more than one element satisfies the condition.
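The following sketch, with invented data, shows where Single and SingleOrDefault differ and where they agree: zero matches is an error only for Single, but more than one match is an error for both.

```csharp
using System;
using System.Linq;

int[] numbers = { 3, 8, 12 };

int onlyBig = numbers.Single(n => n > 10);                // 12: exactly one match
int noneFound = numbers.SingleOrDefault(n => n > 100);    // 0: no match, so default(int)

// No match: Single throws.
bool singleThrewOnNone = false;
try { numbers.Single(n => n > 100); }
catch (InvalidOperationException) { singleThrewOnNone = true; }

// Two matches (8 and 12): even SingleOrDefault throws.
bool orDefaultThrewOnMany = false;
try { numbers.SingleOrDefault(n => n > 5); }
catch (InvalidOperationException) { orDefaultThrewOnMany = true; }

Console.WriteLine($"{onlyBig} {noneFound} {singleThrewOnNone} {orDefaultThrewOnMany}");
```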

Generation

Range

Generates a sequence of integral numbers within a specified range.

Repeat

Generates a sequence that contains a repeated value a specified number of times.

Empty

Returns an empty IEnumerable<T> that has the specified type argument.

DefaultIfEmpty

Returns a singleton collection containing the default value of the sequence’s element type (or, if you pass a default value as an argument, that value) if the sequence is empty.

If the sequence is non-empty, simply returns the original sequence.
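The generation operators are static methods on Enumerable rather than extension methods, as this sketch with arbitrary values shows. Note that the second argument to Range is a count, not an end value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<int> range = Enumerable.Range(5, 3).ToList();           // 5, 6, 7 (start 5, count 3)
List<string> repeated = Enumerable.Repeat("ab", 3).ToList(); // ab, ab, ab
IEnumerable<int> empty = Enumerable.Empty<int>();

List<int> typeDefault = empty.DefaultIfEmpty().ToList();     // a single 0
List<int> customDefault = empty.DefaultIfEmpty(42).ToList(); // a single 42
List<int> unchanged = range.DefaultIfEmpty(42).ToList();     // 5, 6, 7: not empty, so untouched

Console.WriteLine(string.Join(", ", range)); // prints "5, 6, 7"
```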

Grouping

GroupBy

Groups the elements of a sequence according to a specified key selector function (pet => pet.Age in this example—its values appear in the second column). In the result, the second group is expanded to show its contents, containing 2 members of age 4. Notice that the elements of the group are objects of the original type, Pet.

If you specify an optional projection function, GroupBy further projects the elements for each group with that function (pet => pet.Name in this next example). In the result, the second group is expanded to show its contents, containing 2 members of age 4. Notice that the elements of the group now consist of just the projected property, the pet’s name.
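Both forms of GroupBy can be sketched as follows. The pets are represented here as named tuples with invented names and ages, an assumption made to keep the snippet self-contained rather than the Pet class of the illustration.

```csharp
using System;
using System.Linq;

var pets = new[]
{
    (Name: "Barley", Age: 8),
    (Name: "Boots", Age: 4),
    (Name: "Whiskers", Age: 1),
    (Name: "Daisy", Age: 4),
};

// Key selector only: each group holds the original elements.
var byAge = pets.GroupBy(pet => pet.Age).ToList();

// Key selector plus projection: each group holds just the names.
var namesByAge = pets.GroupBy(pet => pet.Age, pet => pet.Name).ToList();

foreach (var group in namesByAge)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group)}");
}
// Groups appear in order of each key's first occurrence:
// 8: Barley
// 4: Boots, Daisy
// 1: Whiskers
```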

Join

Cross Join

Correlates the elements of two sequences based on matching keys. If an element of the first sequence has no corresponding element in the second sequence, it is not represented in the result. Join is equivalent to an inner join in SQL.

Group Join

Correlates the elements of two sequences based on equality of keys and groups the results. If an element of the first sequence has no corresponding elements in the second sequence, it is still represented in the result but its group contains no members. In the example, notice that user Chuck (id=4) has no books associated with him. Group Join is comparable to a left outer join in SQL.
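To compare the two join operators side by side, here is a sketch over hypothetical users and books (named tuples with invented values; only Chuck with id 4 echoes the text above).

```csharp
using System;
using System.Linq;

var users = new[] { (Id: 1, Name: "Ann"), (Id: 2, Name: "Bob"), (Id: 4, Name: "Chuck") };
var books = new[]
{
    (UserId: 1, Title: "Dune"),
    (UserId: 1, Title: "Emma"),
    (UserId: 2, Title: "Ulysses"),
};

// Join: one result per matching pair; Chuck has no books, so he disappears.
var pairs = users
    .Join(books, u => u.Id, b => b.UserId, (u, b) => $"{u.Name}: {b.Title}")
    .ToList();

// GroupJoin: one result per user, each with a (possibly empty) group of books.
var perUser = users
    .GroupJoin(books, u => u.Id, b => b.UserId, (u, bs) => (u.Name, Count: bs.Count()))
    .ToList();

Console.WriteLine(string.Join("; ", pairs));
Console.WriteLine(string.Join("; ", perUser.Select(p => $"{p.Name}={p.Count}")));
// prints "Ann=2; Bob=1; Chuck=0"
```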

Concat

Concatenates two sequences into a single sequence; further LINQ operations would then operate on the new, combined sequence.

Zip

Applies a specified function to the corresponding elements of two sequences, producing a new sequence of the results. If the first sequence is longer than the second, one element past the common length will be evaluated (“d” in the illustration) at which point a determination is made that the second sequence has been consumed, and further evaluation stops (so “e” is not evaluated). If the second sequence is longer than the first, its extra values will not be evaluated at all. Note that “zip” in this context has nothing to do with zip archives!
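The evaluation behaviour described for Zip can be observed by recording each letter as it is pulled from the first sequence. This is a sketch with a side-effecting lambda used purely for demonstration; which elements get pulled reflects how the current .NET implementation advances the two sequences.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var evaluated = new List<string>();

// Record each letter at the moment Zip requests it.
IEnumerable<string> letters = new[] { "a", "b", "c", "d", "e" }
    .Select(s => { evaluated.Add(s); return s; });
int[] numbers = { 1, 2, 3 };

List<string> zipped = letters.Zip(numbers, (letter, n) => letter + n).ToList();

Console.WriteLine(string.Join(", ", zipped));     // prints "a1, b2, c3"
Console.WriteLine(string.Join(", ", evaluated));  // the letters actually evaluated; "e" is not among them
```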

Ordering

OrderBy

Sorts the elements of a sequence in ascending order according to a key selector function (i => i.Count in this example; its values appear in the second column).

OrderByDescending

Sorts the elements of a sequence in descending order according to a key selector function.

Visualizes identically to OrderBy.

ThenBy

Performs a subsequent ordering of the elements in a sequence in ascending order according to a key selector function (d => d.Month in this example—its values appear in the second column). Note that unlike most other LINQ operators, which accept an IEnumerable<T> input, ThenBy accepts an IOrderedEnumerable<T> input—which happens to be the output of OrderBy.

ThenByDescending

Performs a subsequent ordering of the elements in a sequence in descending order according to a key selector function. Note that unlike most other LINQ operators, which accept an IEnumerable<T> input, ThenByDescending accepts an IOrderedEnumerable<T> input, which happens to be the output of OrderBy (and of OrderByDescending).

Visualizes identically to ThenBy.
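Primary and secondary ordering chain together as in this sketch, which sorts some arbitrary dates by year ascending and then by month descending. The intermediate variable is there only to make the IOrderedEnumerable<T> type visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var dates = new[]
{
    new DateTime(2015, 3, 1),
    new DateTime(2014, 8, 15),
    new DateTime(2015, 1, 20),
    new DateTime(2014, 2, 9),
};

// OrderBy returns IOrderedEnumerable<T>, which is what ThenBy/ThenByDescending require.
IOrderedEnumerable<DateTime> byYear = dates.OrderBy(d => d.Year);
List<DateTime> sorted = byYear.ThenByDescending(d => d.Month).ToList();

foreach (var d in sorted)
{
    Console.WriteLine(d.ToString("yyyy-MM-dd"));
}
// 2014-08-15
// 2014-02-09
// 2015-03-01
// 2015-01-20
```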

Reverse

Inverts the order of the elements in a sequence.

Partitioning

Take

Returns a specified number of elements from the start of a sequence. Evaluation of the sequence stops after that as no further elements are needed.

Skip

Bypasses a specified number of elements in a sequence and then returns the remaining elements.

TakeWhile

Returns elements from the start of a sequence as long as a specified condition is true. Evaluation of the sequence stops after that as no further elements are needed.

SkipWhile

Bypasses elements in a sequence as long as a specified condition is true and then returns the remaining elements.
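The four partitioning operators are easiest to compare on one input. In this sketch the (invented) sequence deliberately has small values after the first large one, to show that TakeWhile and SkipWhile stop testing the condition once it first fails.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] numbers = { 1, 2, 10, 3, 4 };

List<int> firstThree = numbers.Take(3).ToList();                  // 1, 2, 10
List<int> afterThree = numbers.Skip(3).ToList();                  // 3, 4
List<int> leadingSmall = numbers.TakeWhile(n => n < 10).ToList(); // 1, 2
List<int> fromFirstBig = numbers.SkipWhile(n => n < 10).ToList(); // 10, 3, 4

Console.WriteLine(string.Join(", ", fromFirstBig)); // prints "10, 3, 4"
```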

Projection

Select

Applies a specified transformation to each element of a sequence; this transformation is generally referred to as “projection”. Often you might project into a new object that is a subset of the original object, essentially discarding unneeded properties. In the illustration, just one property of the DateTime object is needed for further processing so the sequence is transformed to a new sequence with just the DayOfYear property. But you are not limited to just a subset; you can combine values from different sequences (see the Cross Join example) or you can even transform elements (see the Repeat example).

SelectMany

Projects each element of a sequence to an IEnumerable<T> and flattens the resulting sequences into a single sequence. If, in the illustration, Select had been used instead of SelectMany, each element of the result would be a list of User objects (i.e. a list of string arrays) rather than a list of strings, as shown, and the result would be just a 2-element list rather than a 6-element list.
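The nesting difference between Select and SelectMany can be sketched like this. The users and their book lists are hypothetical tuples, not the User type from the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var users = new[]
{
    (Name: "Ann", Books: new[] { "Dune", "Emma" }),
    (Name: "Bob", Books: new[] { "Ulysses" }),
};

// Select keeps the nesting: one string array per user.
List<string[]> nested = users.Select(u => u.Books).ToList();   // 2 elements

// SelectMany flattens: one string per book.
List<string> flat = users.SelectMany(u => u.Books).ToList();   // 3 elements

Console.WriteLine($"{nested.Count} arrays vs {flat.Count} strings"); // prints "2 arrays vs 3 strings"
```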

Quantifiers

Any

Determines whether any element of a sequence (i.e. at least one element) satisfies a condition. All elements of the sequence need to be evaluated to provide a false result (first figure). However, if at any time during evaluating the sequence an element evaluates to true, the sequence evaluation stops at that element (second figure). Of course, if only the last element satisfies the condition, all elements will need to be evaluated and true will be returned.

All

Determines whether all elements of a sequence satisfy a condition. All elements of the sequence need to be evaluated to provide a true result (first figure). However, if at any time during evaluating the sequence an element evaluates to false, the sequence evaluation stops at that element (second figure). Of course, if only the last element fails to satisfy the condition, all elements will need to be evaluated and false will be returned.

Contains

Determines whether a sequence contains a specified element. The sequence may, of course, contain objects of an arbitrary type. In the case of strings, however, note that this method matches against each element in its entirety. Contrast this to the string method Contains that determines whether a string matches against a substring. (See the example for Any.)
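The three quantifiers, and the two different Contains methods, can be sketched on a made-up word list. Inside the Any lambda, Contains is the string method (substring match); called on the array, it is the LINQ operator (whole-element match).

```csharp
using System;
using System.Linq;

string[] words = { "believe", "relief", "receipt", "field" };

bool anyHasEi = words.Any(w => w.Contains("ei"));   // true: stops at "receipt"
bool allLong = words.All(w => w.Length > 4);        // true: every word was checked
bool hasField = words.Contains("field");            // true: a whole element matches
bool hasFiel = words.Contains("fiel");              // false: "fiel" is only a substring of an element

Console.WriteLine($"{anyHasEi} {allLong} {hasField} {hasFiel}"); // prints "True True True False"
```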

SequenceEqual

Determines whether two sequences are equal; specifically, if the two sequences contain the same elements in the same order. When dealing with value types, as in the illustration, the use is intuitive: the lists differ at the third position so a determination has been made that they are different, and no further elements of the sequence need to be evaluated.

But if you use reference types, the elements are matched with reference equality; they need to be the actual, same object, not just objects with all the same property values. If you use different reference objects then the sequences are not equal:

If you use the same actual objects, the sequences are considered equal:

Note that you can replace this reference-equality behavior either by having the type implement IEquatable<T> or by using the overload that accepts an IEqualityComparer<T>. Then you can tailor the comparison as you wish.
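A compact sketch of these three situations follows. Arrays stand in for "reference types without value equality", and the built-in StringComparer.OrdinalIgnoreCase stands in for a custom IEqualityComparer<T>; both are choices made only to keep the example short.

```csharp
using System;
using System.Linq;

int[] a = { 1, 2 };
int[] b = { 1, 2 };

// Value-type elements: compared by value.
bool sameValues = a.SequenceEqual(b);                     // true

// Reference-type elements (here, arrays): compared by reference.
var first = new[] { a };
var second = new[] { b };   // equal contents, but a different array object
var third = new[] { a };    // the very same array object
bool differentObjects = first.SequenceEqual(second);      // false
bool sameObjects = first.SequenceEqual(third);            // true

// Supplying a comparer replaces the default equality test.
bool ignoringCase = new[] { "a", "B" }
    .SequenceEqual(new[] { "A", "b" }, StringComparer.OrdinalIgnoreCase); // true

Console.WriteLine($"{sameValues} {differentObjects} {sameObjects} {ignoringCase}");
// prints "True False True True"
```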

Restriction (Filtering)

Where

Filters a sequence of values based on a predicate.
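As a minimal sketch in the spirit of the student-grades example from the introduction (the grades themselves are invented), Where keeps just the elements for which the predicate returns true:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] grades = { 38, 52, 44, 61, 29 };

double average = grades.Average();                                    // 44.8
List<int> aboveAverage = grades.Where(g => g > average).ToList();     // 52, 61

Console.WriteLine(string.Join(", ", aboveAverage)); // prints "52, 61"
```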

Sets

Distinct

Returns distinct elements from a sequence. Note that the sequence does not need to be sorted.

Union

Produces the set union of two sequences. Includes the elements of both sequences, but without duplication.

Intersect

Produces the set intersection of two sequences. Just those elements that exist in both sequences appear in the result.

Except

Produces the set difference of one sequence with a second sequence. Just those elements that exist in the first sequence and do not exist in the second sequence appear in the result.
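The four set operators can be compared on one pair of inputs, as in this sketch with arbitrary numbers. Note that all of them remove duplicates from their results, including Except.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] a = { 1, 2, 2, 3, 4 };
int[] b = { 3, 4, 4, 5 };

List<int> distinct = a.Distinct().ToList();    // 1, 2, 3, 4
List<int> union = a.Union(b).ToList();         // 1, 2, 3, 4, 5
List<int> common = a.Intersect(b).ToList();    // 3, 4
List<int> onlyInA = a.Except(b).ToList();      // 1, 2

Console.WriteLine(string.Join(", ", union)); // prints "1, 2, 3, 4, 5"
```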

Conclusion

This article serves as a visual dictionary of LINQ operators showing not just code samples but a visualization of what each associated LINQ operator in the code is doing. But there’s more! Accompanying this article is a handy wallchart that condenses the information here even further, plus adds some more technical specs. For example, it shows you at a glance which operators use deferred execution and which use immediate execution. Click here to download the PDF reference chart: