High Performance PowerShell with LINQ

My recent article A Visual Lexicon of LINQ and accompanying wallchart provided a new, easier-to-use reference for LINQ operators when working in C#. But since one of PowerShell’s mantras is “anything you can do in C# can be done in PowerShell, too!” it is only fitting to provide a LINQ reference for PowerShell as well. This reference again itemizes every LINQ operator (and in the same order as the original article) and gives some emphasis to potential performance gains from using LINQ where possible when doing operations on large data sets.

PowerShell is an interpreted scripting language, and so it is slow at running an iterative loop of any great size. By ‘slow’, I mean it can be the difference between ten minutes in PowerShell and six seconds in C#! If a loop iterates more than sixteen times, the code of the loop is compiled dynamically as .NET code and the dynamic method is then invoked inside the loop. Also, because any scripting language presents a potential security risk, .NET must run a security check on the stack, which slows the loop down. LINQ has many aggregate and filtering functions that can be used instead of the PowerShell equivalent, and they are very likely to give you an appreciable performance improvement, but at the cost of a tedious overhead in writing the script, as detailed next.

General Notes

Why do I mention a tedious overhead of use? In PowerShell, it is straightforward to access conventional C# methods; for example, "abc".Replace("a", "A") works just fine in PowerShell. However, most LINQ operators are static extension methods and many of those require delegate arguments, so invoking them from PowerShell is more involved. (You could say that technically those arguments are anonymous functions or even lambda expressions, but I’ve chosen to use the term delegate here as it is shorter and practically equivalent in this context.) For this reason, I’ve given in this article working examples of every LINQ operator, along with the C# equivalent and the conventional PowerShell way of doing the same thing. This should enable you to judge for yourself.

Format of Entries

For each entry you will find:

  1. A useful yet simple (or simple-ish) example of how to use the LINQ operator in C# as a reference point.
  2. A translation of that example into PowerShell, i.e. calling the actual LINQ operator from PowerShell.
  3. An alternate version of code to perform the same operation using native PowerShell (i.e. doing it “the PowerShell way”).

Deferred vs. Immediate Execution

Unlike conventional functions and methods, many LINQ methods use deferred execution instead of immediate execution. That is, an expression with some LINQ operators does not calculate and yield a result until it is actually needed. (The wallchart shows you at a glance which operators this applies to.) Rather, those operators return a query (essentially an IEnumerable<T>). Unless you are doing further LINQ operations on it, you need to convert it with a LINQ operator that materializes the result before you can work with it further in PowerShell; ToArray and ToList are the most common LINQ methods for doing that.

Calling a LINQ Operator

LINQ operators are static extension methods. In PowerShell plain static method calls require this syntax:

Extension methods in C#, as you’ll recall, look like any other method available from an object, e.g.

But in PowerShell, the ObjectInstance moves into the first argument position of the static call:

The only other piece you need to know is the ClassName to use, which will always be Linq.Enumerable. Thus, numbers.Sum() in C# becomes [Linq.Enumerable]::Sum($numbers) in PowerShell. However, that will only work if the $numbers array has the correct type and by default it does not. The next section explains further.
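The call shapes described above can be sketched as follows. Here [Math]::Max merely stands in for any plain static call, and the array is typed up front for reasons the next section covers; the variable names are illustrative.

```powershell
# Plain static method call: [ClassName]::MethodName(arguments)
$larger = [Math]::Max(3, 7)

# C# extension-method form, shown as a comment for comparison:
#   numbers.Sum()
# PowerShell form: the object instance becomes the first argument of the static call.
[int[]] $numbers = 1, 2, 3, 4
$total = [Linq.Enumerable]::Sum($numbers)

"$larger $total"
```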

Explicit Argument Typing

PowerShell is a dynamically typed language rather than a statically typed language. And it is not strongly-typed (because a variable can change its type at runtime). But PowerShell does support explicit typing of variables if you choose to use it—and for LINQ calls you have no choice in the matter! Consider the rather innocuous LINQ call in C#:

This would seem to translate to PowerShell readily as:

Unfortunately, the result of that expression is an error:

Cannot find an overload for "Sum" and the argument count: "1".

That error makes you think the argument count is wrong but in fact it is the type of the argument that is incorrect, as we can demonstrate.

The Sum method does not take an Object array, but it has overloads for a variety of numerical types. Thus you need to explicitly type the array; here is one way to do that:
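A minimal sketch of the difference, assuming a small range of integers (the variable names are illustrative):

```powershell
# Untyped: the range operator produces Object[], which none of the Sum overloads accept.
$untyped = 1..5

# Typed: constraining the variable to int[] lets PowerShell bind Sum(IEnumerable<int>).
[int[]] $numbers = 1..5
$typedSum = [Linq.Enumerable]::Sum($numbers)

# An inline cast at the call site is an alternative to typing the variable itself.
$castSum = [Linq.Enumerable]::Sum([int[]] $untyped)

"$typedSum $castSum"
```

Either form works; typing the variable once is usually tidier when the same array feeds several LINQ calls.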

Creating and Passing Delegates

Passing simple arguments to LINQ, as just shown, requires explicit typing. This is even more important when using a LINQ operator with an argument that is a delegate (or anonymous function). Consider the ubiquitous Where operator. Let’s say you have an array of date objects and you wish to filter those. In C# you might write:

In PowerShell, you write the equivalent delegate like this:

Then you can make the call:

Or if you prefer to write a single expression, you still need explicit typing (I’ve added line breaks just for clarity):
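Both forms can be sketched like this. The sample dates, the cutoff year, and the variable names are assumptions made for illustration; the delegate type spells out the input type (DateTime) and the return type (bool).

```powershell
[DateTime[]] $dates = @(
    [DateTime]::new(2015, 3, 1),
    [DateTime]::new(2017, 6, 15),
    [DateTime]::new(2018, 1, 20)
)

# Two-step form: build the typed delegate first, then pass it to Where.
[Func[DateTime,bool]] $isRecent = { $args[0].Year -gt 2016 }
$recent = [Linq.Enumerable]::ToArray([Linq.Enumerable]::Where($dates, $isRecent))

# Single-expression form: the casts move inline but are still required.
$recentInline = [Linq.Enumerable]::ToArray(
    [Linq.Enumerable]::Where(
        [DateTime[]] $dates,
        [Func[DateTime,bool]] { $args[0].Year -gt 2016 }))

"$($recent.Count) $($recentInline.Count)"
```

Where uses deferred execution, so both calls are wrapped in ToArray to get something that can be counted and indexed.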

Generic LINQ Operators

Just three LINQ operators take an explicit type argument that cannot be inferred from their inputs: Cast, OfType, and Empty. As it turns out, calling such a generic, static, extension LINQ method requires a rather convoluted incantation in PowerShell. Consider this standard LINQ call in C# to filter a list to just strings:

The first step for PowerShell conversion, oddly enough, is to rewrite that, still in C#, so that it can be translated to PowerShell (yes, I know this is ugly!):

But now, you can get there from here (i.e. you can write it in PowerShell):

PS> $stuff = @("12345", 12, "def")
PS> $stringType = "".GetType() # set to your target type
PS> $ofTypeForString =
    [Linq.Enumerable].GetMethod("OfType").MakeGenericMethod($stringType)
# The last comma below wraps the array arg $stuff within another array
PS> $ofTypeForString.Invoke($null, (,$stuff))

LINQ Chaining

One of the strong benefits of LINQ is its chaining capability. Here’s a simple example in C#:

While you can still do this in PowerShell, it unfortunately does not allow that wonderfully smooth fluent syntax because of the way you have to write calls to extension methods explained above. You are limited to only conventional method calls:
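As a rough sketch of what that nesting looks like, here is a filter followed by a projection. The data and delegate names are illustrative, not the original example.

```powershell
[int[]] $numbers = 1..10
$isEven = [Func[int,bool]] { $args[0] % 2 -eq 0 }
$square = [Func[int,int]]  { $args[0] * $args[0] }

# C# fluent form, for comparison:
#   numbers.Where(isEven).Select(square).ToArray()
# In PowerShell the chain nests inside-out instead of reading left to right.
$result = [Linq.Enumerable]::ToArray(
    [Linq.Enumerable]::Select(
        [Linq.Enumerable]::Where($numbers, $isEven),
        $square))

$result -join ','
```

The innermost call runs first, so the expression reads in the reverse order of the C# chain.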

Performance

Why bother with all this tedious overhead mentioned so far? In a word, performance! PowerShell was never designed to compete in terms of speed with the likes of C#. And most of the time that is perfectly fine. With small data structures or simple programs you may never even notice performance that is sub-optimal. But PowerShell is a first-class language, so you could write elaborate code dealing with huge data structures. That is when performance should definitely be kept in mind.

First, here’s a handy little function showing one way to measure performance. Note that if you just want the performance numbers, the built-in Measure-Command cmdlet would work fine, but I wanted to get two outputs: the performance and the actual result of the evaluation (the reason for this is explained just a bit further down).

To use this, simply wrap the expression you wish to evaluate in a string and pass it to the function. This code compares the LINQ Sum method with three other ways to do the same thing in native PowerShell:
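One way such a helper might look is sketched below. The function body, its parameter name, and the output property names are my assumptions for illustration, not the original implementation; it relies on Invoke-Expression and a Stopwatch to return both the elapsed time and the result.

```powershell
function Measure-Expression {
    param([Parameter(Mandatory)] [string] $Expression)

    # Time a single evaluation of the expression string.
    $stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
    $result = Invoke-Expression $Expression
    $stopwatch.Stop()

    # Report the timing together with the value, so comparisons can be sanity-checked.
    [pscustomobject]@{
        Expression   = $Expression
        Milliseconds = $stopwatch.Elapsed.TotalMilliseconds
        Result       = $result
    }
}

[int[]] $numbers = 1..10000
$linq    = Measure-Expression '[Linq.Enumerable]::Sum($numbers)'
$foreach = Measure-Expression '$sum = 0; foreach ($n in $numbers) { $sum += $n }; $sum'
$linq, $foreach | Format-Table Expression, Milliseconds, Result
```

Single quotes keep $numbers from being expanded before the string reaches the function. Timings from a single run are noisy, so repeat each measurement a few times before drawing conclusions.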

It has long been known that the foreach statement is much snappier than piping data to the ForEach-Object cmdlet (see e.g. Thomas Lee’s Performance with PowerShell). Also, PowerShell Engine Improvements reveals that WMF 5.1 (released January 2017) made substantial improvements in the core PowerShell engine; of specific interest is that piping to the ForEach-Object cmdlet is twice as fast as it used to be (but foreach still prevails). Also, be sure to take a look at the short list of PowerShell scripting performance considerations from Microsoft.

Those considerations are important, yes, but take a look at the actual results here: LINQ outperforms even the fastest native PowerShell approach (the foreach loop) by well over an order of magnitude, 0.4 ms versus 29.5 ms!

Basic Command                                                        Time (milliseconds)
-------------                                                        -------------------
[Linq.Enumerable]::Sum($numbers)                                     0.4
($numbers | Measure-Object -sum).Sum                                 79.8
$numbers | ForEach { $sum += $_ } -Begin { $sum = 0 } -End { $sum }  156.0
$sum = 0; foreach ($n in $numbers) { $sum += $n }; $sum              29.5

One minor detail to note on the above figures: The first time you invoke a LINQ expression in a PowerShell session there is some overhead loading the assembly. That slows down the performance by an order of magnitude—so that it is only as fast as the fastest native PowerShell call (the foreach loop). But for every invocation thereafter in the same session, you get the much faster execution times.

Note that I am not claiming that every LINQ operator will show this dramatic performance difference. It seems to hold true for the important aggregate operators like Sum (Count, Average, etc.) but I have not performance-tested the whole gamut of other LINQ operators.

One final consideration when doing performance studies of LINQ operators in PowerShell: you need to take into account whether the operator uses deferred or immediate execution. Sum, used in the above example, uses immediate execution. That is, it produces an output that can be consumed by the rest of your PowerShell code. So if you are comparing a LINQ expression with a non-LINQ one, make sure you’re comparing like expressions. That is, if you use an operator with deferred execution, you need to include something like a ToArray call to realize the results as part of your measurement. That’s why I wrote the Measure-Expression function above to report not just the execution time but also the result of the expression; you will see right away if you’re doing a valid “apples-to-apples” comparison.

LINQ to PowerShell Lexicon

Aggregate

Count

Returns the number of elements in a sequence. When the result is expected to be greater than Int32.MaxValue, use LongCount. If you specify an optional condition, Count returns the number of elements in a sequence that satisfy that condition.

LINQ in C#

LINQ in PowerShell
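A minimal sketch of both forms, assuming an [int[]] array named $numbers and an illustrative condition (greater than 5); the native counterparts are included as a cross-check.

```powershell
[int[]] $numbers = 1..10

# Plain count, then count with a condition supplied as a typed delegate.
$count       = [Linq.Enumerable]::Count($numbers)
$countAbove5 = [Linq.Enumerable]::Count($numbers, [Func[int,bool]] { $args[0] -gt 5 })

# Native counterparts; @() guards against a single-item or empty result.
$nativeCount       = $numbers.Count
$nativeCountAbove5 = @($numbers | Where-Object { $_ -gt 5 }).Count

"$count $countAbove5 $nativeCount $nativeCountAbove5"
```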

Native PowerShell

or

LongCount

Returns as an Int64 the number of elements in a sequence. Use LongCount rather than Count when the result is expected to be greater than Int32.MaxValue. LongCount, like Count, allows an optional condition.

Works identically to Count.

Sum

Computes the sum of a sequence of values. If you specify an optional transformation function, Sum computes the sum of a sequence of values after applying that transformation on each element.

LINQ in C#

LINQ in PowerShell
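A short sketch of the basic call and the call with a transformation. It assumes a deliberately small array so the results are easy to verify by hand; the delegate name is illustrative and mirrors the native func below, negating multiples of 3.

```powershell
[int[]] $numbers = 1..10

# Basic command
$basic = [Linq.Enumerable]::Sum($numbers)

# Command with transformation: the selector is a Func[int,int] delegate.
$negateMultiplesOf3 = [Func[int,int]] { if ($args[0] % 3) { $args[0] } else { -$args[0] } }
$transformed = [Linq.Enumerable]::Sum($numbers, $negateMultiplesOf3)

"$basic $transformed"
```

For 1..10 the plain sum is 55; negating 3, 6 and 9 takes 36 off that, leaving 19.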

Native PowerShell

PS> [int[]] $numbers = 1..10000
PS> function func($n) { if ($n % 3) { $n } else { -$n } }

# basic command
PS> ($numbers | Measure-Object -Sum).Sum
PS> $numbers | ForEach { $sum += $_ } -Begin { $sum = 0 } -End { $sum }
PS> $sum = 0; foreach ($n in $numbers) { $sum += $n }; $sum

# command with transformation
PS> ($numbers | ForEach { func $_  } | Measure-Object -Sum).Sum
PS> $sum = 0; foreach ($n in $numbers) { $sum += func $n }; $sum

Average

Computes the average of a sequence of values. If you specify an optional transformation function, Average computes the average of a sequence of values after applying that transformation on each element.

LINQ in C#

LINQ in PowerShell
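A short sketch of the basic call and the call with a transformation, again assuming a small array so the numbers can be checked by hand. The delegate name is illustrative and mirrors the native func below.

```powershell
[int[]] $numbers = 1..10

# Basic command: Average over int values returns a double.
$basic = [Linq.Enumerable]::Average($numbers)

# Command with transformation: multiply by 100 unless the value is a multiple of 5.
$scale = [Func[int,int]] { if ($args[0] % 5) { 100 * $args[0] } else { $args[0] } }
$transformed = [Linq.Enumerable]::Average($numbers, $scale)

"$basic $transformed"
```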

Native PowerShell

PS> [int[]] $numbers = 1..10000
PS> function func($n) { if ($n % 5) { 100 * $n } else { $n } }

# basic command
PS> ($numbers | Measure-Object -Average).Average
PS> $numbers | ForEach { $sum += $_ } -Begin { $sum = 0 } -End { $sum / $numbers.Length }
PS> $sum = 0; foreach ($n in $numbers) { $sum += $n }; $sum / $numbers.Length

# command with transformation
PS> ($numbers | ForEach { func $_  } | Measure-Object -Average).Average
PS> $sum = 0; foreach ($n in $numbers) { $sum += func $n }; $sum / $numbers.Length

Max

Returns the maximum value in a sequence. If you specify an optional transformation function, Max returns the maximum value in a sequence after applying that transformation on each element.

LINQ in C#

LINQ in PowerShell

Native PowerShell

PS> [int[]] $numbers = 1..10000
PS> function func($n) { if ($n % 5) { 100 * $n } else { $n } }

# basic command
PS> ($numbers | Measure-Object -Maximum).Maximum
PS> $numbers | ForEach {if ($_ -gt $max) {$max=$_}} -Begin {$max=[int]::MinValue} -End {$max}
PS> $max=[int]::MinValue; foreach ($n in $numbers) { if ($n -gt $max) {$max=$n}}; $max

# command with transformation
PS> ($numbers | ForEach { func $_  } | Measure-Object -Maximum).Maximum
PS> $max=[int]::MinValue; foreach ($n in $numbers) {$n=func $n; if ($n -gt $max) {$max=$n}}; $max

Min

Returns the minimum value in a sequence. If you specify an optional transformation function, Min returns the minimum value in a sequence after applying that transformation on each element.

LINQ in C#

LINQ in PowerShell

Native PowerShell

PS> [int[]] $numbers = 1..10000
PS> function func($n) { if ($n % 5) { -100 * $n } else { $n } }

# basic command
PS> ($numbers | Measure-Object -Minimum).Minimum
PS> $numbers | ForEach {if ($_ -lt $min) {$min=$_}} -Begin {$min=[int]::MaxValue} -End {$min}
PS> $min=[int]::MaxValue; foreach ($n in $numbers) { if ($n -lt $min) {$min=$n}}; $min

# command with transformation
PS> ($numbers | ForEach { func $_  } | Measure-Object -Minimum).Minimum
PS> $min=[int]::MaxValue; foreach ($n in $numbers) {$n=func $n; if ($n -lt $min) {$min=$n}}; $min

Aggregate

Applies an accumulator function over a sequence. You specify a two-argument function to perform an arbitrary aggregation function of your choice. The first parameter is the accumulated result so far, which starts out as the first element of the sequence when no seed is given, and the second parameter is the next sequence element.

LINQ in C#

LINQ in PowerShell
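As a minimal sketch, here is a running product. The data and the delegate name are illustrative; the accumulator delegate takes the running value and the next element and returns the new running value.

```powershell
[int[]] $numbers = 1..5

# Func[int,int,int]: (running value, next element) -> new running value
$multiply = [Func[int,int,int]] { $args[0] * $args[1] }
$product = [Linq.Enumerable]::Aggregate($numbers, $multiply)

# Native counterpart for comparison
$nativeProduct = 1
foreach ($n in $numbers) { $nativeProduct *= $n }

"$product $nativeProduct"
```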

Native PowerShell

If you specify an initial seed, Aggregate applies an accumulator function over a sequence with that initial seed value. While the seed could just be (depending on your needs) some constant integer or constant string, it could also create an object that your accumulator function will call methods against, as shown next. Here a StringBuilder is created that is used in each step of the aggregation.

LINQ in C#

LINQ in PowerShell

Conversion

Cast

Casts the elements of an IEnumerable to the specified type, effectively converting IEnumerable to IEnumerable<T>, which then makes the sequence amenable to further LINQ operations. Alternately, it can be used like OfType which filters based on a specified type. However, whereas OfType ignores members that are not convertible to the target type, Cast throws an exception when it encounters such members, as the examples here reveal.

LINQ in C#

LINQ in PowerShell

As discussed in the introduction, generic calls need to be rewritten before being translated:

And that translates to PowerShell as:

PS> $stuff = @("12345", 12, "def")
PS> $stringType = "".GetType() # set to your target type
PS> $castForString =
    [Linq.Enumerable].GetMethod("Cast").MakeGenericMethod($stringType)
# The last comma below wraps the array arg $stuff within another array
PS> $castForString.Invoke($null, (,$stuff))
Unable to cast object of type 'System.Int32' to type 'System.String' 

Native PowerShell

PS> $stuff = @("12345", 12, "def")
PS> $stuff | ForEach-Object {
        if ($_ -is [string]) { $_ } else { throw "$($_): incompatible type" }
    }
12: incompatible type 

OfType

Filters the elements of an IEnumerable based on a specified type.

LINQ in C#

LINQ in PowerShell

As discussed in the introduction, generic calls need to be rewritten before being translated:

And that translates to PowerShell as:

PS> $stuff = @("12345", 12, "def")
PS> $stringType = "".GetType() # set to your target type
PS> $ofTypeForString =
    [Linq.Enumerable].GetMethod("OfType").MakeGenericMethod($stringType)
# The last comma below wraps the array arg $stuff within another array
PS> $ofTypeForString.Invoke($null, (,$stuff)) 
12345
def

Native PowerShell

ToArray

Creates an array from an IEnumerable<T>.

LINQ in C#

LINQ in PowerShell

If you have a deferred LINQ query, PowerShell will display its result set as if it were an array or list, but you cannot index into its elements until you materialize the query. An example shows this simply. Here, $query looks like an array or list when evaluated in line (2), but line (3) shows it is not:

(1)> $query = [Linq.Enumerable]::Range(0,4)
(2)> $query
0
1
2
3
(3)> $query[3]
Unable to index into an object of type System.Linq.Enumerable+<RangeIterator>d__110

Rather, you need to use ToArray to realize the results of the query:
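Continuing with the same $query, a minimal sketch (the name $array is illustrative):

```powershell
$query = [Linq.Enumerable]::Range(0, 4)

# Materialize the deferred query; the result is a real int[] that can be indexed.
$array = [Linq.Enumerable]::ToArray($query)
$array[3]
$array.GetType().Name
```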

ToList

Creates a List<T> from an IEnumerable<T>.

LINQ in C#

LINQ in PowerShell

If you have a deferred LINQ query, PowerShell will display its result set as if it were an array or list, but you cannot index into its elements until you materialize the query. An example shows this simply. Here, $query looks like an array or list when evaluated in line (2), but line (3) shows it is not:

(1)> $query = [Linq.Enumerable]::Range(0,4)
(2)> $query
0
1
2
3
(3)> $query[3]
Unable to index into an object of type System.Linq.Enumerable+<RangeIterator>d__110

Rather, you need to use ToList to realize the results of the query:
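Continuing with the same $query, a minimal sketch (the name $list is illustrative):

```powershell
$query = [Linq.Enumerable]::Range(0, 4)

# Materialize the deferred query as a List[int], which can be indexed and extended.
$list = [Linq.Enumerable]::ToList($query)
$list[3]
$list.Add(4)
$list.Count
```

Unlike the array that ToArray returns, the list can grow after it is created.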

ToDictionary

Creates a Dictionary<TKey, TValue> from an IEnumerable<T> according to a specified key selector function (person => person.SSN in this example).

A Dictionary is a one-to-one map, and is editable after creation. Querying on a non-existent key throws an exception. Contrast this with ToLookup.

(C# adapted from LINQ: Quickly Create Dictionaries with ToDictionary.)

LINQ in C#

LINQ in PowerShell
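A sketch of a setup consistent with the output shown further down. The Person class shape, the string-typed SSN, and the use of a typed array for $peopleList are assumptions on my part; the three people are taken from the dictionary output below.

```powershell
class Person {
    [string] $SSN
    [string] $FirstName
    [string] $Surname

    Person([string] $ssn, [string] $firstName, [string] $surname) {
        $this.SSN = $ssn
        $this.FirstName = $firstName
        $this.Surname = $surname
    }
}

[Person[]] $peopleList = @(
    [Person]::new('1001', 'Bob', 'Smith'),
    [Person]::new('2002', 'Jane', 'Doe'),
    [Person]::new('3003', 'Fester', 'Adams')
)

# Key selector: Person -> SSN (the C# equivalent is person => person.SSN)
$keyDelegate = [Func[Person,string]] { $args[0].SSN }
$dict = [Linq.Enumerable]::ToDictionary($peopleList, $keyDelegate)
$dict['2002'].FirstName
```

With only a key selector, each dictionary value is the whole Person object, so its properties remain available after the lookup.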

The value of the dictionary entry (TValue) is just the current input element from the sequence unless you specify the optional element selector function, in which case the value is computed with that function. The next example creates a composite full name for the value.

# Use the same setup as above, then just...
PS> $fullNameDelegate = [Func[Person,string]] { '{0} {1}' -f $args[0].FirstName, $args[0].Surname }
PS> $dict = [Linq.Enumerable]::ToDictionary($peopleList, $keyDelegate, $fullNameDelegate)
PS> $dict

Key  Value       
---  -----       
1001 Bob Smith   
2002 Jane Doe    
3003 Fester Adams

PS> $dict['1001']
Bob Smith

Native PowerShell

PowerShell uses hash tables natively. They work very much like .NET dictionaries but you have a Name and Value instead of a Key and Value:

# Use the same setup as above, then just...
PS> $peopleList | foreach { $hash = @{} } { $hash[$_.SSN] = $_ }
PS> $hash

Name                           Value                                                                                                                       
----                           -----                                                                                                                       
2002                           Person                                                                                                                      
3003                           Person                                                                                                                      
1001                           Person                                                                                                                      

PS> $hash['1001']

SSN  FirstName Surname
---  --------- -------
1001 Bob       Smith  

PS> $peopleList | foreach { $hash = @{} } `
        { $hash[$_.ssn] = ('{0} {1}' -f $_.FirstName, $_.Surname) }
PS> $hash

Name                           Value                                                                                                                       
----                           -----                                                                                                                       
2002                           Jane Doe                                                                                                                    
3003                           Fester Adams                                                                                                                
1001                           Bob Smith

PS> $hash['1001']
Bob Smith

ToLookup

Creates a Lookup<TKey, TElement> from an IEnumerable<T> according to a specified key selector function (c => c.Length in this example). If the optional element selector function is also provided, the value of the lookup element (TElement) is computed with that function (not used in this example; see ToDictionary for a sample usage).

A Lookup is a one-to-many map that is not mutable after creation. Querying on a non-existent key returns an empty sequence. Contrast this with ToDictionary. (Note that Lookup<TKey,TValue> is roughly comparable to a Dictionary<TKey,IEnumerable<TValue>>. Thanks to Marc Gravell for this tip on Stack Overflow.)

LINQ in C#

LINQ in PowerShell

PS> [string[]]$colors = @(
    "green",
    "blue",
    "red",
    "yellow",
    "orange",
    "black"
)
PS> $lengthDelegate = [Func[string,int]] { $args[0].Length }
PS> $lookup = [Linq.Enumerable]::ToLookup($colors, $lengthDelegate)
# Keys in the Lookup are string lengths per the given delegate
PS> $lookup[6]
yellow
orange
# But the result is not a list or array yet!
PS> $lookup[6].GetType().Name
Grouping
PS> $6LetterColors = [Linq.Enumerable]::ToArray($lookup[6])
PS> $6LetterColors[0]
yellow

Native PowerShell

ToLookup groups objects by a certain key… which is just what Group-Object does when you specify the -AsHashTable parameter.

# Use the same setup as above, then just...
PS> $groups = $colors | Group-Object -Property Length -AsHashTable
PS> $groups

Name                           Value                                                                                                                       
----                           -----                                                                                                                       
6                              {yellow, orange}                                                                                                            
5                              {green, black}                                                                                                              
4                              {blue}                                                                                                                      
3                              {red}                                                                                                                       

Elements

First

Returns the first element of a sequence. Throws an exception if the sequence contains no elements. Note that evaluation stops at the first element in the sequence; the remainder of the sequence is not evaluated.

If you specify an optional condition, First returns the first element in a sequence that satisfies that condition. Throws an exception if no elements satisfy the condition. Note that evaluation stops at the first element satisfying the condition in the sequence; the remainder of the sequence is not evaluated.

LINQ in C#

LINQ in PowerShell
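A minimal sketch, assuming a small [int[]] array and an illustrative condition (greater than 4); the native pipeline counterparts are included for comparison.

```powershell
[int[]] $numbers = 2, 0, 5, -11, 29

# LINQ: first element, then first element satisfying a condition
$first       = [Linq.Enumerable]::First($numbers)
$firstAbove4 = [Linq.Enumerable]::First($numbers, [Func[int,bool]] { $args[0] -gt 4 })

# Native counterparts; Select-Object -First stops the pipeline once it has its element.
$nativeFirst       = $numbers | Select-Object -First 1
$nativeFirstAbove4 = $numbers | Where-Object { $_ -gt 4 } | Select-Object -First 1

"$first $firstAbove4 $nativeFirst $nativeFirstAbove4"
```

Note one difference: the native pipeline yields nothing when there is no match, whereas First throws.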

Native PowerShell

FirstOrDefault

Returns the first element of a sequence, or a default value if the sequence contains no elements. Note that evaluation stops at the first element in the sequence; the remainder of the sequence is not evaluated.

If you specify an optional condition, FirstOrDefault returns the first element in a sequence that satisfies that condition, or a default value if no element satisfies it. Note that evaluation stops at the first element satisfying the condition in the sequence; the remainder of the sequence is not evaluated.

LINQ in C#

LINQ in PowerShell

Native PowerShell

Last

Returns the last element of a sequence. Throws an exception if the sequence contains no elements. The entire sequence must be evaluated to get to the last element.

If you specify an optional condition, Last returns the last element of a sequence that satisfies that condition. Throws an exception if no element satisfies the condition. The entire sequence must be evaluated to identify the target element, even if it ends up not being the actual last one in the sequence.

LINQ in C#

LINQ in PowerShell
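A minimal sketch, assuming a small [int[]] array and an illustrative condition (less than 4); the native counterparts are included for comparison.

```powershell
[int[]] $numbers = 2, 0, 5, -11, 29

# LINQ: last element, then last element satisfying a condition
$last       = [Linq.Enumerable]::Last($numbers)
$lastBelow4 = [Linq.Enumerable]::Last($numbers, [Func[int,bool]] { $args[0] -lt 4 })

# Native counterparts: negative indexing, or Select-Object -Last on a filtered pipeline
$nativeLast       = $numbers[-1]
$nativeLastBelow4 = $numbers | Where-Object { $_ -lt 4 } | Select-Object -Last 1

"$last $lastBelow4 $nativeLast $nativeLastBelow4"
```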

Native PowerShell

LastOrDefault

Returns the last element of a sequence, or a default value if the sequence contains no elements. The entire sequence must be evaluated to get to the last element.

If you specify an optional condition, LastOrDefault returns the last element of a sequence that satisfies that condition, or a default value if no element satisfies it. The entire sequence must be evaluated to identify the target element, even if it ends up not being the actual last one in the sequence.

LINQ in C#

LINQ in PowerShell

Native PowerShell

ElementAt

Returns the element at a specified index (zero-based) in a sequence. Throws an exception if the index is out of range.

LINQ in C#

LINQ in PowerShell
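A minimal sketch, assuming a small [int[]] array (the names are illustrative); native array indexing is shown alongside.

```powershell
[int[]] $numbers = 2, 0, 5, -11, 29

# LINQ: zero-based index passed as the second argument
$third = [Linq.Enumerable]::ElementAt($numbers, 2)

# Native counterpart: ordinary array indexing
$nativeThird = $numbers[2]

"$third $nativeThird"
```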

Native PowerShell

ElementAtOrDefault

Returns the element at a specified index (zero-based) in a sequence, or a default value if the index is out of range.

LINQ in C#

LINQ in PowerShell

Native PowerShell

Single

Returns the only element of a sequence. Throws an exception if the sequence is empty or contains more than one element.

If you specify an optional condition, Single returns the only element in a sequence that satisfies that condition. Throws an exception if either no elements or more than one element satisfy the condition.

LINQ in C#

LINQ in PowerShell

Native PowerShell

SingleOrDefault

Returns the only element of a sequence or a default value if the sequence is empty. Throws an exception if the sequence contains more than one element.

If you specify an optional condition, SingleOrDefault returns the only element in a sequence that satisfies that condition, or a default value if no element satisfies it. Throws an exception if more than one element satisfies the condition.

LINQ in C#

LINQ in PowerShell

Native PowerShell

PS> $numbers = @(2, 0, 5, -11, 29)
PS> $result = @($numbers | Where { $_ -gt 42 }) # force into an array
PS> if ($result.Length -gt 1) { throw "Sequence contains more than one element" }
PS> if ($result.Length -eq 1) { $result[0] } else { 0 }
0

Generation

Range

Generates a sequence of integral numbers within a specified range.

LINQ in C#

LINQ in PowerShell
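A minimal sketch (the variable names are illustrative). Note that the second argument to Range is a count, not an end value, so it differs from the native range operator.

```powershell
# LINQ: Range(start, count) yields count consecutive integers beginning at start.
$linqRange = [Linq.Enumerable]::ToArray([Linq.Enumerable]::Range(0, 4))

# Native counterpart: the range operator takes a start and an inclusive end.
$nativeRange = 0..3

"$($linqRange -join ',') | $($nativeRange -join ',')"
```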

Native PowerShell

Repeat

Generates a sequence that contains a repeated value a specified number of times.

LINQ in C#

LINQ in PowerShell
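A minimal sketch, using an illustrative string value. The element type is inferred from the first argument, so no explicit typing is needed here.

```powershell
# LINQ: Repeat(element, count)
$linqRepeat = [Linq.Enumerable]::ToArray([Linq.Enumerable]::Repeat('ab', 3))

# Native counterpart: multiplying an array repeats its contents.
$nativeRepeat = @('ab') * 3

"$($linqRepeat -join ',') | $($nativeRepeat -join ',')"
```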

Native PowerShell

Empty

Returns an empty IEnumerable<T> that has the specified type argument.

LINQ in C#

LINQ in PowerShell

And that translates to PowerShell as:

PS> $stringType = "".GetType() # set to your target type
PS> $emptyForString =
    [Linq.Enumerable].GetMethod("Empty").MakeGenericMethod($stringType)
# Empty takes no arguments, so pass an empty argument array to Invoke
PS> $emptyList = $emptyForString.Invoke($null, @()) 
PS> $emptyList.Count
0
PS> $emptyList.GetType().name
String[]

Native PowerShell

DefaultIfEmpty

Returns the elements of the sequence or, if the sequence is empty, a singleton collection containing the default value for the element type (or the default value you explicitly specify). In this first example, the list is not empty so it returns the original sequence.

LINQ in C#

LINQ in PowerShell

But if the condition is changed so the filtered list has no elements, then:

Native PowerShell

PS> $words = @( "one","two","three" )
PS> $filteredWords = $words | Where-Object { $_.Length -eq 3 }
PS> if ($filteredWords) { $filteredWords } else { "unknown" }
one
two

# Again, the filter is changed here to now filter out all items
PS> $filteredWords = $words | Where-Object { $_.Length -eq 2 }
PS> if ($filteredWords) { $filteredWords } else { "unknown" }
unknown

Grouping

GroupBy

Groups the elements of a sequence according to a specified key selector function (pet => pet.Age). In the result, the second group is expanded to show its contents, containing 2 members of age 4. Notice that the elements of the group are objects of the original type, Pet.

LINQ in C#

LINQ in PowerShell

If you specify an optional projection function, GroupBy further projects the elements for each group with that function (pet => pet.Name in this next example). In the result, the second group is expanded to show its contents, containing 2 members of age 4. Notice that the elements of the group are now comprised of just the projected property, the pet’s name.

Native PowerShell

# Use the same setup as above, then just...
PS> $groups = $pets | Group-Object -Property Age
PS> $groups
Count Name                      Group                                                                                                                      
----- ----                      -----                                                                                                                      
    1 8                         {Pet}                                                                                                                      
    2 4                         {Pet, Pet}                                                                                                                 
    1 1                         {Pet}   
PS> $groups[1].Group
Name  Age
----  ---
Boots   4
Daisy   4

Join

Cross Join

Correlates the elements of two sequences based on matching keys. If the first sequence has no corresponding elements in the second sequence, it is not represented in the result. Join is equivalent to an inner join in SQL.

LINQ in C#

LINQ in PowerShell

Native PowerShell

# Use the same setup as above, then just...
PS> $users | ForEach-Object { 
        $user = $_
        $book = $books | Where-Object Id -eq $user.Id
        if ($book) { "{0} => {1}" -f $user.Name, $book.Title }
    }
Sam => Beowulf
Dean => Bates Motel
Crowley => Inferno
Castiel => Heaven Can Wait

Group Join

Correlates the elements of two sequences based on equality of keys and groups the results. If the first sequence has no corresponding elements in the second sequence, it is still represented in the result but its group contains no members. In the example, notice that user Chuck (id=4) has no books associated with him. Group Join is equivalent to a left outer join in SQL.

LINQ in C#

LINQ in PowerShell

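While this example is extremely similar to that for Cross Join above, it has one key change: the result delegate receives the whole sequence of matching books rather than a single book. That middle type argument must be declared as IEnumerable[Book] rather than Book[], as noted towards the bottom.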

class User
{
    [int] $Id;
    [string] $Name;

    User($id, $name) {
        $this.Id = $id
        $this.Name = $name
    }
}

class Book
{
    [int] $Id;
    [string] $Title;
    Book($id, $title) {
        $this.Id = $id
        $this.Title = $title
    }
}

[User[]]$users = @(
    [User]::new(1, "Sam"),
    [User]::new(6, "Dean"),
    [User]::new(3, "Crowley"),
    [User]::new(4, "Chuck"),
    [User]::new(5, "Castiel")
)

[Book[]]$books = @(
    [Book]::new(3, "Inferno"),
    [Book]::new(1, "Inferno"),
    [Book]::new(9, "Bliss"),
    [Book]::new(5, "Heaven Can Wait"),
    [Book]::new(1, "Beowulf"),
    [Book]::new(6, "Bates Motel")
)

PS> $outerKeyDelegate = [Func[User,int]] { $args[0].Id }
PS> $innerKeyDelegate = [Func[Book,int]] { $args[0].Id }

# Thanks to reader "ili" for deciphering the needed middle type here!
# Turns out we need an IEnumerable[Book] rather than Book[] as I tried.
PS> $resultDelegate = [Func[User,[Collections.Generic.IEnumerable[Book]],string]] `
         { '{0} => {1}' -f $args[0].Name, $args[1].Count }

PS> [Linq.Enumerable]::GroupJoin(
         $users, $books, $outerKeyDelegate, $innerKeyDelegate, $resultDelegate)

Sam => 2 
Dean => 1 
Crowley => 1 
Chuck => 0 
Castiel => 1

Concat

Concatenates two sequences into a single sequence; further LINQ operations would then operate on the new, combined sequence.

LINQ in C#

LINQ in PowerShell
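
A minimal sketch of the call, assuming two typed integer arrays (the names $first and $second are illustrative, not from the original example):

```powershell
# Typed arrays let PowerShell infer the generic type argument of Concat.
[int[]] $first  = 1, 2, 3
[int[]] $second = 4, 5, 6

# @(...) enumerates the lazy LINQ result into an ordinary array.
$combined = @([Linq.Enumerable]::Concat($first, $second))
$combined -join ', '   # 1, 2, 3, 4, 5, 6
```

In native PowerShell the array addition operator gives the same result for arrays: $first + $second.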

Native PowerShell

Zip

Applies a specified function to the corresponding elements of two sequences, producing a new sequence of the results. If the first sequence is longer than the second, one element past the common length will be evaluated (“d” in the example) at which point a determination is made that the second sequence has been consumed, and further evaluation stops (so “e” is not evaluated). If the second sequence is longer than the first, its extra values will not be evaluated at all. Note that “zip” in this context has nothing to do with zip archives!

LINQ in C#

LINQ in PowerShell
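
The following sketch pairs letters with numbers; the variable names and the delegate body are assumptions chosen for illustration. The first sequence is deliberately longer so that the surplus elements ("d" and "e") never reach the result.

```powershell
[string[]] $letters = "a", "b", "c", "d", "e"
[int[]]    $numbers = 1, 2, 3

# The delegate takes one element from each sequence and returns the combined value.
$pair = [Func[string,int,string]] { "$($args[0])$($args[1])" }

$zipped = @([Linq.Enumerable]::Zip($letters, $numbers, $pair))
$zipped -join ', '   # a1, b2, c3
```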

Native PowerShell

Ordering

OrderBy

Sorts the elements of a sequence in ascending order according to a key selector function.

LINQ in C#

LINQ in PowerShell
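
One way to make the call, sketched here with a key selector that simply returns the string itself (the delegate name $byValue is illustrative):

```powershell
[string[]] $StringData = "unn", "dew", "tri", "peswar", "pymp"

# The key selector receives each element and returns the value to sort by.
$byValue = [Func[string,string]] { $args[0] }

$sorted = @([Linq.Enumerable]::OrderBy($StringData, $byValue))
$sorted -join ', '   # dew, peswar, pymp, tri, unn
```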

Native PowerShell

PS> $StringData = @("unn", "dew", "tri", "peswar", "pymp")
# simple strings; no properties need to be used; compare to next example
PS> $StringData | Sort-Object 
dew
peswar
pymp
tri
unn 

OrderByDescending

Sorts the elements of a sequence in descending order according to a key selector function.

LINQ in C#

LINQ in PowerShell
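
As a hedged illustration, this sketch sorts by string length rather than by value (an assumption made to show a non-trivial key). The sort is stable, so elements with equal keys keep their original relative order.

```powershell
[string[]] $StringData = "unn", "dew", "tri", "peswar", "pymp"

# Sort on the length of each string, longest first.
$byLength = [Func[string,int]] { $args[0].Length }

$sorted = @([Linq.Enumerable]::OrderByDescending($StringData, $byLength))
$sorted -join ', '   # peswar, pymp, unn, dew, tri
```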

Native PowerShell

ThenBy

Performs a subsequent ordering of the elements in a sequence in ascending order according to a key selector function (d => d.Month in this example). Note that unlike most other LINQ operators, which accept an IEnumerable<T> input, ThenBy accepts an IOrderedEnumerable<T> input—which happens to be the output of OrderBy.

LINQ in C#

LINQ in PowerShell
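
A sketch under assumed data: four dates sorted by year and then by month. The dates and delegate names are hypothetical; the point is that ThenBy is fed the ordered result of OrderBy rather than the raw array.

```powershell
[DateTime[]] $dates = @(
    [DateTime]::new(2018, 3, 1),
    [DateTime]::new(2017, 9, 1),
    [DateTime]::new(2018, 1, 1),
    [DateTime]::new(2017, 2, 1)
)

$byYear  = [Func[DateTime,int]] { $args[0].Year }
$byMonth = [Func[DateTime,int]] { $args[0].Month }

# OrderBy produces the IOrderedEnumerable that ThenBy requires as input.
$ordered = [Linq.Enumerable]::OrderBy($dates, $byYear)
$result  = @([Linq.Enumerable]::ThenBy($ordered, $byMonth))

# Show year and month as one number each: 201702, 201709, 201801, 201803
($result | ForEach-Object { $_.Year * 100 + $_.Month }) -join ', '
```

The native counterpart would list both properties in a single sort: $dates | Sort-Object Year, Month.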

Native PowerShell

ThenByDescending

Performs a subsequent ordering of the elements in a sequence in descending order according to a key selector function. Note that unlike most other LINQ operators, which accept an IEnumerable<T> input, ThenByDescending accepts an IOrderedEnumerable<T> input, which happens to be the output of OrderBy.

Works identically to ThenBy except you set the Descending property to true in the native PowerShell example.

Reverse

Inverts the order of the elements in a sequence.

LINQ in C#

LINQ in PowerShell
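
A short sketch, assuming a typed string array. Unlike [array]::Reverse, the LINQ call returns a new sequence and leaves the input untouched.

```powershell
[string[]] $StringData = "unn", "dew", "tri", "peswar", "pymp"

$reversed = @([Linq.Enumerable]::Reverse($StringData))
$reversed -join ', '     # pymp, peswar, tri, dew, unn
$StringData -join ', '   # unn, dew, tri, peswar, pymp (unchanged)
```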

Native PowerShell

PS> $StringData = @("unn", "dew", "tri", "peswar", "pymp")
# Careful! This call modifies the *original* array and does *not* output it.
PS> [array]::Reverse($StringData)
# Then to see the output:
PS> $StringData
pymp
peswar
tri
dew
unn

Partitioning

Take

Returns a specified number of elements from the start of a sequence. Evaluation of the sequence stops after that as no further elements are needed.

LINQ in C#

LINQ in PowerShell
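
A minimal sketch with an assumed array of the integers 1 through 10:

```powershell
[int[]] $numbers = 1..10

# Only the first three elements are read from the source.
$firstThree = @([Linq.Enumerable]::Take($numbers, 3))
$firstThree -join ', '   # 1, 2, 3
```

The closest native form is $numbers | Select-Object -First 3.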

Native PowerShell

Skip

Bypasses a specified number of elements in a sequence and then returns the remaining elements.

LINQ in C#

LINQ in PowerShell
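
Sketched against an assumed array of six integers:

```powershell
[int[]] $numbers = 1..6

# Drop the first three elements and keep whatever follows.
$rest = @([Linq.Enumerable]::Skip($numbers, 3))
$rest -join ', '   # 4, 5, 6
```

The closest native form is $numbers | Select-Object -Skip 3.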

Native PowerShell

TakeWhile

Returns elements from the start of a sequence as long as a specified condition is true. Evaluation of the sequence stops after that as no further elements are needed.

LINQ in C#

LINQ in PowerShell
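
The sample data below is an assumption picked to show the stopping behavior: the 4 after the 10 satisfies the condition but is never returned, because evaluation ends at the first failure.

```powershell
[int[]] $numbers = 1, 2, 3, 10, 4, 5

# The predicate delegate must return a Boolean for each element.
$isSmall = [Func[int,bool]] { $args[0] -lt 5 }

$leading = @([Linq.Enumerable]::TakeWhile($numbers, $isSmall))
$leading -join ', '   # 1, 2, 3
```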

Native PowerShell

PowerShell does not have an equivalent one-liner to do a TakeWhile, but with the Take-While function created by JaredPar, you could just do this:

SkipWhile

Bypasses elements in a sequence as long as a specified condition is true and then returns the remaining elements.

LINQ in C#

LINQ in PowerShell
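
A sketch using assumed sample data. Once the condition first fails (at 10), every remaining element is returned, including later ones that would have satisfied it.

```powershell
[int[]] $numbers = 1, 2, 3, 10, 4, 5

$isSmall = [Func[int,bool]] { $args[0] -lt 5 }

$remaining = @([Linq.Enumerable]::SkipWhile($numbers, $isSmall))
$remaining -join ', '   # 10, 4, 5
```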

Native PowerShell

PowerShell does not have an equivalent one-liner to do a SkipWhile, but with the Skip-While function created by JaredPar, you could just do this:

Projection

Select

Applies a specified transformation to each element of a sequence; this transformation is generally referred to as “projection”. Often you might project into a new object that is a subset of the original object, essentially discarding unneeded properties. In the illustration, the sequence is transformed to a new sequence with just the DayOfYear property.

LINQ in C#

LINQ in PowerShell
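
As an illustration of the DayOfYear projection described above, here is a sketch over three assumed dates in 2017 (the dates are hypothetical):

```powershell
[DateTime[]] $dates = @(
    [DateTime]::new(2017, 1, 1),
    [DateTime]::new(2017, 2, 1),
    [DateTime]::new(2017, 12, 31)
)

# Project each DateTime to a single integer property.
$toDayOfYear = [Func[DateTime,int]] { $args[0].DayOfYear }

$dayNumbers = @([Linq.Enumerable]::Select($dates, $toDayOfYear))
$dayNumbers -join ', '   # 1, 32, 365
```

A native counterpart would be $dates | ForEach-Object DayOfYear.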

Native PowerShell

SelectMany

Projects each element of a sequence to an IEnumerable<T> and flattens the resulting sequences into a single sequence. If, in the illustration, Select had been used instead of SelectMany, each element of the result would be an entire User array (i.e. the result would be a list of string arrays) rather than a single string, and the result would be just a 2-element list rather than a 6-element list.

LINQ in C#

LINQ in PowerShell

class DayTally
{
    [DateTime] $Day;
    [string[]] $User;
    
    DayTally([DateTime] $day, [string[]] $user) {
        $this.Day = $day;
        $this.User = $user;
    }
}

[DayTally[]]$days = @(
    [DayTally]::new(
        (Get-Date -Year 2017 -Month 10 -Day 23), 
        [string[]] @( "user1", "user2", "user5", "user4" ));
    [DayTally]::new(
        (Get-Date -Year 2017 -Month 2 -Day 5), 
        [string[]] @( "user3", "user6" ));
)

# Careful with the delegate signature! E.g. change 'string[]' to 'string' and watch what happens
PS> [Func[DayTally,string[]]] $delegate = { return $args[0].User }
PS> [Linq.Enumerable]::SelectMany($days, $delegate)
user1
user2
user5
user4
user3
user6

Native PowerShell

# Use the same setup as above, then just...
PS> $days | Select -ExpandProperty User
user1
user2
user5
user4
user3
user6

Quantifiers

Any

Determines whether any element of a sequence (i.e. at least one element) satisfies a condition. All elements of the sequence need to be evaluated to provide a false result (first figure). However, if at any time during evaluating the sequence an element evaluates to true, the sequence evaluation stops at that element (second figure). Of course, if only the last element satisfies the condition, all elements will need to be evaluated and true will be returned.

LINQ in C#

LINQ in PowerShell
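
A minimal sketch with assumed data and predicates, showing one condition that some element meets and one that none does:

```powershell
[int[]] $numbers = 1, 2, 3, 4, 5

$overFour = [Func[int,bool]] { $args[0] -gt 4 }
$overNine = [Func[int,bool]] { $args[0] -gt 9 }

$anyOverFour = [Linq.Enumerable]::Any($numbers, $overFour)   # True
$anyOverNine = [Linq.Enumerable]::Any($numbers, $overNine)   # False
```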

Native PowerShell

PowerShell does not have an equivalent one-liner to do Any, but there are a variety of suggestions to implement Test-Any in this StackOverflow post. The key is stopping the pipeline once you make a determination; see the discussion on StackOverflow for details.

All

Determines whether all elements of a sequence satisfy a condition. All elements of the sequence need to be evaluated to provide a true result (first figure). However, if at any time during evaluating the sequence an element evaluates to false, the sequence evaluation stops at that element (second figure). Of course, if only the last element fails to satisfy the condition, all elements will need to be evaluated and false will be returned.

LINQ in C#

LINQ in PowerShell
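
A hedged sketch over assumed data: every element is even, but not every element is greater than 2.

```powershell
[int[]] $numbers = 2, 4, 6, 8

$isEven  = [Func[int,bool]] { $args[0] % 2 -eq 0 }
$overTwo = [Func[int,bool]] { $args[0] -gt 2 }

$allEven    = [Linq.Enumerable]::All($numbers, $isEven)    # True
$allOverTwo = [Linq.Enumerable]::All($numbers, $overTwo)   # False: fails on the first element
```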

Native PowerShell

PowerShell does not have an equivalent one-liner to do All, but this StackOverflow post shows how to implement a Test-All function. (Note, however, that that function does not optimize performance by stopping the pipeline once a determination is made; see comments on Any.)

Contains

Determines whether a sequence contains a specified element. The sequence may, of course, contain objects of an arbitrary type. In the case of strings, however, note that this method matches against each element in its entirety. Contrast this to the string method Contains that determines whether a string matches against a substring. (See the example for All.)

LINQ in C#

LINQ in PowerShell
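
The whole-element matching can be sketched as follows; the search values are assumptions chosen to show a hit, a substring miss, and a case mismatch.

```powershell
[string[]] $StringData = "unn", "dew", "tri", "peswar", "pymp"

$hasDew   = [Linq.Enumerable]::Contains($StringData, "dew")   # True: matches a whole element
$hasPes   = [Linq.Enumerable]::Contains($StringData, "pes")   # False: only a substring of "peswar"
$hasUpper = [Linq.Enumerable]::Contains($StringData, "DEW")   # False: default comparison is case-sensitive
```

Be aware that the native -contains operator is case-insensitive by default, so $StringData -contains "DEW" returns True.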

Native PowerShell

SequenceEqual

Determines whether two sequences are equal; specifically, if the two sequences contain the same elements in the same order. When dealing with value types, as in the illustration, the use is intuitive: the lists differ at the third position so a determination has been made that they are different, and no further elements of the sequence need to be evaluated. Note that if you use reference types, the elements are compared with the type's default equality comparer. Unless the type overrides Equals, that means reference equality: they need to be the actual, same object, not just objects with all the same property values.

LINQ in C#

LINQ in PowerShell
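
A sketch with three assumed integer arrays: $b holds the same values as $a but differs in order from the third position on, while $c is an exact copy.

```powershell
[int[]] $a = 1, 2, 3, 4
[int[]] $b = 1, 2, 4, 3
[int[]] $c = 1, 2, 3, 4

$sameAsB = [Linq.Enumerable]::SequenceEqual($a, $b)   # False: order matters
$sameAsC = [Linq.Enumerable]::SequenceEqual($a, $c)   # True
```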

Native PowerShell

This is similar, in that it compares two sequences, but it is order-independent. This example will return true here while the LINQ expression above returned false.

Restriction (Filtering)

Where

Filters a sequence of values based on a predicate.

LINQ in C#

LINQ in PowerShell
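
A minimal sketch that keeps the even numbers from an assumed array (the predicate is illustrative):

```powershell
[int[]] $numbers = 1..6

# The predicate returns True for elements that should be kept.
$isEven = [Func[int,bool]] { $args[0] % 2 -eq 0 }

$evens = @([Linq.Enumerable]::Where($numbers, $isEven))
$evens -join ', '   # 2, 4, 6
```

The native counterpart is $numbers | Where-Object { $_ % 2 -eq 0 }.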

Native PowerShell

Sets

Distinct

Returns distinct elements from a sequence. Note that the sequence does not need to be sorted.

LINQ in C#

LINQ in PowerShell
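
Sketched here with an assumed unsorted array containing repeats. The result is sorted only for display, since the sketch makes no assumption about the order Distinct yields.

```powershell
[int[]] $numbers = 3, 1, 3, 2, 1

$unique = @([Linq.Enumerable]::Distinct($numbers))
$unique.Count                        # 3
($unique | Sort-Object) -join ', '   # 1, 2, 3
```

A native counterpart is $numbers | Select-Object -Unique.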

Native PowerShell

Union

Produces the set union of two sequences. Includes the elements of both sequences, but without duplication.

LINQ in C#

LINQ in PowerShell
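
A short sketch with two assumed arrays that overlap on the value 3:

```powershell
[int[]] $first  = 1, 2, 3
[int[]] $second = 3, 4, 5

# Elements of $first come out first, then any new elements from $second.
$union = @([Linq.Enumerable]::Union($first, $second))
$union -join ', '   # 1, 2, 3, 4, 5
```

One native approximation is $first + $second | Select-Object -Unique.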

Native PowerShell

Intersection

Produces the set intersection of two sequences. Just those elements that exist in both sequences appear in the result.

LINQ in C#

LINQ in PowerShell
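
The LINQ method for this operator is named Intersect. A sketch with two assumed arrays that share the values 3 and 4:

```powershell
[int[]] $first  = 1, 2, 3, 4
[int[]] $second = 3, 4, 5

$common = @([Linq.Enumerable]::Intersect($first, $second))
$common -join ', '   # 3, 4
```

One native approximation is $first | Where-Object { $second -contains $_ }.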

Native PowerShell

Except

Produces the set difference of one sequence with a second sequence. Just those elements that exist in the first sequence and do not exist in the second sequence appear in the result.

LINQ in C#

LINQ in PowerShell
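
A sketch with assumed data; the value 6 appears only in the second array, so it plays no part in the result. The output is sorted for display because the sketch does not rely on the order Except yields.

```powershell
[int[]] $first  = 1, 2, 3, 4, 5
[int[]] $second = 4, 5, 6

$onlyInFirst = @([Linq.Enumerable]::Except($first, $second))
($onlyInFirst | Sort-Object) -join ', '   # 1, 2, 3
```

One native approximation is $first | Where-Object { $second -notcontains $_ }.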

Native PowerShell

Conclusion

So, yes! LINQ can be done in PowerShell. Depending on the operator, it can be rather burdensome to do so. PowerShell is designed to be quick and easy to use, so be sure that you need the performance boost that LINQ can offer and, indeed, make sure that there is a performance boost for your data, and most importantly that the resulting data is correct. As part of your analysis, you may find it useful to have the accompanying wallchart that, for example, shows you at a glance which operators use deferred execution and which use immediate execution. Click here to download the PDF reference chart.

About the author

Michael Sorens

Michael Sorens is passionate about productivity, process, and quality. Besides working at a variety of companies from Fortune 500 firms to Silicon Valley startups, he enjoys spreading the seeds of good design wherever possible, having written over 100 articles, more than a dozen wallcharts, and posted in excess of 200 answers on StackOverflow. You can also find his open source projects on SourceForge and GitHub (notably SqlDiffFramework, a DB comparison tool for heterogeneous systems including SQL Server, Oracle, and MySql).