Down the Rabbit Hole: A Study in PowerShell Pipelines, Functions, and Parameters

For an experienced programmer, learning a new language can be a journey much like Alice’s in Wonderland: paradoxes, unexpected twists, blind tangents, bafflements, and pleasant surprises. Michael comes to the rescue of anyone learning PowerShell with an explanation of how to use PowerShell functions.


PowerShell™ is a shell language for Windows® that combines the best of a scripting language with the best of a compiled language. PowerShell lets you do anything batch files could do but with much more power, elegance, and simplicity, and it lets you do anything C# can do too; indeed, you can easily invoke any .NET code you like from within PowerShell. So naturally, PowerShell provides the capability to create functions, so that you can encapsulate code as you would with a “regular” programming language. To create and use PowerShell functions effectively, however, you must properly understand the rules of engagement.

“Why, sometimes I’ve believed as many as six impossible things before breakfast.”
–The White Queen. Chapter 5, Through the Looking Glass (Lewis Carroll)

Functions, beyond their most basic form, are fraught with pitfalls and seemingly perplexing results. Indeed, you may on occasion find yourself thinking “that is just not possible!?!”

As you read this article, you may discover that there are fewer “impossibilities” than you first thought…

“Oh dear, what nonsense I’m talking!”
–Alice. Chapter 2, Alice’s Adventures in Wonderland (Lewis Carroll)

What’s the connection to Alice in Wonderland? Learning a new technology is a journey quite analogous to that which Alice experienced: unexpected twists, tangents leading off to nowhere, delightful surprises… This is the second installment of my PowerShell adventure; see my first tale on this journey at Harnessing PowerShell’s String Comparison and List-Filtering Features.

Syntax of a Function Call

In Calling Functions in PowerShell, Thomas Lee provides a succinct example of one pitfall that even experienced PowerShell scripters will occasionally tumble into: writing a basic function call that produces unexpected results. Here is my version of his script, revised for clarity and exposition:

Function

function f($a, $b, $c)
{
    "`$a=$a"
    "`$b=$b"
    "`$c=$c"
}

Call

write-host -BackgroundColor Red '(1) f("Foo","Bar","foobar")'
f("Foo","Bar","foobar")

write-host -BackgroundColor Red '(2) f "Foo","Bar","foobar"'
f "Foo","Bar","foobar"

#write-host -BackgroundColor Red '(3) f ("Foo" "Bar" "foobar")'
#f ("Foo" "Bar" "foobar")

write-host -BackgroundColor Green '(4) f "Foo" "Bar" "foobar"'
f "Foo" "Bar" "foobar"

write-host -BackgroundColor Green '(5) f -c "foobar" -a "Foo" -b "Bar"'
f -c "foobar" -a "Foo" -b "Bar"

Result

(1) f("Foo","Bar","foobar")
$a=Foo Bar foobar
$b=
$c=

(2) f "Foo","Bar","foobar"
$a=Foo Bar foobar
$b=
$c=

(4) f "Foo" "Bar" "foobar"
$a=Foo
$b=Bar
$c=foobar

(5) f -c "foobar" -a "Foo" -b "Bar"
$a=Foo
$b=Bar
$c=foobar

Example 1: Writing a basic function call

I think I should understand that better, if I had it written down: but I can’t quite follow it as you say it.
–Alice, Chapter 9, Alice’s Adventures in Wonderland (Lewis Carroll)

This script defines a function f that takes three parameters and simply writes out their values. The main code (the Call section of the table) writes out a function call variation and then executes that variation, producing the tidy, though curious, result shown.

Well, then, let me explain it a bit further:

Variation (1) looks like a typical function call (e.g. in C#) but all the parameters end up in $a, with nothing in $b or $c! (I’ve highlighted it in red, because this is not the desired outcome.) This result occurs because parentheses in PowerShell do not signify function arguments; rather they signify an expression to evaluate. In this case, the expression returns an array of 3 elements. That result is then supplied as the first positional parameter of the function, leaving nothing for the remaining parameters. You can combat this pitfall easily with the Set-StrictMode cmdlet:
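A minimal sketch (strict mode version 2.0 or later includes the relevant check; adjust the version to suit your environment):

Set-StrictMode -Version 2.0
f("Foo","Bar","foobar")   # with strict mode on, this now fails with an error instead of silently binding everything to $a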

Alas, just like in Visual Basic®, PowerShell does not enable strict mode by default; it is up to you to enable it in your scripts and modules in order to catch the improper use of parentheses when calling a function (among other things). Do not be lulled into thinking of strict mode as a fail-safe for all pitfalls, however! Even just adding a space between the function name and the left parenthesis makes strict mode turn a blind eye to a potential problem; see the syntax pitfalls section of the accompanying wallchart.

Variation (2) produces… the same thing! This is because commas also indicate members of an array, even when not contained in parentheses. The results are thus the same as for Variation (1) but through quite a different mechanism.

Variation (3) is commented out in the code because it is not syntactically valid. Remember that parentheses indicate an expression to evaluate, but a space-separated list is not a valid expression. Space-separated items are valid only when used as individual parameters in a function call.

Variation (4) (highlighted in green) reveals the true way to call a function with positional parameters: separate the parameters with white space, and use no parentheses!

Finally, variation (5) shows that you are not limited to positional parameters; you may also use named parameters, in which case the order is not significant.

Typed Parameters

Just as PowerShell does not set strict mode by default, it does not require you to strongly type your function parameters (sigh). So you can define a function to square an integer like this:
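A minimal sketch of such a function (the original listing is not reproduced here, so take this as representative):

function f($x)
{
    $x * $x
}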

Executing f 15 yields 225. But passing a string (e.g. f "a") results in an error on the line attempting to use $x as an integer. While in this case the error is readily revealed, your code may or may not make it so obvious. PowerShell does not require you to type your parameters but it does allow you to. Thus, it is much better to let the function’s signature serve as the arbiter of good inputs by specifying the type of the parameters (using any .NET type name):
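For example (again a representative sketch rather than the original listing), constraining the parameter to [int]:

function f([int]$x)
{
    $x * $x
}

Now a call like f "a" fails immediately with an argument transformation error, before the function body ever executes.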

Input Sources and Precedence

Cmdlets are similar to functions; the primary difference is that a cmdlet is written in a compiled .NET language while a function is written in PowerShell. With the advent of PowerShell 2.0, however, you can create functions that can perform as much “heavy lifting” as cmdlets written in C#. Notably, you can create a function that can accept either direct inputs or pipeline inputs, which is the point of interest here.

There are three possible input sources for a function: no input, direct input, and pipeline input. These sources may be combined within a single function, e.g. one parameter can accept pipeline input while another takes direct input. But you can also combine them for a single parameter, allowing a single parameter to accept direct or pipeline input.

Since PowerShell has only three input sources, the question of precedence is quite straightforward. (Compare this to, for example, Windows Presentation Foundation (WPF) dependency properties, where the precedence list contains 11 possible sources!) Without further ado, the precedence list for PowerShell is just this:

  1. Direct input or pipeline input
  2. Default value

 That is, you may supply direct input or pipeline input, but not both. If neither is supplied, the default value, if any, applies.

For illustrative purposes, consider the oft-used Get-ChildItem cmdlet. Here is its syntax summary:
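An abridged form (one parameter set only; provider-specific parameters omitted):

Get-ChildItem [[-Path] <String[]>] [[-Filter] <String>] [-Include <String[]>] [-Exclude <String[]>]
    [-Recurse] [-Force] [-Name] [<CommonParameters>]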

Focusing on the Path parameter, the documentation for that parameter says that the default value is “.” indicating the current directory, and that the parameter accepts pipeline input. (All parameters accept direct input so it is not explicitly mentioned in the documentation.) Thus, a call with no input…
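Get-ChildItem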

…applies the default argument, returning a listing of the contents of the current directory. If you provide an explicit parameter, you may do so either as direct input or as pipeline input. Hence, these two statements are exactly equivalent and return the contents of the “tmp” directory:
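Get-ChildItem tmp       # direct input
"tmp" | Get-ChildItem   # pipeline input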

If you attempt to supply both direct input and pipeline input…
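"tmp" | Get-ChildItem tmp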

… PowerShell generates an error that indicates the problem in a rather circuitous fashion:
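(Approximate wording; the exact text varies by PowerShell version.)

Get-ChildItem : The input object cannot be bound to any parameters for the command either because
the command does not take pipeline input or the input and its properties do not match any of the
parameters that take pipeline input.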

What that message is really saying is that, after binding one input to the Path parameter, the command then attempted to bind a second input to the same parameter. But since any parameter can take input from only one source at a time, the error reports that it could not match the second input source to any parameter.

Input Combinations

There are a number of possible combinations of input source and parameter types to consider.

Direct Input/Positional Parameters

Parameters in the call are matched to parameters in the function signature by their order or position: the first parameter in the signature takes the first parameter from the call, the second takes the second, and so on. If the signature specifies more parameters than you supply in the call, the leftover parameters assume either default values (if you have provided them) or null.

Call:                 f $x $y $z
Signature:            f ($a, $b, $c)
Implicit Assignment:  $a = $x, $b = $y, $c = $z

Direct Input/Named Parameters

Parameters in the call are matched to parameters in the function signature by their names; the order in the call is irrelevant. If the signature specifies more parameters than you supply in the call, the leftover parameters assume either default values (if you have provided them) or null.

Call:                 f -C $z -B $y -A $x
Signature:            f ($a, $b, $c)
Implicit Assignment:  $a = $x, $b = $y, $c = $z

Default Values

Here, parameter $b has a default value assigned in the function’s signature, while parameter $c does not. A call providing only a value for parameter $a results in $b having the default value 25 and $c having the value $null.

Call:                 f $x
Signature:            f ($a, $b = 25, $c)
Implicit Assignment:  $a = $x, $b = 25, $c = $null

Pipeline Input/Explicit Parameters

Values in the pipeline are fed to any parameter that is set up to accept them. To wire this up, all you need to do is decorate a parameter in the signature with the Parameter attribute, specifying ValueFromPipeline. Assuming you have structured the body of the function appropriately (see the next section), you will receive multiple outputs (one for each pipeline input), allowing you to continue pipelining if desired. The table below illustrates mixing a pipeline input ($a) with direct inputs ($b, $c, and $d), where $b has an explicit value via a named parameter that overrides its default, $c has neither default nor value, and $d has no explicit value but has a default.

Call:                 $x1,$x2,$x3 | f -B 111
Signature:            f ([Parameter(ValueFromPipeline=$True)]$a, $b=-1, $c, $d=999)
Implicit Assignment:  $a = $x1, $b = 111, $c = $null, $d = 999
                      $a = $x2, $b = 111, $c = $null, $d = 999
                      $a = $x3, $b = 111, $c = $null, $d = 999

You may, in fact, specify more than one parameter to receive pipeline input. In such a case, each parameter sees the same value from the pipeline at the same time. Example 2 does so for $p1 and $p3 but not for $p2:

Function

function Get-PipelineToMultipleParameters(
      [Parameter(ValueFromPipeline=$True)]$p1 = "default",
      $p2 = "non-pipe",
      [Parameter(ValueFromPipeline=$True)]$p3 = "xyz"
)
{
      Process { return "`$p1=$p1 :: `$p2=$p2 :: `$p3=$p3" }
}

Call

25,49 | Get-PipelineToMultipleParameters

Result

$p1=25 :: $p2=non-pipe :: $p3=25

$p1=49 :: $p2=non-pipe :: $p3=49

 Example 2: Specifying multiple parameters to receive the pipeline input

The call sends two values as pipeline input to the Get-PipelineToMultipleParameters function resulting in one line of output for each input. $p1 and $p3 will always contain the same pipeline input. Since $p2 does not take input from the pipeline, nor does it have a direct input supplied, it assumes its default value.

Pipeline Input/Implicit

Even if your function defines no parameters at all, it can still process pipeline input. The standard PowerShell technique for iterating through an array (see about_Foreach) is to pipe the array into a foreach loop (in a pipeline, foreach and "%" are both aliases for the ForEach-Object cmdlet, which is used so often in this type of context) and access each object in the array with the special $_ variable:
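1,2,4 | % { "item: $_" }    # a minimal illustration; writes item: 1, item: 2, item: 4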

You can feed pipeline input into your function in just the same way. 

Call:                 $x1,$x2,$x3 | f
Signature:            f
Implicit Assignment:  $_ = $x1
                      $_ = $x2
                      $_ = $x3

The next section expounds upon this deceptively simple looking practice in depth.

Expanding the Notion of a Function for Pipeline Input

The discourse thus far has operated under the assumption that a function has this basic structure:

function <name> ( { <parameter> } )

{

      <statement list>

}

That structure works fine if you are only interested in direct input. To allow for pipeline input, you need to expand a bit more of the true structure of a PowerShell function:

function <name> ( { <parameter> } )

{

      begin   { <statement list> }

      process { <statement list> }

      end     { <statement list> }

}

The begin block runs once, at the start of pipeline input before any of the input is read. The process block runs once for each object in the pipeline. At the conclusion of pipeline input, the end block runs. The PowerShell documentation (about_Functions) provides two examples that are worth examining here. The first shows that the begin block indeed executes before the pipeline is opened and that the end block executes only once all pipeline data has been received. Note that the special $input variable contains the pipeline data.

Function

function Get-PipelineBeginEnd
{
    begin {"Begin: The input is $input"}
    end   {"End:   The input is $input"}
}

Call

1,2,4 | Get-PipelineBeginEnd

Result

Begin: The input is

End:   The input is 1 2 4

Example 3: Showing begin and end blocks in a PowerShell function

The next example shows how you could actually consume the pipeline data sequentially using the process block. And I mean consume non-metaphorically: observe that by the time the end block executes, the $input variable is empty!

Function

function Get-PipelineInput
{
    process {"Processing:  $_ " }
    end     {"End:   The input is: $input" }
}

Call

1,2,4 | Get-PipelineInput

Result

Processing:  1

Processing:  2

Processing:  4

End:   The input is:

Example 4: Consuming the pipeline data sequentially using the process block

The other important point to note from this example is that $input represents all the pipeline data (what it contains will vary depending on what you have consumed so far) while the $_ special variable, on the other hand, contains the current pipeline object.

Let me add a couple of further examples to reveal even more about pipeline manipulation. Example 4, above, showed how you could process an individual pipeline object with the $_ variable. But you can also process each pipeline object with the $input variable itself. This next example is effectively the same as Example 4, but now it uses $input in the end block instead of $_ in the process block. Watch out though; the $_ variable is doing something very different now! Previously, it received elements from the pipeline feeding the function, which did not appear explicitly in the code. Here, it is receiving elements from the explicitly shown, local pipeline.

Function

function Get-PipelineInputFromInput
{
  end {
    $input | % {"Processing:  $_ " }
    "`input after processing: $input"
  }
}

Call

1,2,4 | Get-PipelineInputFromInput

Result

Processing:  1

Processing:  2

Processing:  4

input after processing:

Example 5: Processing each pipeline object with the $input variable

One reason to use $input to process your function’s pipeline input in the end block rather than $_ in the process block is that you have more control over the pipeline with the former. With $_, you have access to each pipeline object just once. But $input is actually a .NET IEnumerator object. As such, you can invoke its Reset method to restore the position to the start of its data. In Example 6, a couple of extra lines are added to the end of the function showing that you can do repeated processing of the pipeline if you like.

Function

function Get-PipelineInputFromInput
{
    $input | % {"Processing:  $_ " }
    "`input after processing: $input"
    $input.Reset()
    "`input after reset: $input"
}

Call

1,2,4 | Get-PipelineInputFromInput

Result

Processing:  1

Processing:  2

Processing:  4

input after processing:

input after reset: 1 2 4

Example 6: Repeated processing of the pipeline

One other way to work with $input that I often find easier is to convert from an enumerator to an array. Add a line of code like this as your first (and only) access of the $input enumerator to copy your pipeline data into a regular array:
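$data = @($input)    # $data is an arbitrary name of your choosing; @() captures the enumerator's items in a regular array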

…and you can then manipulate it however you please (thanks to Dmitry Sotnikov for this tip).

I introduced one other subtle difference in Example 6: I removed the end block. That change is completely cosmetic; a function without explicit begin, process, and end blocks operates as if all the code were implicitly in the end block. Similarly, you can have all your code run implicitly in the process block by simply creating a filter instead of a function:

Function

filter Get-PipelineInput
{
    "Processing:  $_ "
}

Call

1,2,4 | Get-PipelineInput

Result

Processing:  1

Processing:  2

Processing:  4

Example 7: Creating a filter to run all the code in the process block

Function Template Wallchart

All of the above has laid the groundwork for the true goal of this article: to illustrate how to structure a function to accept multiple input sources. It is also just as important to understand how not to. The wallchart accompanying this article distills a key set of templates and evaluates each template against possible inputs. Here is just a thumbnail of the wallchart, which is downloadable from the bottom of the article.

[Thumbnail of the PowerShell function template wallchart]

Equivalence Classes of Input

The central fixture of the wallchart is a matrix of the most typical function structures, with a report of how each one performs against all possible inputs, characterized by the six equivalence classes listed across the top of the wallchart:

  1. No Input
  2. Null
  3. Empty String
  4. Scalar
  5. List
  6. List with Null/Empty

Because this set of inputs may come either from direct input or from pipeline input, there are actually 12 tests to be performed on each function. The results of executing each function template with each of the 12 tests comprise the main body of the matrix.

The Test Vehicle

The test program, shown in its entirety here, is quite short. The bulk of it is a list of the twelve inputs and the desired result for each input, each introduced with the Assert-Expression command.

This test program relies on two library modules. TestParamFunctions is a custom library module that enumerates each function template shown on the wallchart. Assert-Expression, from my open-source CleanCode library [1], provides the Assert-Expression function that lets you validate the execution of an expression against a desired result. For the above program to run, you must install the library modules so that PowerShell can find them, either in a user folder at ...\user\documents\windowspowershell\modules or in a system folder at ...\windows\system32\windowspowershell\V1.0\modules.

There are other validation packages available, of course, but I prefer my own because it provides a unique color-coded result that lets you instantly identify the parts that fail. In concise fashion it provides the inputs to each test, marks passing tests in green, marks failing tests in red, and individually flags which elements of the expected results mismatch the actual results. Here is an excerpt for one function template with all 12 input combinations.


Figure 1: Excerpt of test results from one function template

Notice that the test program above specifies the set of inputs to send to each test, but actually has no knowledge about the list of tests itself! The magic is in this line of code:
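A plausible form of that line (my reconstruction; the original listing is not reproduced here):

Get-Command -Module TestParamFunctions -CommandType Function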

…which simply enumerates all the tests defined in the TestParamFunctions module. Thus, if you want to add your own variations, you need only edit the library module and the test program will automatically incorporate them in the test suite.

Template Organization

The 10 templates shown in the wallchart are arranged into three groups. The goal is to make each template more sophisticated until a template can handle all the input possibilities.

Group A introduces the most basic function structure for the three available techniques to access an input: the direct parameter, the $_ variable for pipeline input, and the $input variable for pipeline input. This naïve approach delivers only about half of the correct answers: the direct parameter technique allows all the tests with direct input to pass, while the two pipeline techniques allow all the tests with pipeline input to pass.

Group B adds a default value to each parameter in the hope of improving the “No input” case. This goal is met moving from template A_1 to template B_1 but, alas, does nothing to improve the results for B_2 or B_3.

The templates in Group C bring sufficient sophistication to handle both direct and pipeline inputs. Group C is subdivided into two templates that use the pipeline input from $_ (templates C_1 and C_2) and two templates that use $input (templates C_3 and C_4).

The only difference between C_1 and C_2 is the Parameter attribute attached to the $item parameter in C_2. The template yields much better results when the attribute is present, as is appropriate for a pipeline-receptive input, yielding 11 out of 12 correct for C_2 compared to 8 out of 12 correct for C_1.

Templates C_3 and C_4 also differ in the Parameter attribute, but C_3 needs one additional statement to check the existence of the $item parameter; without that statement, and with StrictMode enabled in the main test program, PowerShell emits an error on one of the 12 tests. (It seems rather peculiar that a parameter in the function’s signature could be undefined in its own body, but such is the case here!)

Which Template Delivers the Best Results?

The wallchart indicates that only C_3 and C_4 are robust enough to provide the correct result for all 12 input combinations. But what does “correct” mean in this context?

When I use a word it means just what I choose it to mean – neither more nor less.
–Humpty Dumpty, Chapter 6, Through the Looking Glass

Though computing is the ultimate “black-and-white” realm, dealing with 1s and 0s, shades of grey often pop up when you least expect them. Most of the test cases actually are black and white. The interesting case is when you supply no input. For direct input, this is when you invoke the function without an argument, in which case the parameter assumes the default value. But when you consider pipeline inputs, what constitutes no input? The answer I propose is: an empty list. So I contend an empty list piped in should yield a default value for the parameter. But you could also argue that, from a pipeline, you would never get a default value; instead the “worst” you could get would be $null. That philosophical difference is the difference between templates C_2 and C_4. Two other considerations bear on which is the “right” answer:

(1) Elegance of symmetry, in which the results are exactly symmetric between direct input and pipeline input: in the results of templates C_3 and C_4, both rows of results are identical column-for-column.

(2) Elegance of code: template C_4 is simpler than template C_3 while achieving the same result, even though it technically violates the requirement of indicating (with the Parameter attribute) that a parameter accepts pipeline input. But template C_2 is clearly more elegant than template C_4, using, in some sense, more of the built-in functionality to achieve pipeline inputs.

Point (1) favors template C_3 or C_4; point (2) favors template C_2. Pick whichever fits best with your own style.

Conclusion

There is always more to the story, but the above provides you with the vast majority of the information you need to be productive with PowerShell functions. Applying one of my favorite rules of writing, it is time for me to draw to a close:

Begin at the beginning and go on till you come to the end: then stop.
–The King, Chapter 12, Alice’s Adventures in Wonderland

If you have been reluctant to try your hand at PowerShell functions due to their seemingly idiosyncratic nature, this reference and the accompanying wallchart should give you the tools you need to go at it with confidence!

Footnotes

[1] My open source CleanCode library contains building blocks for C#, Perl, Java, SQL, and JavaScript, and will soon have a new section devoted to PowerShell, which will include the Assert-Expression module discussed in this article. Look for it in the fourth quarter of 2011.

Epilogue


‘Would you tell me, please, which way I ought to go from here?’
‘That depends a good deal on where you want to get to,’ said the Cat.
‘I don’t much care where – ‘ said Alice.
‘Then it doesn’t matter which way you go,’ said the Cat.
‘- so long as I get somewhere,’ Alice added as an explanation.
‘Oh, you’re sure to do that,’ said the Cat, ‘if you only walk long enough.’
–Alice and the Cheshire Cat, Chapter 6, Alice’s Adventures in Wonderland (Lewis Carroll)