{"id":2095,"date":"2015-10-22T00:00:00","date_gmt":"2015-10-06T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/ins-and-outs-of-the-powershell-pipeline\/"},"modified":"2016-07-28T13:47:03","modified_gmt":"2016-07-28T13:47:03","slug":"ins-and-outs-of-the-powershell-pipeline","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/sysadmin\/powershell\/ins-and-outs-of-the-powershell-pipeline\/","title":{"rendered":"Ins and Outs of the PowerShell Pipeline"},"content":{"rendered":"<p>Pipelining is an important technique when the operation you are performing, such as reading files of indeterminate length or processing collections of large objects, requires you to conserve memory by breaking a large task into its atomic components. If you get it wrong, you don&#8217;t get that benefit. While PowerShell provides an ample supply of constructs for pipelining, it is all too easy to write code that simply does not pipeline at all. <\/p>\n<p>So why is pipelining important? <\/p>\n<ul>\n<li>As mentioned, pipelining helps conserve memory. Say you want to modify text in a huge file. Without pipelining you might read the whole file into memory, modify the appropriate lines, and write the file back out to disk. If the file is large enough, you might not even have enough memory to read the whole thing. <\/li>\n<li>Pipelining can substantially improve <em>actual<\/em> performance. The commands in a PowerShell pipeline take turns on each item rather than running concurrently: as soon as one command emits an object, the next command starts working on it. No command has to build and store the complete collection before the next one can begin, and a downstream command that needs only the first few items can stop the pipeline early. <\/li>\n<li>Pipelining can have a significant effect on your end-user experience, enhancing the <em>perceived<\/em> performance dramatically. 
If your end-user executes a sequence of commands that takes 60 seconds, then without pipelining the user sees nothing until those 60 seconds are up, while with pipelining output might start appearing in just a couple of seconds. <\/li>\n<\/ul>\n<p>To illustrate pipelining as simply as possible, I&#8217;ll introduce a simple pipeline that takes a sequence of inputs through three successive functions, mapping a sequence of inputs to a sequence of outputs. <\/p>\n<p class=\"illustration\">  <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/2290-img14.jpg\" height=\"92\" width=\"431\" alt=\"2290-img14.jpg\" \/><\/p>\n<p class=\"caption\">Figure 1:<\/p>\n<p>Each pipeline function, called a filter, performs a transformation on the inputs fed to it and passes on the result. The operations shown are deliberately simple and nonsensical; they serve purely to illustrate the process. It is the pipeline itself that deserves attention here, rather than the particular transformations. Nonetheless, to start us off, here are the simple implementations of each operation so you can follow along. <\/p>\n<pre class=\"lang:ps theme:powershell-ise\"># double the input\r\nFunction f1($x) { $x * 2 }\r\n\r\n# concatenate the nth letter to the input where n is half the input value\r\nFunction f2($x) { \"$x\" + [char]([byte][char] \"A\" + $x\/2 - 1) }\r\n\r\n# reverse the 2-character string\r\nFunction f3($x) { $x[1..0] -join '' }\r\n<\/pre>\n<p>As written, these are standard PowerShell functions that are <em>not<\/em>, by themselves, suitable for pipelining. If you try <code>1, 2, 3 | f1 | f2 | f3<\/code> you will get an incorrect result. <\/p>\n<h2>Right and Wrong <\/h2>\n<p>Instead of writing this &#8230; 
<\/p>\n<pre class=\"lang:ps theme:powershell-ise\">1, 2, 3 | f1 | f2 | f3<\/pre>\n<p>&#8230; you will need to execute this: <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">1, 2, 3 | %{ f1 $_ } | %{ f2 $_ } | %{ f3 $_ } <\/pre>\n<p>Each piece of that line means the following: <\/p>\n<ul>\n<li><code>%<\/code> &#8211; for each object that arrives as input (an alias for <code>ForEach-Object<\/code>) <\/li>\n<li><code>{<\/code> &#8211; start of the script block that is run for each object (the process block) <\/li>\n<li><code>f1<\/code> &#8211; execute the function <code>f1<\/code> &#8230; <\/li>\n<li><code>$_<\/code> &#8211; &#8230; on the object passed down the pipeline <\/li>\n<li><code>}<\/code> &#8211; end of the process block <\/li>\n<\/ul>\n<p>&#8230; or you could, as we will soon do in this article, create filters to do it all more neatly: <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">1, 2, 3 | f1Runner | f2Runner | f3Runner <\/pre>\n<p>&#8230; where the given filters have yet to be defined. <\/p>\n<p>In each of the examples that follow, you will see implementations of the functions that act as the filters mentioned above (<code>f1Runner<\/code>, <code>f2Runner<\/code>, and <code>f3Runner<\/code>). Within each scenario, all of the functions are essentially identical except for the statement that produces a calculation. After the calculation, the value is output <em>twice<\/em>: once with <code>Write-Host<\/code>, just to see what is going on, and once with <code>Write-Output<\/code>, to allow it to feed the next function in the pipeline. To be clear, if you do not care about viewing intermediate results you can delete the <code>Write-Host<\/code> statements and the pipeline will work just the same. Also, note that <code>f3Runner<\/code> does not use the <code>Write-Host<\/code> call: it is at the end of the pipe, so you will get the results on the console from just the <code>Write-Output<\/code> call itself. 
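<\/p>\n<p>As a minimal sketch of that distinction (the variable name <code>$captured<\/code> and the multiplication are hypothetical, chosen only for illustration), capturing a pipeline&#8217;s output in a variable shows that only the <code>Write-Output<\/code> values travel down the pipeline, while the <code>Write-Host<\/code> text goes to the console: <\/p>\n<pre class=\"lang:ps theme:powershell-ise\"># Illustration only; $captured is a hypothetical name, not part of the article's examples\r\n$captured = 1, 2 | %{\r\n    Write-Host \"seen: $_\"      # displayed on the console, not passed downstream\r\n    Write-Output ($_ * 10)      # placed on the pipeline for whatever comes next\r\n}\r\n\r\n# Only the Write-Output values were captured\r\n$captured.Count                 # 2\r\n<\/pre>\n<p>The two <code>seen:<\/code> lines appear on the console as the pipeline runs, but <code>$captured<\/code> holds only the two multiplied values. 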
<\/p>\n<p>In order to make it simpler to see what is going on, we will add another filter, <code>showInputs<\/code>, that merely displays the values at the start of the pipeline. <\/p>\n<h2>Scenario #1 &#8211; Generate all pipelineable output before emitting any <\/h2>\n<p>The special variable <code>$input<\/code> is available within a function as a provider of pipeline data. So we just loop through that pipeline data, calculating a result, and then writing it out. This is a very common coding pattern seen in many questions on StackOverflow. It is perfectly reasonable for some languages, but generally it should be avoided in PowerShell. Even if data is coming in nicely through the pipeline to the first function, the pipeline dries up completely, because each function is processing the entire pipeline input until the pipeline is empty, and only then sending its results onward en masse to the next pipeline participant. <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">function showInputs\r\n{\r\n    $result = @()\r\n    foreach ($value in $input) { $result += $value }\r\n    Write-Host $result\r\n    Write-Output  $result\r\n}\r\nfunction f1Runner\r\n{\r\n    $result = @()\r\n    foreach ($value in $input) { $result += f1 $value }\r\n    Write-Host $result\r\n    Write-Output  $result\r\n}\r\nfunction f2Runner\r\n{\r\n    $result = @()\r\n    foreach ($value in $input) { $result += f2 $value }\r\n    Write-Host $result\r\n    Write-Output  $result\r\n}\r\nfunction f3Runner\r\n{\r\n    $result = @()\r\n    foreach ($value in $input) { $result += f3 $value }\r\n    #Write-Host $result\r\n    Write-Output  $result\r\n} <\/pre>\n<p>Here is what happens when you execute the pipeline. <code>showInputs<\/code> displays all of its inputs (1 2 3). (It is an artifact of <code>Write-Host<\/code> that it combines all of the values into a single, space-separated string.) Then <code>f1Runner<\/code> displays all of its calculations (2 4 6). 
Then <code>f2Runner<\/code> does all of its work, yielding (2A 4B 6C). Finally, <code>f3Runner<\/code> lets its results stream out the end of the pipeline, so you see that there are still 3 values being processed (because they are emitted one per line). <\/p>\n<pre class=\"lang:ps theme:powershell-output\">\t\r\n\tPS&gt; 1,2,3 | showInputs | f1Runner | f2Runner | f3Runner\r\n\t1 2 3\r\n\t2 4 6\r\n\t2A 4B 6C\r\n\tA2\r\n\tB4\r\n\tC6 \r\n<\/pre>\n<p>Note that there is often more than one way to do the same thing in PowerShell. For example, the loop in <code>f3Runner<\/code> could be replaced with this line (and the others similarly) and it would produce exactly the same result (namely, no pipelining!). Here the special <code>$_<\/code> variable indicates the current item in the loop. <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">$input | ForEach-Object { $result += f3 $_ }\r\n<\/pre>\n<h2>Scenario #2 &#8211; Collect all pipeline input before processing any <\/h2>\n<p>The way to fix the previous example is to emit each value as soon as it is calculated. The point to catch from the previous example is that there&#8217;s no need for you to explicitly aggregate the results; PowerShell will implicitly do that with the pipeline itself. 
This set of functions seemingly does just that, emitting one value at a time: <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">Function showInputs\r\n{\r\n    $input | ForEach-Object {\r\n        $result = $_\r\n        Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}\r\nFunction f1Runner\r\n{\r\n    $input | ForEach-Object {\r\n        $result = f1 $_\r\n        Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}\r\nFunction f2Runner\r\n{\r\n    $input | ForEach-Object {\r\n        $result = f2 $_\r\n        Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}\r\nFunction f3Runner\r\n{\r\n    $input | ForEach-Object {\r\n        $result = f3 $_\r\n        # Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}<\/pre>\n<p>When you execute the test line, however, it reveals essentially the same result as before (though there are now newlines between the intermediate results since they are being emitted individually): the output is not being pipelined!  <\/p>\n<pre class=\"lang:ps theme:powershell-output\">\tPS&gt; 1,2,3 | showInputs | f1Runner | f2Runner | f3Runner\r\n\t1\r\n\t2\r\n\t3\r\n\t2\r\n\t4\r\n\t6\r\n\t2A\r\n\t4B\r\n\t6C\r\n\tA2\r\n\tB4\r\n\tC6 \r\n<\/pre>\n<p>While these functions do  <em>not <\/em> suffer the deficiency of the previous example (generating all the output before emitting anything), the converse is actually the problem here: these functions are waiting for all the input before calculating anything!  <\/p>\n<p>As you learn more about advanced PowerShell functions, you&#8217;ll discover that you actually have to put code in one of three blocks: the begin, process, or end blocks. The begin block runs  <em>before <\/em> any pipeline input is accepted; the process block runs once for each pipeline input; and the end block runs  <em>after <\/em> all pipeline input has been processed. If you do not explicitly specify a block, all your code is implicitly in the end block. 
And that explains why this scenario did no pipelining: each function waits until the pipeline empties and the function has collected all of its inputs, and only <em>then<\/em> runs the end block code, emitting all its output in one chunk to the next pipeline participant. Thus, again we have no pipelining benefit. <\/p>\n<h2>Scenario #3 &#8211; Process each input when received and emit its output promptly <\/h2>\n<p>Scenario #2 moved a bit closer to real pipelining, and you might surmise from the discussion above that the final piece of the problem can be resolved by moving from the end block to the process block. And that would be quite correct; the set of functions below shows how. Notice there is no loop here because the process block runs once for each input; in other words, there is a loop but it is handled by PowerShell itself. Within a process block, use the special <code>$_<\/code> variable to access the current pipeline item. <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">Function showInputs\r\n{\r\n    process\r\n    {\r\n        $result = $_\r\n        Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}\r\nFunction f1Runner\r\n{\r\n    process\r\n    {\r\n        $result = f1 $_\r\n        Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}\r\nFunction f2Runner\r\n{\r\n    process\r\n    {\r\n        $result = f2 $_\r\n        Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}\r\nFunction f3Runner\r\n{\r\n    process\r\n    {\r\n        $result = f3 $_\r\n        # Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}<\/pre>\n<p>And indeed, with this approach, the output is properly pipelined, corresponding to Figure 3 in the Summary below: <\/p>\n<pre class=\"lang:ps theme:powershell-output\">\tPS&gt; 1,2,3 | showInputs | f1Runner | f2Runner | f3Runner\r\n\t1\r\n\t2\r\n\t2A\r\n\tA2\r\n\t2\r\n\t4\r\n\t4B\r\n\tB4\r\n\t3\r\n\t6\r\n\t6C\r\n\tC6 \r\n<\/pre>\n<p>Because the process block is so central to pipelining, 
there is a special syntax to reduce the amount of code you have to write. As noted above, using the <code>Function<\/code> keyword with no explicit begin, process, or end block executes the code in the context of the end block. Similarly, if you use the <code>Filter<\/code> keyword with no explicit block, then the code executes in the process block. Thus, this more concise code produces exactly the same result: <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">Filter showInputs {\r\n    $result = $_\r\n    Write-Host $result\r\n    Write-Output  $result\r\n}\r\nFilter f1Runner {\r\n    $result = f1 $_\r\n    Write-Host $result\r\n    Write-Output  $result\r\n}\r\nFilter f2Runner {\r\n    $result = f2 $_\r\n    Write-Host $result\r\n    Write-Output  $result\r\n}\r\nFilter f3Runner {\r\n    $result = f3 $_\r\n    # Write-Host $result\r\n    Write-Output  $result\r\n}\r\n<\/pre>\n<h2>Summary <\/h2>\n<p>As you have seen, it is very easy to think you are pipelining when, in fact, you are not. You could, instead, be waiting for all input to be received, or you might be processing all received inputs before sending any output. If, however, you are aware of the underlying concepts of the pipeline, it is straightforward to get true pipelining behavior in PowerShell. 
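<\/p>\n<p>One way to check which behavior a given function exhibits is to trace when each stage sees each item. The following is a minimal sketch under stated assumptions: the names <code>viaEnd<\/code> and <code>viaProcess<\/code> are hypothetical stand-ins for your own functions, and the <code>Write-Host<\/code> trace messages are there only to expose the ordering. <\/p>\n<pre class=\"lang:ps theme:powershell-ise\"># Hypothetical stand-ins for the function under test (names are not from the article)\r\nFunction viaEnd     { $input | ForEach-Object { Write-Host \"viaEnd got $_\"; $_ } }\r\nFunction viaProcess { process { Write-Host \"viaProcess got $_\"; $_ } }\r\n\r\n# The upstream script block announces each item as it is produced;\r\n# Out-Null discards the final output so only the trace messages are shown\r\n1..3 | ForEach-Object { Write-Host \"produced $_\"; $_ } | viaEnd     | Out-Null\r\n1..3 | ForEach-Object { Write-Host \"produced $_\"; $_ } | viaProcess | Out-Null\r\n<\/pre>\n<p>If all of the <code>produced<\/code> messages appear before any of the function&#8217;s own messages, the function is collecting its input first; if the two kinds of message alternate, it is pipelining. 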
<\/p>\n<div class=\"float-right\">\n<p class=\"illustration\"> <img loading=\"lazy\" decoding=\"async\" height=\"155\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/2290-img2B.jpg\" width=\"253\" alt=\"2290-img2B.jpg\" \/><\/p>\n<p class=\"caption\">Figure 2:<\/p>\n<\/div>\n<p>Compare conceptually how the data moves <strong><em>without<\/em><\/strong> pipelining (scenarios 1 and 2) &#8230; <\/p>\n<div class=\"float-right\">\n<p class=\"illustration\"> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/2290-img3B.jpg\" height=\"155\" width=\"253\" alt=\"2290-img3B.jpg\" \/><\/p>\n<p class=\"caption\">Figure 3:<\/p>\n<\/div>\n<p>&#8230; and <strong><em>with<\/em><\/strong> proper pipelining. <\/p>\n<p>In many situations, and particularly for time-consuming operations, you want to use proper pipelining so your end-user starts getting results as soon as possible. On the other hand, some operations actually require you to collect all of the input and operate on it as a whole. Sorting is a classic example of something that can&#8217;t be done on an object-by-object basis. There is, of course, a cmdlet that does it for you (<code>Sort-Object<\/code>), but if there weren&#8217;t, you could collect the input in the begin and process blocks and sort it in the end block: <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">5, 6, 7, 2, 3, 0, 9 |% -begin { $v = @() } { $v += $_ } -end { [array]::Sort($v); $v } <\/pre>\n<p>I would be remiss, however, if I did not mention that the filter solution is not optimal in all situations. If you <em>only<\/em> need data to come through the pipeline, it is sufficient. But PowerShell provides the capability to accept input either from the pipeline or as direct arguments. 
That is, you often want to be able to execute either of these and get the same output: <\/p>\n<pre class=\"lang:ps theme:powershell-output\">\tPS&gt; 1,2,3 | showInputs\r\n\tPS&gt; showInputs 1,2,3\r\n<\/pre>\n<p>Alas, the filters above will <em>not<\/em> accept direct arguments. Here is a template for <code>showInputs<\/code> that does (the other three filters can be adapted just as before, by changing the name and the calculation line). It is a bit more complicated, but that&#8217;s the trade-off for being more versatile. <\/p>\n<pre class=\"lang:ps theme:powershell-ise\">Filter showInputs(\r\n[Parameter(ValueFromPipeline = $True)]\r\n[array]$item)\r\n{\r\n    $item | ForEach-Object {\r\n        $result = $_\r\n        Write-Host $result\r\n        Write-Output  $result\r\n    }\r\n}<\/pre>\n<p>Again, this is not the only syntax that would work. Take a look at <a href=\"https:\/\/www.simple-talk.com\/dotnet\/.net-tools\/down-the-rabbit-hole--a-study-in-powershell-pipelines,-functions,-and-parameters\/\">Down the Rabbit Hole: A Study in PowerShell Pipelines, Functions, and Parameters<\/a> for other possible variations of pipelineable function templates, along with their pros and cons. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>For many developers, understanding pipelining in PowerShell is like understanding reductive physicalism: you think you&#8217;ve just about got it, and then the brain blue-screens. 
Michael Sorens is inspired by his several efforts to explain pipelining on StackOverflow to attempt the definitive simple explanation for the rest of us.&hellip;<\/p>\n","protected":false},"author":221868,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[35],"tags":[4635,4871],"coauthors":[6802],"class_list":["post-2095","post","type-post","status-publish","format-standard","hentry","category-powershell","tag-powershell","tag-sysadmin"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/2095","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221868"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=2095"}],"version-history":[{"count":10,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/2095\/revisions"}],"predecessor-version":[{"id":66424,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/2095\/revisions\/66424"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=2095"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=2095"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=2095"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=2095"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}