The PoSh DBA – Using the Basic PowerShell Paradigms

Comments 0

Share to social media

Last week I went to a Burger Shop with some friends and one of them asked for a Belly-Buster-Burger Bacon without Cheese and Bacon. At this point, to what extent he was eating a Belly-Buster-Burger Bacon?

When I´ve asked my good friend and Editor Andrew Clarke what he thought about me writing an article explaining why we should “think the way the language goes ” to try get the best of the programming language you are using, his answer was “Do it… and Laerte, please, Keep it Simple….Stupid!”.

In the days of having to use VBScript, DMO and SQL Server 2000, I attempted to write automation solutions and was always trying to understand what I had to do the job. The experience left me with some deep convictions about the best way to approach automation tasks. I’ll now share some of these with you.

Ignorance is Bliss

I remember when I started programming using Clipper language – a compiler for dBASE III (a popular database at the time), in a long-forgotten age – and I had a boss (in the company that was just he and I) that told me.

“Make it work, doesn’t matter how.”

He gave me that advice, well – shouted it at me, at a time when I was struggling for the best way to perform an operation at the very moment that my boss realized that we were starting to run out of time to deliver the solution to the client.

At that time, I took the advice to heart: But now that I’m older but wiser I wonder to what extent he was correct.

In the 1980s, I was a youngster in a remote town in Brazil and had only a few books about how to use the Clipper language and these were only in English. My native language is Portuguese. Even then, we could only get technical books: there was nothing available that described the paradigmof programming. We had to improvise with ‘sausage-string paradigm’. Why sausage-string? Simple. Everything serially in one string. Let us take this simple example to explain what I mean:

I have a Table (table 1) with some information. For each line in this table, I need to check other table (table 2). If this line it exists in the other table, it needs to update a column otherwise it needs to insert it.

The “sausage-string paradigm” Clipper way:

  • Loop in the table 1
  • For each line, seek in the table 2
  • If it exists update the information, otherwise insert
  • End loop

The Clipper way translated to T-SQL

  • Cursor in the Table 1
  • For each line, select in the table 2
  • If exists , Update, Otherwise Insert
  • End Cursor
  • Deallocate Cursor

Does the “sausage-string paradigm” Clipper way work in T-SQL? Yes! Is it doing the job? Yes, but slowly! Is it the way that T-SQL was written and intended to work? No! Try this algorithm in an environment with hundred lines and then in with thousands. You will understand my point

T-SQL Way in T-SQL

  • Insert What does not exists
  • Update what exists

You should “think the way the language goes”

Thinking as PowerShell

The same way, I can easily write code in PowerShell using the Clipper way. Is it wrong? No, but is it a Belly-Buster-Burger? Not at all

PowerShell is an awesome scripting language and deserves code in scripts that uses its features as intended: The way that the language works.

Let’s get to business with a realistic example, not in order to explain the cmdlet itself but to show how to use the features of the language. I will not go deeper in the cmdlet or in the features. For that, you can follow the links in the end or the article or use our best friend, Professor Google.

The first example: Where are those databases?

I was asked to create a script to read about 800 SQL Server instances to find a particular database by name. The script had to create an Excel file with the names of instances that had the database of that name. By the same token, the script had to create an Excel file with only the Instance Name where the database didn’t exist.

The ‘Sausage paradigm’ Way

Thinking the way that the language works

In the PowerShell way, I don’t need to create or fill the second array $DoNotHaveDatabase with the instance Names that don’t have the Database as in the ‘sausage-string’ way. This array is the difference of the all the Instance Names Array ($InstanceNames) with the Array of the Instance Names that do have the Database Name ($InstanceHave). I just need to use the built-in Compare-Object cmdlet. Also, I don’t need to test whether the database exists and then iterate through the databases to check whether the database exists before incrementing the variable. I just need to increment it with the SMO parent name from Database enumeration to get the instance name, pointing the Database Enumerator to the database Name that I want to check for. PowerShell is so beautiful that if the Database you specify does not exists the following expression …

… merely fails to add the element to the array (containing the instance name) if the database doesn’t exist.

You may also be asking why in the sausage-string’ way I am piping the results of the get- content cmdlet directly to the pipeline, and when doing it the ‘thinking as PowerShell’ way, I am storing the array in a variable. It is because I need the Instance names in an object to compare the objects. So if I code the same way as in the first approach, I would need to load it twice; first for the foreach -object loop and second for compare-object

Another Example: Counting the number of VLF files

Let’s look at the maintenance task of counting the VLFs in the transaction logs of the databases to check for excessive fragmentation.

The ‘sausage-string’ Approach

Thinking the PowerShell way

I do not need an extra overhead of a foreach -Object in the databases to increment the counter to VLF. Every single object returned by PowerShell, if it has more than one item, will become an array. This is because everything in PowerShell is an object, and since 2.0 version all objects have count property. I just need to count the return of invoke- sqlcmd cmdlet.

Another Example: Aggregating data

I had an CSV file to transform some data. This CSV had several columns, but what matter for us are : (the data is changed of course)

So my CSV was something like :

The challenge was, for each PRT CODE I had to know :

  • Highest ACT COST
  • Sum of ACT COST
  • Select the top 5 that have Highest ACT COST

The Sausage-String way.

Can you think how you might do it in Clipper or sausage-string’ way ? Probably something like that (I did not have to think too much about this)

  • For each items in CSV
  • Algorithm to group the PRT CODE column
  • Algorithm to after group the PRT CODE, seek the Highest ACT COST column
  • Algorithm to after group the PRT CODE, calculate the sum of the ACT COST column
  • Algorithm to sort everything by ACT Cost Descending
  • Select the First 5

Man, it would be a veeeery long script. Even on more recent languages such as C#, it would not be so simple as in PowerShell, if you know how to use it .

The PowerShell way :

  • Import The CSV
  • Group the PRT CODE
  • Highest ACT COSTSelect the PRT CODE grouped, plus a custom table piping the group output from group-object to a sort-object descending the ACT COST value and piping to a select first 1 .
  • Sum of ACT COSTSelect the PRT CODE grouped, plus a custom table piping the group output from group-object to a measure-object by SUM .
  • Pipe everything to a sort-object to the Highest ACT COST
  • Select Top 5

Yes, everything in one simple command line. At this point I want to thank my good friend Shay Ley for helping me to come out with this solution. The code is :

You have asked for a Belly-Buster-Burger Bacon and you do eat a proper Belly-Buster-Burger Bacon

Finally: Determining if the Log and Data File are on the Same Drive

if you read my last articles Automating Your SQL Server Best Practice Reports:The Checks and Automating Your SQL Server Best Practice Reports: The Document, one of my checks was whether the data and log files for the databases are in the same drive. I just need to check whether there is at least one log and data file at the same drive. Because, in PowerShell, all data is stored as an object, this is made rather simpler than otherwise.

The trick is, after using a TSQL query to return the datafiles …

.. you just need to group the objects returned by the first part of the filename string, before the ‘:’ colon. This is the drive. If the count is > 0, there are both log and data on the same drive location; it doesn’t matter what they are. The Variable $ ItHave will store a true or false

If you want to check out what the files are …

Simple and clean. Classic PowerShell

Conclusions

Those were just simple examples of how you can use the features of the language to help you to improve your code and learning. Of course there is a lot of more to do it, especially in PowerShell. I suggest you download the scripts in the PowerShell community channels and check how those Jedi do them. You will learn a LOT with these guys.

Ha! By programming PowerShell the way it was intended, I’m getting the full value from it, with none of the goodness taken out. I am eating the genuine Belly-Buster-Burger Bacon with a lot of cheese and Bacon, iced Coke and French fries.

References :

Load comments

About the author

Laerte Junior

See Profile

Laerte Junior is a Cloud and Datacenter Management MVP, focused on PowerShell and automation and, through through his technology blog and simple-talk articles, an an active member of the SQL Server and PowerShell community around the World. He is a skilled Principal Database Architect, Developer, and Administrator, specializing in SQL Server and PowerShell Programming and Automation. He also is PASS Virtual Chapter Mentor , Co-Lead of the PASS PowerShell Virtual Chapter and former PASS Regional Mentor for Brasil. In addition He is also a HUGE Star Wars Fan, his main skill. "May The Force be with us" You should follow him on Twitter as @LaerteSQLDBA

Laerte Junior's contributions