Last week I went to a Burger Shop with some friends and one of them asked for a Belly-Buster-Burger Bacon without Cheese and Bacon. At this point, to what extent he was eating a Belly-Buster-Burger Bacon?
When I´ve asked my good friend and Editor Andrew Clarke what he thought about me writing an article explaining why we should “think the way the language goes ” to try get the best of the programming language you are using, his answer was “Do it… and Laerte, please, Keep it Simple….Stupid!”.
In the days of having to use VBScript, DMO and SQL Server 2000, I attempted to write automation solutions and was always trying to understand what I had to do the job. The experience left me with some deep convictions about the best way to approach automation tasks. I’ll now share some of these with you.
Ignorance is Bliss
I remember when I started programming using Clipper language – a compiler for dBASE III (a popular database at the time), in a long-forgotten age – and I had a boss (in the company that was just he and I) that told me.
“Make it work, doesn’t matter how.”
He gave me that advice, well – shouted it at me, at a time when I was struggling for the best way to perform an operation at the very moment that my boss realized that we were starting to run out of time to deliver the solution to the client.
At that time, I took the advice to heart: But now that I’m older but wiser I wonder to what extent he was correct.
In the 1980s, I was a youngster in a remote town in Brazil and had only a few books about how to use the Clipper language and these were only in English. My native language is Portuguese. Even then, we could only get technical books: there was nothing available that described the paradigmof programming. We had to improvise with ‘sausage-string paradigm’. Why sausage-string? Simple. Everything serially in one string. Let us take this simple example to explain what I mean:
I have a Table (table 1) with some information. For each line in this table, I need to check other table (table 2). If this line it exists in the other table, it needs to update a column otherwise it needs to insert it.
The “sausage-string paradigm” Clipper way:
- Loop in the table 1
- For each line, seek in the table 2
- If it exists update the information, otherwise insert
- End loop
The Clipper way translated to T-SQL
- Cursor in the Table 1
- For each line, select in the table 2
- If exists , Update, Otherwise Insert
- End Cursor
- Deallocate Cursor
Does the “sausage-string paradigm” Clipper way work in T-SQL? Yes! Is it doing the job? Yes, but slowly! Is it the way that T-SQL was written and intended to work? No! Try this algorithm in an environment with hundred lines and then in with thousands. You will understand my point
T-SQL Way in T-SQL
- Insert What does not exists
- Update what exists
You should “think the way the language goes”
Thinking as PowerShell
The same way, I can easily write code in PowerShell using the Clipper way. Is it wrong? No, but is it a Belly-Buster-Burger? Not at all
PowerShell is an awesome scripting language and deserves code in scripts that uses its features as intended: The way that the language works.
Let’s get to business with a realistic example, not in order to explain the cmdlet itself but to show how to use the features of the language. I will not go deeper in the cmdlet or in the features. For that, you can follow the links in the end or the article or use our best friend, Professor Google.
The first example: Where are those databases?
I was asked to create a script to read about 800 SQL Server instances to find a particular database by name. The script had to create an Excel file with the names of instances that had the database of that name. By the same token, the script had to create an Excel file with only the Instance Name where the database didn’t exist.
The ‘Sausage paradigm’ Way
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
Get-Content c:\laerte\ST\InstanceNames.txt | ForEach-Object { $SQLServer = new-object ("Microsoft.SqlServer.Management.Smo.Server") $_ $SQLServer.Databases | ForEach-Object { if ($_.name -eq 'test') { $HaveDatabase += $SQLServer.Name } else { $DoNotHaveDatabase += $SQLServer.Name } } } $HaveDatabase | Export-Csv InstanceNameHaveDatabase $DoNotHaveDatabase | Export-Csv InstanceNameDoNotHaveDatabase |
Thinking the way that the language works
1 2 3 4 5 6 7 8 9 10 11 12 13 |
$InstanceHave = @() $InstanceNames = Get-Content c:\laerte\ST\3article\InstanceNames.txt $InstanceNames | ForEach-Object { $SQLServer = new-object ("Microsoft.SqlServer.Management.Smo.Server") $_ $HaveDatabase += $SQLServer.Databases['test'].parent.Name } $HaveDatabase | Export-Csv InstanceNameHaveDatabase #InstanceNamesDoNotHave Compare-Object -ReferenceObject $InstanceNames -DifferenceObject $HaveDatabase | Export-Csv InstanceNameDoNotHaveDatabase |
In the PowerShell way, I don’t need to create or fill the second array $DoNotHaveDatabase with the instance Names that don’t have the Database as in the ‘sausage-string’ way. This array is the difference of the all the Instance Names Array ($InstanceNames) with the Array of the Instance Names that do have the Database Name ($InstanceHave). I just need to use the built-in Compare-Object cmdlet. Also, I don’t need to test whether the database exists and then iterate through the databases to check whether the database exists before incrementing the variable. I just need to increment it with the SMO parent name from Database enumeration to get the instance name, pointing the Database Enumerator to the database Name that I want to check for. PowerShell is so beautiful that if the Database you specify does not exists the following expression …
1 |
$HaveDatabase += $SQLServer.Databases['test'].parent.Name |
… merely fails to add the element to the array (containing the instance name) if the database doesn’t exist.
You may also be asking why in the ‘sausage-string’ way I am piping the results of the get- content cmdlet directly to the pipeline, and when doing it the ‘thinking as PowerShell’ way, I am storing the array in a variable. It is because I need the Instance names in an object to compare the objects. So if I code the same way as in the first approach, I would need to load it twice; first for the foreach -object loop and second for compare-object
1 2 3 |
#InstanceNamesDoNotHave Compare-Object -ReferenceObject $InstanceNames -DifferenceObject $HaveDatabase | Export-Csv InstanceNameDoNotHaveDatabase |
Another Example: Counting the number of VLF files
Let’s look at the maintenance task of counting the VLFs in the transaction logs of the databases to check for excessive fragmentation.
The ‘sausage-string’ Approach
1 2 3 4 5 6 7 8 9 10 |
$SQLServer = new-object ("Microsoft.SqlServer.Management.Smo.Server") 'DeathStar' $SQLServer.Databases | ForEach-Object { $vlfcount = 0 Invoke-Sqlcmd -ServerInstance 'DeathStar' -Database $_.name -Query 'DBCC LOGInfo' | ForEach-Object { $vlfcount++ } [pscustomobject]@{ DatabaseName = $_.name; VLFCount = $VLFCount } } |
Thinking the PowerShell way
1 2 3 4 5 6 |
$SQLServer = new-object ("Microsoft.SqlServer.Management.Smo.Server") 'DeathStar' $SQLServer.Databases | ForEach-Object { $VLFCount = (Invoke-Sqlcmd -ServerInstance 'DeathStar' -Database $_.name -Query 'DBCC LOGInfo').count [pscustomobject]@{ DatabaseName = $_.Name; VLFCount = $VLFCount } } |
I do not need an extra overhead of a foreach -Object in the databases to increment the counter to VLF. Every single object returned by PowerShell, if it has more than one item, will become an array. This is because everything in PowerShell is an object, and since 2.0 version all objects have count property. I just need to count the return of invoke- sqlcmd cmdlet.
Another Example: Aggregating data
I had an CSV file to transform some data. This CSV had several columns, but what matter for us are : (the data is changed of course)
1 2 |
PRT Code - The code of the practice itself ACT COST - The actual cost of an specific product |
So my CSV was something like :
1 2 3 4 5 6 7 |
PRT CODE,PRD CODE, PRD NAME, ITEMS,ACT COST PRT01,Q001,Name 1,1,002.10 PRT01,Q002 Name 2,3,10.89 PRT01,Q003 Name 2,3,8.37 PRT02,Q002 Name 2,3,10.89 PRT02,Q004 Name 4,3,7.89 PRT03,Q002 Name 2,3,10.89 |
The challenge was, for each PRT CODE I had to know :
- Highest ACT COST
- Sum of ACT COST
- Select the top 5 that have Highest ACT COST
The Sausage-String way.
Can you think how you might do it in Clipper or ‘sausage-string’ way ? Probably something like that (I did not have to think too much about this)
- For each items in CSV
- Algorithm to group the PRT CODE column
- Algorithm to after group the PRT CODE, seek the Highest ACT COST column
- Algorithm to after group the PRT CODE, calculate the sum of the ACT COST column
- Algorithm to sort everything by ACT Cost Descending
- Select the First 5
Man, it would be a veeeery long script. Even on more recent languages such as C#, it would not be so simple as in PowerShell, if you know how to use it .
The PowerShell way :
- Import The CSV
- Group the PRT CODE
- Highest ACT COSTSelect the PRT CODE grouped, plus a custom table piping the group output from group-object to a sort-object descending the ACT COST value and piping to a select first 1 .
- Sum of ACT COSTSelect the PRT CODE grouped, plus a custom table piping the group output from group-object to a measure-object by SUM .
- Pipe everything to a sort-object to the Highest ACT COST
- Select Top 5
Yes, everything in one simple command line. At this point I want to thank my good friend Shay Ley for helping me to come out with this solution. The code is :
1 2 3 4 5 6 7 8 9 10 |
Import-Csv Import-Csv "c:\CSV\CSVChallenge IEXT.CSV" | Group-Object -Property 'PRT CODE' | Select-Object @{ N = 'PRT Code'; E = { $_.name } }, @{ N = 'Highest Actual Spend'; E = { ($_.group | Sort-Object 'ACT COST' -Descending | select -First 1 'ACT COST').'ACT COST ' } }, @{ n = 'Spend in Total'; e = { ($_.Group | Measure-Object -Property 'ACT COST' -Sum).Sum } } | Sort-Object -property 'Highest Actual Spend' -Descending | Select-Object -First 5 * |
You have asked for a Belly-Buster-Burger Bacon and you do eat a proper Belly-Buster-Burger Bacon
Finally: Determining if the Log and Data File are on the Same Drive
if you read my last articles Automating Your SQL Server Best Practice Reports:The Checks and Automating Your SQL Server Best Practice Reports: The Document, one of my checks was whether the data and log files for the databases are in the same drive. I just need to check whether there is at least one log and data file at the same drive. Because, in PowerShell, all data is stored as an object, this is made rather simpler than otherwise.
The trick is, after using a TSQL query to return the datafiles …
1 |
$SQL = "select filename FROM sysfiles" |
.. you just need to group the objects returned by the first part of the filename string, before the ‘:’ colon. This is the drive. If the count is > 0, there are both log and data on the same drive location; it doesn’t matter what they are. The Variable $ ItHave will store a true or false
1 2 3 |
$SQL = "select filename FROM sysfiles " $ItHave = ((Invoke-Sqlcmd -ServerInstance 'DeathStar' -Database ObiWan -Query $sql | Group-Object -Property { ($_.filename -split ':')[0] }).count) -gt 0 |
If you want to check out what the files are …
1 2 3 |
Invoke-Sqlcmd -ServerInstance -ServerInstance 'DeathStar' -Database ObiWan -Query $sql | Group-Object -Property { ($_.filename -split ':')[0] } | Select -ExpandProperty group |
Simple and clean. Classic PowerShell
Conclusions
Those were just simple examples of how you can use the features of the language to help you to improve your code and learning. Of course there is a lot of more to do it, especially in PowerShell. I suggest you download the scripts in the PowerShell community channels and check how those Jedi do them. You will learn a LOT with these guys.
Ha! By programming PowerShell the way it was intended, I’m getting the full value from it, with none of the goodness taken out. I am eating the genuine Belly-Buster-Burger Bacon with a lot of cheese and Bacon, iced Coke and French fries.
References :
- Hey, Scripting Guy! Blog
Some very good overview and deep dive into some features of PowerShell - Michael Sorens – Simple-Talk
Learn from a master - and everything about the PowerShell Community
Load comments