
Peering into and Parsing Application Log Files

How to use PowerShell cmdlets, such as Select-String, to glance at the contents of the application logs, or use them in conjunction with Regular Expressions to sift through log files in detail looking for particular types of error.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

When you are using any application or tool regularly, you need to know if the logs are showing up any problems. This is particularly handy if you are hitting configuration difficulties. The log can become essential with the tools that use a service because the information it provides gives you your best chance of tackling tricky installation tasks. The logs also help with checking for warnings, errors and critical issues in any automated process.

None of us enjoy looking through log files. We prefer to be alerted only if an application is hitting issues, for example because it can’t connect to a server, or a required file is missing. Often, in an automated process, receiving an alert based on errors reported in the log is the only way we can be warned. I’ll show you how to peer into application log files, and dissect their contents in detail, using the logs for various Redgate tools as examples.

The location and format of Redgate logs

Redgate products produce several log files. There is good news and bad news about this.

The good news is that the logs for tools with a user interface are stored consistently, in a subdirectory for each tool. This is the $env:localappdata\Red Gate\Logs directory. You can see what is in your user area by executing
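Something like the following will do it (the listing you get will depend on which Redgate tools you have installed):

```powershell
# list the per-tool log subdirectories in your user area
dir "$env:localappdata\Red Gate\Logs"
```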

The tools that run as a service, such as SQL Monitor and SQL Clone, can’t rely on having an associated user area, so their logs are often in a different directory. For example, the SQL Monitor service logs are in %programdata%\Red Gate\Logs\SQL Monitor.

Location of the SQL Monitor log files

The Redgate documentation site maintains a list of the log file locations for all the products. It is reasonably easy to inspect a log with a text editor, and to search it for problems. SQL Clone and SQL Monitor will even zip their logs up and download them for you. Because the location is consistent, it is easy to scan through the logs looking for possibly significant words such as ‘Warning’ or ‘Error’.

The bad news is that almost every tool uses a different format. For accurate monitoring of problems, you need to parse a log file to separate out each record, where a record may extend over several lines, and then select according to the contents of a particular column such as error-severity. This gives you the problem of having to create a different parsing system for each tool. I’ll be showing how to get around this sort of problem, but before I do so, I’ll look at some of the traditional ways of inspecting logs.

Peering at log files at arm’s length

In PowerShell, the Get-Content and Select-String cmdlets are your friends when you are looking for something in application logs. They will tackle almost any sort of complex search, especially if you are comfortable with regex strings. For Windows system event logs you don’t need them, because you have Get-WinEvent.

Searching through a single log file

For our purposes, we’ll start by choosing a single file. To play along, you’d need to replace this with an existing log file on your workstation. For the time being, I’ll just show the sort of simple searches you’d use in any text file.
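For illustration, I’ll use a SQL Prompt log; the path below is an assumption, so substitute a log file that you actually have:

```powershell
# the path is an assumption - point this at a log file on your workstation
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
Get-Content $LogFile      # display the entire file
```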

You can page through a long file (but not in the IDE).
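One way of doing this is to pipe the file into Out-Host, again assuming a hypothetical path in $LogFile:

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
Get-Content $LogFile | Out-Host -Paging   # -Paging isn't supported in the ISE
```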

The last line prompts you to page through the file:

Paging through a log file

You can just view the last five lines, or whatever you specify, as these will usually be the most recent.
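The -Tail parameter of Get-Content does this (the path, as before, is an assumption):

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
Get-Content $LogFile -Tail 5   # just the last five lines of the file
```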

This technique becomes much more useful with an application such as the SQL Clone service or SQL Monitor while it is actively writing to the log. Open up a new PowerShell window and execute:
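A sketch of the command, assuming a SQL Monitor log path (the filename is my guess; use whatever log the service is writing to):

```powershell
# the path and filename are assumptions - use your service's live log here
$LogFile = "$env:programdata\Red Gate\Logs\SQL Monitor\Web.log"
Get-Content $LogFile -Wait -Tail 10   # keeps polling and echoes new lines
```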

Get-Content checks the file once each second and outputs new lines if present. You can just keep it going to see how the service is running.

You can look at the lines that have ‘error’ on the line by executing this:
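With Select-String, which searches case-insensitively by default (the path is an assumption):

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
Select-String -Path $LogFile -Pattern 'error'
```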

Or you can specify several alternatives…
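…by using regex alternation in the pattern:

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
Select-String -Path $LogFile -Pattern 'error|warning|fatal'
```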

You can just get the five last lines
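…of the matches, by piping the results through Select-Object:

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
Select-String -Path $LogFile -Pattern 'error' | Select-Object -Last 5
```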

Even better, you can see more of the record that contained the word ‘error’ by displaying the lines before and after the line. This has specified one line before, and three after:
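The -Context parameter takes the number of lines before and after each match:

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
# one line of context before each match and three after
Select-String -Path $LogFile -Pattern 'error' -Context 1, 3
```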

If you know that your record is on a particular line, say line 45, you can display it, guessing that it will be 30 lines long.
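A sketch using Select-Object to skip to the line and take a slice:

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
# show thirty lines starting at line 45 (skip the first 44)
Get-Content $LogFile | Select-Object -Skip 44 -First 30
```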

Naturally, you can use Out-Gridview to display the lines:
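For example (this needs Windows PowerShell, where Out-GridView is available):

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
Get-Content $LogFile | Out-GridView -Title 'The log'
```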

The contents of a log in grid view

Scanning several log files

Some logs are extraordinarily garrulous, whereas others only report an emergency. Normally, you need to scan several files to identify the log file that holds the information you need. The path can be a wildcard, so you can look through several files within a directory. You can then use the Select-String cmdlet, which is designed to do as much as possible of what grep does, to scan for a string:
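A sketch of such a scan across the whole user-area log directory:

```powershell
# scan every log of every tool in the user area for the word 'error',
# with one line of context before and after each match
Select-String -Path "$env:localappdata\Red Gate\Logs\*\*.log" `
    -Pattern 'error' -Context 1, 1
```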

This gives you the line before and after, as well as the line that contained the word ‘error’. You get a MatchInfo object for each match which is displayed as the filename and line.

You can use this to provide a slightly easier layout:
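…by picking just the properties of the MatchInfo object that you want, and formatting them as a table:

```powershell
Select-String -Path "$env:localappdata\Red Gate\Logs\*\*.log" -Pattern 'error' |
    Select-Object Filename, LineNumber, Line |
    Format-Table -AutoSize
```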

scanning a set of log files for errors

You’ll notice that you get false positives in our example because the word ‘error’ appears rather a lot. What you need to do instead is search for an error category in the ‘Type’ field. In this particular log, you could do this by looking for the '[Error]' string.
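The square brackets would otherwise be interpreted as a regex character class, so use -SimpleMatch to search for the literal string (the path is an assumption):

```powershell
$LogFile = "$env:localappdata\Red Gate\Logs\SQL Prompt 10\SQL Prompt.log"
# -SimpleMatch treats the pattern as a literal string, not a regex
Select-String -Path $LogFile -Pattern '[Error]' -SimpleMatch
```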

A SQL Prompt Error

However, even if you specify a context of preceding and following lines, you are a bit stuck if the value for the field stretches over several lines.

Parsing a log

To query logs, you need to be able to tease out specific fields. You might, for example, choose to select only those records at the Warning or Error log level. These are generally indicated as follows:

Log Level Value Description
Trace 0 Used only in development work, to chase bugs and investigate performance problems. These messages may contain sensitive data, such as passwords, so they are disabled outside development.
Debug 1 For debugging and development.
Information 2 Tracks the application’s flow of processes (equivalent to ‘Verbose’ in PowerShell).
Warning 3 For abnormal, odd or unexpected events and conditions that don’t actually cause the app to fail.
Error 4 For errors and exceptions that can’t be handled, but which indicate a failure in the current operation or request rather than an app-wide failure.
Critical 5 For failures that stop the application dead.
None 6 Specifies that no messages should be written.

You may want to look at just specific events by name or by ID. You generally use a filter to inspect just the message or details field that you need to see, because log entries can be long, especially when the details field often dumps an exception of unpredictable length.

The way you can parse a log depends on the way that the date is written, and the delimiters used between fields. There is no standard format for ‘structured logging’. The Redgate tools use a rich variety of formats, so a consistent way of listing recent errors in any of the installed Redgate products is just about impossible. For example, some logs are used for just one day, which makes retention periods easier to manage, but the downside is that they only record the time in the log and not the date.

Using Select-String with Regex Named Capturing Groups

The technique I use to deal with the variety of formats is to use the same way of parsing the logs, whatever tool produced it. I just vary the regex that I use to split the text of the log into columns and rows.

The Import-Log PowerShell function that we’ll use to parse the logs is provided below. It supplies the Select-String cmdlet with two mandatory parameters: the path to the log file(s) and a regex ‘string splitter’, which varies by tool. This regex uses named ‘capture groups’ to extract each part of the log record that we need, such as the date, the error number, log level, error details and so on. The last and most troublesome part of the log entry is the one that contains the details: it can run to several lines, so I look for either the start of the next record (usually a date field) or the end of the file.

There is a third optional parameter, a scriptblock filter that assumes that you want only the Warning, Error or Critical log entries. However, you can change that to whatever you want by providing the scriptblock as a parameter, as I’ll be showing later in this article.

Having imported the contents of the log files, Import-Log uses the regex splitter we provide to parse it into columns and rows. It displays a record only if it matches our scriptblock filter. In the example cmdlet, I just output a string because it is easier to demonstrate, but the cmdlet can easily output an object, containing an array of log records, in the same way that Get-WinEvent does.
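Here is a minimal sketch of such an Import-Log function. The parameter names are my choices, and the default filter assumes that your regex splitter captures a group named ‘Level’:

```powershell
function Import-Log
{
    [CmdletBinding()]
    param (
        # path to the log file(s); wildcards are allowed
        [Parameter(Mandatory)][string]$LogFilePath,
        # regex with named capturing groups that splits the log into records
        [Parameter(Mandatory)][string]$RegexSplitter,
        # filter applied to each parsed record; by default, problems only
        [scriptblock]$Filter = { $_.Level -in 'Warning', 'Error', 'Critical' }
    )
    # the names of the capturing groups become the columns of each record
    $Columns = [regex]::new($RegexSplitter).GetGroupNames() |
        Where-Object { $_ -notmatch '^\d+$' }
    Get-ChildItem $LogFilePath | ForEach-Object {
        # read the whole file as one string so that records can span lines
        Get-Content $_.FullName -Raw |
            Select-String -Pattern $RegexSplitter -AllMatches |
            ForEach-Object { $_.Matches } | ForEach-Object {
                $TheMatch = $_
                $Record = [ordered]@{}
                foreach ($Column in $Columns)
                { $Record[$Column] = $TheMatch.Groups[$Column].Value.Trim() }
                [pscustomobject]$Record
            }
    } | Where-Object $Filter
}
```

Because each record comes out as an object with one property per capturing group, you can pipe the results on to Format-Table, Out-GridView or a further Where-Object.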

The output of the Import-Log log parser

The Regex strings to slice and dice log entries

There is a different regex string for every Redgate product. In a similar way, you can parse any other long, semi-structured text file. Here are the regexes required for a few different Redgate tools:
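To show the shape of such a splitter, here is a hypothetical one for a log whose records start with an ISO-style date followed by a bracketed level; the format is my assumption, not any particular Redgate tool’s:

```powershell
# records like:  2023-04-01 10:22:31.456 [Error] details, maybe over
# several lines. The lookahead stops at the next dated line or end of file.
$TheSplitter = @'
(?m)^(?<Date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+\[(?<Level>\w+)\]\s+(?<Details>(?s:.*?))(?=^\d{4}-\d{2}-\d{2} |\z)
'@
```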

These regexes are similar in that they all look for a date at the start of a line to mark the start of a record. The regex must be picky about what constitutes the date, because dates can turn up in the stack dumps from an exception, or in a description. Each field value is captured in a named capturing group, and the names you choose for the capturing groups become the keys of the resulting records. The body of the log entry can be several lines long, so we pick up everything until we either hit another date or the end of the string. For this, we use a positive lookahead.

Filtering for just the entries you want

The filter for the cmdlet is passed to it as a scriptblock (in the same way it’s done in the Where-object cmdlet). It can be changed to allow for other queries. By default, it looks for only those records of the Warning, Error or Critical Level, but you can edit it to make it specify whatever you like.

You can also override this default by providing your own scriptblock filter as a parameter. Here, for example, you might want to search all log records but return only those where the string 'StringActiveStyleName' is mentioned in the details section of the record. Yes, I did once need to do just that, when searching through the SQL Prompt logs, but it is a long story that would appeal only to geeks.
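A sketch of such a call, assuming the Import-Log function described above and a splitter in $TheSplitter that captures a group named ‘Details’ (the path is also an assumption):

```powershell
Import-Log -LogFilePath "$env:localappdata\Red Gate\Logs\SQL Prompt 10\*.log" `
    -RegexSplitter $TheSplitter `
    -Filter { $_.Details -like '*StringActiveStyleName*' }
```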

All you need to do is to specify what you need to filter. To filter on two conditions, you add an -or condition between two Boolean expressions.

Here is code to look through the current SQL Monitor Logs for a specific network error reported in the details. You’ll need to set it to the right path for your SQL Monitor logs.
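A sketch of what that looks like; the path, the splitter variable and the error text are all assumptions to be replaced with your own:

```powershell
Import-Log -LogFilePath "$env:programdata\Red Gate\Logs\SQL Monitor\*.log" `
    -RegexSplitter $TheSplitter `
    -Filter { $_.Level -eq 'Error' -and
              $_.Details -like '*network*' }
```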

Normally, you’ll only want to look through recent logs because this type of ‘brute-force’ search is never quick. It can also pay to do a quick search through the contents of a file for the specific string you are looking for (a -Like search is quicker than a regex search) before you go through the labor of parsing the file into records.
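For instance, a rough pre-filter along these lines (the path and search string are assumptions) lets you parse only the files that contain the string at all:

```powershell
# cheap pre-check: only worth parsing files that mention the string at all
Get-ChildItem "$env:programdata\Red Gate\Logs\SQL Monitor\*.log" |
    Where-Object { (Get-Content $_.FullName -Raw) -like '*network*' }
```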

Conclusions

Like fire-extinguishers, parachutes or seatbelts, one only appreciates application logs when things go wrong. With GUI-based applications, there are plenty of ways to alert the user when things go wrong. However, in networks where there are many scripted processes, such as in DevOps database development, it pays to have ways of being alerted quickly to warnings or errors that fall below the obvious level of an email or Slack message. Here, a PowerShell process is ideal. When you are installing a complex tool into an unusual setting, the log can save a lot of time.

Tools in this post

SQL Monitor

Real-time SQL Server and PostgreSQL performance monitoring, with alerts and diagnostics


SQL Toolbelt

The industry-standard tools for SQL Server development and deployment.
