
Monitoring the Application with SQL Monitor: Website Activity

Using a PowerShell script that collects log data from a web server, plus a SQL Monitor custom metric, Phil Factor offers a way to check for suspicious website errors and unusual patterns of activity, right alongside your database monitoring.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

As well as being essential for giving you an overview of the SQL Server installations that you look after, SQL Monitor can also be used in ways that aren’t so immediately obvious.

One immediate advantage it has as a monitoring tool is that it isn’t part of any one server, database or application: it is independent. This means that if one or more components of a system crash, it is unlikely to prevent people from being alerted that it has happened. Also, it is difficult for anyone to maliciously disable its oversight of processes. SQL Monitor can also represent variation in a metric graphically and compare it with a baseline. It can send you alerts on that metric. Finally, it allows you to create a custom metric from any SQL batch that returns an integer. These four virtues mean that it can provide a simple but effective way to monitor an application.

In order to demonstrate this, we’ll set up a simple application, in this case an actual website, and read its logs. We can then report on usage, as well as on any problems, which are normally probing attacks. This is an important part of administering any application, but one that is generally rather tedious without automated processes to do it. Using SQL Monitor, we can simply create a custom metric to collect the web log data, on a schedule, and report it graphically.

When we see odd usage patterns, we’ll know exactly when it happened and can investigate further. For performance issues, we’ll have a direct link between application behavior and the resulting query behavior and resource constraints seen in SQL Server. For possible security issues, we get an early warning that the operations team can investigate, using more specialized tools. Where there is a service interruption, it can be remedied quickly.

Monitoring application activity, alongside the database, is a good example of how development and operations people can share their skills for mutual benefit, to get a better understanding of what is happening with an application.

Collecting the web log data

Often, web logs are used only after something goes horribly wrong, because the process of monitoring text-based logs is generally manual, and difficult to automate for continuous checks. In our case, the Apache Web Server usage logs are retained on the site for a month (30 days, in fact), zipped up using GZip. The log data for the past two days is unzipped.

The web usage log has a standard format that includes, among other things, the type of HTTP request, the IP address of the requestor, the user ID, the time the request was made, the body of the request, and the UserAgent (describing the type of browser, the identity of an indexing bot, and so on). There is also an error log, which is undated, and records all errors, failed logins and bad requests, as well as scripting errors from PHP or Perl.
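For reference, a line in Apache’s ‘combined’ log format looks something like the one below, and can be picked apart with a regular expression. This is just an illustrative sketch; your server’s LogFormat directive may arrange the fields slightly differently.

```powershell
# A sample line in Apache's 'combined' log format (illustrative only)
$line = '203.0.113.7 - - [12/Oct/2023:06:25:24 +0000] "GET /index.php HTTP/1.1" 200 5213 "https://example.com/" "Mozilla/5.0 (compatible; Googlebot/2.1)"'

# host, identity, user, timestamp, request, status, bytes, referer, useragent
$pattern = '^(?<ip>\S+) (?<ident>\S+) (?<user>\S+) \[(?<time>[^\]]+)\] "(?<request>[^"]*)" (?<status>\d{3}) (?<bytes>\S+) "(?<referer>[^"]*)" "(?<useragent>[^"]*)"$'

if ($line -match $pattern) {
    [pscustomobject]@{
        IP        = $Matches['ip']
        Time      = [datetime]::ParseExact(($Matches['time'] -split ' ')[0],
                        'dd/MMM/yyyy:HH:mm:ss', [cultureinfo]::InvariantCulture)
        Request   = $Matches['request']
        Status    = [int]$Matches['status']
        UserAgent = $Matches['useragent']
    }
}
```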

For the Apache server there is no equivalent to Windows IIS logging via ODBC, so there is no way for the Web Server to transfer the log records directly into a database. However, all the log data is accessible via FTP.

Normally, in a production system, one keeps any regular FTP transfer at arm’s length, because active FTP isn’t very secure. For that purpose, we’ll use a Windows management server in a DMZ or logical subnet. The tasks of the management server are to:

  • Interrogate the server for the date and time of the last log record and error stored.
  • FTP the necessary files.
  • Unzip any files, if required.
  • Parse the contents of the records.
  • Check the UserAgent field in the usage log records to filter out bot traffic that is merely indexing the site for search engines, so that only log records representing real website visits, and only those the database hasn’t already got, are sent to SQL Server.
  • Store the required records in the database, via a SQL Server connection.
  • Insert the required log records into a log file archive on a shared file server.

The receiving SQL Server has only to do any reporting and aggregation that is required.

Getting all scripted up

This is the type of system that was established at a time when operations people had far fewer websites and could schedule their time to do regular manual checks on all operational systems.

Such is the workload nowadays that it is a struggle to do manual checks, especially with the less-important services. However, the nature of today’s internet threats means it is no longer safe to neglect any live internet-facing website. For a system like this, which wasn’t designed for automation, we need to spend a bit of time to script this and store the relevant log records in a SQL Server database.

The PowerShell

Our PowerShell script will perform all the previously-described tasks, on a schedule. We develop it in the PowerShell ISE, then test it out at the command line, using the credentials we’ll use for the working system. Finally, we schedule it, using the same credentials. The scheduled task has no console, so it must log its activities and errors to a progress log file.

To make the logging more resilient, the process will create the log tables in the database, if they do not already exist, and stock the InputLogPageReads table with the previous thirty days’ pageviews. It also stocks the error log table (WebsiteErrors) with the current contents of the error file.
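The table definitions themselves can be very simple. A sketch of what the script might create is shown below; the column names and types here are an assumption based on the log fields described earlier, not the article’s actual schema.

```sql
-- Illustrative only: a minimal schema for the two log tables
IF OBJECT_ID('dbo.InputLogPageReads') IS NULL
  CREATE TABLE dbo.InputLogPageReads
    (
    InputLogPageReadID INT IDENTITY PRIMARY KEY,
    RequestDate DATETIME2 NOT NULL,
    IPAddress VARCHAR(45) NOT NULL,        -- long enough for IPv6
    Request NVARCHAR(2000) NOT NULL,
    StatusCode SMALLINT NOT NULL,
    UserAgent NVARCHAR(1000) NULL,
    InsertionDate DATETIME2 NOT NULL DEFAULT SYSDATETIME()
    );

IF OBJECT_ID('dbo.WebsiteErrors') IS NULL
  CREATE TABLE dbo.WebsiteErrors
    (
    WebsiteErrorID INT IDENTITY PRIMARY KEY,
    ErrorDate DATETIME2 NOT NULL,
    Client VARCHAR(45) NULL,
    ErrorMessage NVARCHAR(4000) NOT NULL,
    InsertionDate DATETIME2 NOT NULL DEFAULT SYSDATETIME()
    );
```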

Most access log activity on a small website has no value. It comes either from bots indexing the site for search engines, or from hackers tirelessly trying, with automated probes, to break into the system. We filter out all of this before it gets to the database; otherwise, you will get an inflated idea of the popularity of the site.

The current PowerShell script uses the sqlserver module, so you’d need to install that first, and it borrows some modified functions from other authors. Other than that, all you need is the database itself, because the script creates the tables if they do not already exist.

The first time it runs, it takes a while to collect what it can of the existing logs. I’ve set it to read the last thirty days’ logs when starting up from scratch. For a large site, this will need changing to take smaller gulps.
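To give an idea of the shape of the thing, here is a heavily simplified sketch of the collection stage. It is not the production script: the server name, database, paths, credential file and table columns are all placeholders, the bot filter is crude, and a real version would batch the inserts rather than call Invoke-Sqlcmd once per row.

```powershell
# Simplified sketch of the collection stage; all names and paths are placeholders
Import-Module sqlserver    # provides Invoke-Sqlcmd

$ftpUri    = 'ftp://www.example.com/logs/access.log.gz'
$workDir   = 'C:\WebLogs'
$sqlServer = 'MyManagementServer'
$database  = 'WebLogging'

# Credential previously saved, encrypted, in this user's area (see below)
$ftpCred = Import-Clixml "$env:USERPROFILE\ftpcred.xml"

# Find the most recent record we already have, so we only add new ones
$lastRecord = (Invoke-Sqlcmd -ServerInstance $sqlServer -Database $database -Query `
  'SELECT COALESCE(MAX(RequestDate), DATEADD(DAY,-30,GETDATE())) AS LastDate FROM dbo.InputLogPageReads').LastDate

# Fetch the compressed usage log over FTP
$gzFile = Join-Path $workDir 'access.log.gz'
$web = New-Object System.Net.WebClient
$web.Credentials = $ftpCred.GetNetworkCredential()
$web.DownloadFile($ftpUri, $gzFile)

# Unzip it with GZipStream
$logFile = Join-Path $workDir 'access.log'
$in  = [IO.File]::OpenRead($gzFile)
$gz  = [IO.Compression.GZipStream]::new($in, [IO.Compression.CompressionMode]::Decompress)
$out = [IO.File]::Create($logFile)
$gz.CopyTo($out); $out.Dispose(); $gz.Dispose(); $in.Dispose()

# Parse each line, drop bots and records we already have, insert the rest
$pattern = '^(?<ip>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<request>[^"]*)" (?<status>\d{3}) \S+ "[^"]*" "(?<useragent>[^"]*)"$'
Get-Content $logFile | ForEach-Object {
    if ($_ -match $pattern -and $Matches['useragent'] -notmatch 'bot|crawler|spider') {
        $when = [datetime]::ParseExact(($Matches['time'] -split ' ')[0],
                    'dd/MMM/yyyy:HH:mm:ss', [cultureinfo]::InvariantCulture)
        if ($when -gt $lastRecord) {
            Invoke-Sqlcmd -ServerInstance $sqlServer -Database $database -Query @"
INSERT INTO dbo.InputLogPageReads (RequestDate, IPAddress, Request, StatusCode, UserAgent)
VALUES ('$($when.ToString('yyyy-MM-ddTHH:mm:ss'))', '$($Matches['ip'])',
        '$($Matches['request'] -replace "'","''")', $($Matches['status']),
        '$($Matches['useragent'] -replace "'","''")');
"@
        }
    }
}
```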

This PowerShell script is run every few minutes on the scheduler on the Windows management server. The UserID that runs the scheduled task has restricted rights to network shares, and needs database read/write access. It needs an FTP account that can access the remote FTP server. The problem I found on Windows Server is that the FTP utility uses active FTP, which is usually prevented by the firewall. A Windows 10 machine works fine.

I create the Windows account that will run the PowerShell process, part of which is to perform the FTP transfer. Then, I log in to the management server using the credentials of that account holder and run the script interactively in the PowerShell ISE until all problems are fixed. This means that I can ensure that the IDs and encrypted passwords are saved securely in the user’s area, and check that all the access rights are correct. If you change a password, you will need to delete the old password file in the user’s area and run the script interactively again.
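A common way to do that last part in PowerShell is to save the credential with Export-Clixml, which encrypts the password using DPAPI so that only that user, on that machine, can read it back. Something along these lines (the file name is just an example):

```powershell
# Run once, interactively, logged in as the account that will own the scheduled task.
# Export-Clixml encrypts the password with DPAPI for this user on this machine.
Get-Credential -Message 'FTP account for the web server' |
    Export-Clixml -Path "$env:USERPROFILE\ftpcred.xml"

# The scheduled script then reads it back with:
$ftpCred = Import-Clixml -Path "$env:USERPROFILE\ftpcred.xml"
```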

Once everything is running well interactively, it’s time to run it from the PowerShell command line. If it runs there, then it will run in the scheduler with the same UserID.
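Registering the schedule itself is a one-off job. As a sketch, using the ScheduledTasks module that ships with modern versions of Windows (the script path, task name, account name and ten-minute interval here are assumptions, not prescriptions):

```powershell
# Run the collection script every ten minutes under the dedicated account
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' `
             -Argument '-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Get-WebLogs.ps1'
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
             -RepetitionInterval (New-TimeSpan -Minutes 10)
# Supply the dedicated account's password when prompted to run this one-off registration
Register-ScheduledTask -TaskName 'Collect web logs' -Action $action -Trigger $trigger `
    -User 'DOMAIN\WebLogCollector' -Password 'account-password-here' -RunLevel Limited
```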

The SQL Monitor custom metrics and alerts

The script for the Website Pageviews custom metric simply counts the number of log records inserted into the InputLogPageReads table in the last 10 minutes. In fact, I chose to count the number of records in the ten-minute interval between 30 and 20 minutes ago, just to be certain of the number of records, but this is easy to change.
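The batch behind the metric is nothing more than a count. Assuming the RequestDate column from the table sketched earlier (use whatever date column your own table has), it would look something like this:

```sql
-- Pageviews in the ten-minute window between 30 and 20 minutes ago
SELECT COUNT(*)
  FROM dbo.InputLogPageReads
  WHERE RequestDate >= DATEADD(MINUTE, -30, GETDATE())
    AND RequestDate <  DATEADD(MINUTE, -20, GETDATE());
```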

A second custom metric, called Website errors or similar, detects whether any errors have occurred over the same period. Again, you can change the latency to whatever you prefer.
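Again assuming the sketched column names, that batch is the same idea pointed at the errors table:

```sql
-- Errors recorded in the same ten-minute window
SELECT COUNT(*)
  FROM dbo.WebsiteErrors
  WHERE ErrorDate >= DATEADD(MINUTE, -30, GETDATE())
    AND ErrorDate <  DATEADD(MINUTE, -20, GETDATE());
```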

Both custom metrics are scheduled to run every 10 minutes. Here is the SQL Monitor configuration screen for the Website Pageviews custom metric:

The method of alerting, and the type of alerts required, is best left as a decision for the administrator. This is where the virtues of SQL Monitor show themselves. I like to know if there is no activity on the site for a period, because that suggests that the site has gone offline, so I set up one alert for no activity:

I also like to know of a sudden spike in usage, so I set up another alert for that. Often, these sudden spikes are just happenstance. On one occasion, the website I was administering was mentioned on prime-time television, and the site got over a million pageviews, but that is another story.

Often, it is a deliberate attack on the site. Sadly, a lot of internet processes are dedicated to trying to find vulnerabilities, and these are often quite subtle. These attacks show up in both the error logs and the usage logs and they are often interesting to check on. Most will show up in the database records and if you know the time they happened, the full logs are in the local archive and easily inspected with a text editor for forensics. These attacks generally probe the LAMP stack and show up a wonderland of PHP and MySQL vulnerabilities.

I keep what is often called a ‘tethered goat’, which is an apparently-useful and innocuous website, which is there with enticing signs of vulnerabilities, though in reality completely isolated. It gets a great deal of attention from the dark web.

Monitoring and reporting

Once all this is set up, you can look at the custom metrics in SQL Monitor either in an analysis graph, or you can set them up more permanently in a report, which can then be emailed to you.

I much prefer emails for regular checks, as they take less of my time. By specifying the two metrics, each as a tile in a report, you will quickly be able to see the general activity going back up to a month. Here is an example report showing the result of three days of monitoring:

The spike on Sunday morning was not the result of an attack but was caused by the indexing of the entire site by a robot that was not properly marked in the UserAgent. It is very easy to explore this either by inspecting the logs in the file repository or in the SQL table.

It is probably more interesting to see variations of the ten-minute metrics, over a day, like this:

Querying the logs for the details

To get general reports on the logs in SQL, and to tidy up the records in the SQL Server database, there are a few obvious SQL queries that can be made.
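For example (once more assuming the column names from the earlier table sketch), a daily pageview summary, a hunt for the noisiest IP addresses, and a routine purge of rows older than the month held in the file archive might look like this:

```sql
-- Pageviews per day for the last month
SELECT CAST(RequestDate AS date) AS TheDay, COUNT(*) AS PageViews
  FROM dbo.InputLogPageReads
  WHERE RequestDate > DATEADD(DAY, -30, GETDATE())
  GROUP BY CAST(RequestDate AS date)
  ORDER BY TheDay;

-- The most active IP addresses over the last 24 hours, useful when chasing a spike
SELECT TOP (20) IPAddress, COUNT(*) AS Requests
  FROM dbo.InputLogPageReads
  WHERE RequestDate > DATEADD(HOUR, -24, GETDATE())
  GROUP BY IPAddress
  ORDER BY Requests DESC;

-- Tidy up: remove anything older than the month kept in the file archive
DELETE FROM dbo.InputLogPageReads WHERE RequestDate < DATEADD(DAY, -30, GETDATE());
DELETE FROM dbo.WebsiteErrors     WHERE ErrorDate   < DATEADD(DAY, -30, GETDATE());
```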

Conclusion

When you are setting up monitoring for any database-driven application, it pays to be comprehensive. You should monitor the application as well as the server and database. This not only gives more, and often earlier, warning of performance problems, but also alerts you quickly when something breaks.

My boss used to tell me he didn’t care what happened to my ***** application, as long as the shopping cart worked well. That inspired me to monitor the shopping cart so well that I was able to display a personalized welcome message on his screen whenever he surreptitiously tested the cart.

Monitoring not only gives you an unparalleled insight into what is happening in the very bowels of the database server but must also warn you when things are going unpleasantly wrong elsewhere in the application. Before you start congratulating yourself that the database server is coping well with all this extra traffic, you need to be sure that the extra traffic isn’t being caused by someone siphoning off all your customer data from the application!

A good monitoring strategy will allow you to remedy problems quickly, and the data you collect should broaden the evidence you can provide as to the likely cause, which is essential for effective security. This requires more than just monitoring the built-in server metrics. You need to monitor on a broad base, adding custom metrics for processes right across the application. It pays to monitor and be alerted.

Tools in this post

Redgate Monitor

Real-time SQL Server and PostgreSQL performance monitoring, with alerts and diagnostics
