There are a whole range of reasons you should monitor your SQL Server estate, from the essentials that support firefighting through to advanced modelling to help with proactive future planning.
However, there are a few steps you need to take in order to ensure the monitoring is effective. In this post, we’ll show you what they are and provide some tips on how Redgate’s SQL Monitor can make those steps easier.
If you’d prefer to watch this as a webinar instead, you can also see the original presentation of this content, which was part of Redgate Streamed.
But first …
Before you get started, you need to ensure you understand your systems and your business. The business goals of the company you work for will affect your priorities, so you need to be fully aware of the activities and processes that will have an impact on your estate. For example, are there seasonal upturns or downturns? Are there particular privacy issues in play? How does demand fluctuate during the day?
You also need to decide on buy-versus-build early on in your decision-making process, as this will influence where you invest your money and which resources are required. Your decision to purchase a monitoring tool will require a different planning approach to one that you build in-house.
One important point to bear in mind, whichever route you go down, is that you need to separate your monitoring tool from what is being monitored. If you install the tool on one of the servers you want to monitor, you'll add to that server's load, and the extra load will itself show up in the metrics you're collecting, skewing your results. Worse, if that server goes down, so does your monitoring, and the tool can neither alert you to the outage nor help you fix it.
Finally, you must do capacity planning on your monitoring server to ensure it’s powerful enough to run your tool and deal with the volume of data coming in from your estate – an underpowered machine will prevent you from monitoring efficiently. You should also plan for your monitoring to grow with your business, including upgrading the machine it’s hosted on as your estate grows.
With that said, let’s get into the 7 steps.
1. Define the servers under management
The first step to effective SQL Server monitoring is to find all your servers. Make sure you’re aware of all of the SQL Server instances in use at your organization, which means physically looking for machines in different departments as well as those hosted in the cloud.
When you know what servers you have, where they’re located, and who owns them you can begin to define the purposes they serve. Through this, you can group servers by usage and define service level agreements (SLAs) with the business to help prioritize monitoring requirements.
To help do this you can use the Microsoft Assessment and Planning Toolkit (MAP), which will give you a report on the servers in your environment to help you track down their location and who is responsible for them.
You can also use DBATools and PowerShell to explore your system and identify SQL Servers on the network.
Tip: Let your network team know before you use these tools, so they know you’re doing the work and don’t think there’s an attack on the network.
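As a sketch of the discovery step, the dbatools module the post mentions includes Find-DbaInstance, which can scan a domain or a list of machines for SQL Server instances. This assumes dbatools is installed (Install-Module dbatools) and you have permission to scan; the servers.txt file name is just an example.

```powershell
# Discover SQL Server instances registered in Active Directory
# (requires dbatools and appropriate domain/scan permissions)
Find-DbaInstance -DiscoveryType Domain |
    Select-Object ComputerName, InstanceName, Port, Availability

# Or probe a known list of machines directly and save the results
Find-DbaInstance -ComputerName (Get-Content .\servers.txt) |
    Export-Csv -Path .\discovered-instances.csv -NoTypeInformation
```

Saving the output to CSV gives you a starting inventory you can annotate with owners and SLAs.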
2. Document your servers
Next, document your servers’ properties to help with capacity planning, deployment, and ongoing management. You can use this information to ensure the servers are set up correctly and being used efficiently.
Also document the database properties on your servers, such as who owns them, how they’re set up, and who has access, to make sure the permissions for sensitive data are correctly set.
Finally, document the location and types of server you’re running, such as if there are virtual machines running in Azure or AWS, or if they’re in a specific data center. This matters because there are different monitoring requirements depending on how and where the servers are hosted.
To document your servers you can use PowerShell and DBATools to connect to your servers and run queries to pull this data together. You can also run T-SQL scripts against your system tables to pull out the information you need from your servers.
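To illustrate, the following dbatools sketch pulls instance properties, database properties, and role membership into CSV files. The instance names are hypothetical placeholders; exact property names can vary between dbatools versions.

```powershell
# Assumes dbatools is installed; replace with your own instance names
$instances = 'SQL01', 'SQL02\REPORTING'

# Instance-level properties: version, edition, collation, and so on
Get-DbaInstanceProperty -SqlInstance $instances |
    Export-Csv .\instance-properties.csv -NoTypeInformation

# Database-level properties: owner, recovery model, size, compatibility level
Get-DbaDatabase -SqlInstance $instances |
    Select-Object SqlInstance, Name, Owner, RecoveryModel, SizeMB, Compatibility |
    Export-Csv .\database-properties.csv -NoTypeInformation

# Who has access: database role membership, useful for checking sensitive data permissions
Get-DbaDbRoleMember -SqlInstance $instances |
    Select-Object SqlInstance, Database, Role, UserName |
    Export-Csv .\db-role-members.csv -NoTypeInformation
```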
3. Track SQL Agent jobs
At a basic level you need to know whether your SQL Agent jobs are succeeding or failing, and you’ll usually be most interested in the failures so you can understand what went wrong.
You can use a tool to track job successes and failures automatically, or you can manually query the job history tables in msdb.
Beyond successes and failures you can also track jobs that are running longer than usual. For example, if a re-indexing job normally takes 46 minutes but is still running after 65 minutes, you should set up an alert so you can investigate what’s taking so long.
This is important, because if jobs overrun a ripple effect can cause additional problems such as maintenance tasks running into core business hours.
At a higher level, understanding the metrics about your SQL Agent jobs, such as average run time or frequency of failures, helps you take a more proactive approach to improving performance.
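As one way to query that history manually, the sketch below uses Invoke-DbaQuery from dbatools to pull failed job runs from msdb for the last day (in sysjobhistory, run_status 0 means failed and run_date is stored as an integer yyyymmdd). The instance name is a placeholder.

```powershell
# Failed SQL Agent job runs in the last 24 hours, straight from msdb
$query = @"
SELECT j.name AS JobName, h.step_name, h.run_date, h.run_time, h.message
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
WHERE h.run_status = 0  -- 0 = failed
  AND h.run_date >= CONVERT(int, CONVERT(char(8), DATEADD(DAY, -1, GETDATE()), 112));
"@
Invoke-DbaQuery -SqlInstance 'SQL01' -Query $query
```

The same history tables hold run durations, so you can extend the query to compare each job’s latest run against its average and flag overruns.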
4. Identify disk space, capacity, and growth rates
Disk space always causes headaches for DBAs, who often find themselves firefighting when unexpected growth causes disks to fill up at inconvenient times.
When you know how and why disks are growing, and specifically which files on the disk are growing, you can take action to identify the cause and make the changes required to manage the issue.
By efficiently tracking your disks, files, and file groups you can ensure you get alerted to changes as early as possible to give you time to react.
For example, you can query the Windows Management Instrumentation (WMI) service for disk space and file information at the operating system level. Through T-SQL queries you can check what space SQL Server is allocating to ensure it matches requirements. PowerShell can then bring these two pieces of information together into a single report.
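A minimal sketch of both halves: the OS view via CIM/WMI and the SQL Server view via dbatools. The instance name is hypothetical; DriveType 3 filters to local fixed disks.

```powershell
# OS-level view: size and free space per local fixed disk
Get-CimInstance -ClassName Win32_LogicalDisk -Filter "DriveType = 3" |
    Select-Object DeviceID,
        @{ N = 'SizeGB'; E = { [math]::Round($_.Size / 1GB, 1) } },
        @{ N = 'FreeGB'; E = { [math]::Round($_.FreeSpace / 1GB, 1) } }

# SQL Server view: data and log file sizes plus their growth settings
Get-DbaDbFile -SqlInstance 'SQL01' |
    Select-Object Database, LogicalName, PhysicalName, Size, Growth, GrowthType
```

Joining the two by drive letter in PowerShell gives you the single report described above, and scheduled snapshots of it over time give you growth rates.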
5. Understand system resources: CPU, I/O and memory
As well as disk capacity, monitoring resource usage is also key to maintaining a healthy SQL Server estate.
Simply monitoring your system metrics isn’t enough. It’s important to understand what’s normal for your servers and to identify where abnormal behaviour is occurring, which could be a sign of a deeper problem.
To manually monitor system resources you can use Windows Performance Monitor (Perfmon), through which you can customize collection frequency and set what information you want to gather from the server.
You can also use T-SQL to query Dynamic Management Views (DMVs) and collect specific pieces of information about your server, supplemented with community stored procedures such as sp_whoisactive for activity monitoring.
Extended Events in SQL Server will help you identify and troubleshoot performance issues on your servers, while Query Store can aggregate information about your queries at a database level to help you catch problems.
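To make the Perfmon and DMV options concrete, here is a sketch using the built-in Get-Counter cmdlet for three classic counters, plus a wait-stats query against sys.dm_os_wait_stats via dbatools. The instance name and sampling settings are illustrative only.

```powershell
# Sample three classic Perfmon counters every 15 seconds, five times
Get-Counter -Counter '\Processor(_Total)\% Processor Time',
                     '\Memory\Available MBytes',
                     '\PhysicalDisk(_Total)\Avg. Disk sec/Read' `
            -SampleInterval 15 -MaxSamples 5

# Top waits from the DMVs: a quick pointer to where SQL Server is spending time
Invoke-DbaQuery -SqlInstance 'SQL01' -Query '
    SELECT TOP (10) wait_type, wait_time_ms, waiting_tasks_count
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;'
```

Collecting these samples regularly is what lets you establish the baseline of “normal” that abnormal behaviour is measured against.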
6. Find errors
Errors clearly matter when managing SQL Server. Deadlocks, blocking processes and long-running queries are the most common performance issues that arise, and you’ll need to deal with them regularly. These can slow down or completely stop your databases from running, so keeping on top of them is a priority.
You can use a series of tools to monitor errors, such as SQL Agent to capture errors and send alerts, which can be configured using PowerShell. You can also use DBMail to receive error notifications via email, but be aware that a high volume of alerts can crash your mail server.
DMVs, as previously mentioned, can help when you query them in the right order, as can Extended Events, which can give you a rich amount of information as long as you ask the right questions.
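As a starting point for error hunting, the sketch below scans the SQL Server error log for high-severity errors with dbatools, and reads deadlock data from the always-on system_health Extended Events session. The instance name and the one-day window are placeholders.

```powershell
# High-severity errors (17-25) from the last day of the error log
Get-DbaErrorLog -SqlInstance 'SQL01' -After (Get-Date).AddDays(-1) |
    Where-Object Text -match 'Severity: 1[7-9]|Severity: 2[0-5]'

# Deadlock reports captured by the built-in system_health session
Invoke-DbaQuery -SqlInstance 'SQL01' -Query "
    SELECT CAST(target_data AS xml) AS SessionData
    FROM sys.dm_xe_session_targets AS t
    JOIN sys.dm_xe_sessions AS s ON s.address = t.event_session_address
    WHERE s.name = 'system_health' AND t.target_name = 'ring_buffer';"
```

The XML returned by the second query contains xml_deadlock_report events you can shred to find the competing sessions and statements.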
7. Track and tune your monitoring
Finally, you need to monitor your monitoring to make sure you’ve got the right tools for the right jobs, and you’re getting actionable information out of your monitoring tool.
For example, if you’re getting more alerts than you can read, you’ll probably end up ignoring them all. Instead, you need to tune your monitoring so you get the right alerts when you need them, rather than drowning in noise and risking missing something important.
Furthermore, you should be able to take action from the alert, whether that’s getting details through email or other notification channels, or getting linked directly to the problem so you can see what’s wrong and how you can fix it.
Now you know what you need to do to achieve successful monitoring of your SQL Servers, it’s time to grab a tool to automate as much of the process as possible.
With Redgate’s SQL Monitor you get a single pane of glass to manage your entire estate. It scales as you add more servers and can be adapted to meet your specific monitoring needs. You can diagnose issues and find the root cause in seconds, and you can understand the biggest performance issues to help make improvements. With alerts that can be tuned and customized to your needs, you can find and fix issues before your users even notice.
Download your free trial today to get started.