
Using Operations Manager Reports to Validate Your Uptime

Operations Manager has a number of reports to help you monitor the uptime of your applications, but reporting can be difficult to learn until you understand all the different options, the possible parameters, and the way the Operations Manager health model is structured. First, you need a clear idea of how your organisation defines 'uptime'. Then you can start your reports from any of the views in the Monitoring tab, and add or remove objects to get the report you need. Thomas LaRock explains...


I often see statements written in requirements documents, such as “we require the servers to have 99.999% uptime”. And then I will see people fall over themselves setting up expensive technology to meet this criterion. The truth of the matter is the percentage is meaningless unless you know precisely how ‘Uptime’ is defined.

Let’s examine the opposite of Uptime, its evil twin called Downtime.  Downtime is simply the amount of time that something is not available. It means that the user cannot access the resource they want. So, if it is a file on a network share somewhere and they cannot access that share, then the system is “down”. Of course, the file server could be up and running, but that is little consolation to the end user trying to access the file.

What is this 99.999% number that gets bandied about? Here are some numbers that give the reality behind the high percentage commonly referred to as “five-nines”.

Figure 1 shows a table that helps to explain the amount of Downtime represented by various percentages of Uptime, assuming a 365-day year.

[Table: Downtime (as HH:MM:SS) per day, per week, per month and per year, by Uptime percentage]

Figure 1

So, given sixty seconds in a minute, sixty minutes in an hour, twenty-four hours in a day, and 365 days in a year (let’s keep this simple, shall we?), we arrive at a total of 31,536,000 seconds in a year. That means “five-nines” of Uptime is the same as 0.001% Downtime, which is 315.36 seconds, or approximately five minutes fifteen seconds (plus a fraction). The table above simply lists this in the HH:MM:SS format, so you see 00:05:15. That means your system needs to be available for all but slightly more than five minutes of the entire year in order to meet 99.999% uptime.
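The arithmetic above is easy to reproduce for other percentages and periods. The following is a minimal sketch, not anything that Operations Manager provides: the function names are made up for illustration, and a month is assumed to be one twelfth of a 365-day year.

```python
from decimal import Decimal

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Period lengths in seconds. Assumption: a month is 1/12 of a 365-day year.
PERIODS = {
    "day": SECONDS_PER_DAY,
    "week": 7 * SECONDS_PER_DAY,
    "month": 365 * SECONDS_PER_DAY // 12,
    "year": 365 * SECONDS_PER_DAY,
}


def allowed_downtime_seconds(uptime_percent, period_seconds):
    """Seconds of Downtime permitted by an Uptime percentage over a period.

    uptime_percent is passed as a string (e.g. "99.999") so that Decimal
    can represent it exactly.
    """
    downtime_percent = Decimal(100) - Decimal(uptime_percent)
    return Decimal(period_seconds) * downtime_percent / Decimal(100)


def as_hhmmss(seconds):
    """Format a number of seconds as HH:MM:SS, dropping any fraction."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def main():
    # Print one row per Uptime percentage, one column per period.
    print("Uptime    " + "  ".join(f"per {name:<6}" for name in PERIODS))
    for uptime in ("99", "99.9", "99.99", "99.999"):
        cells = "  ".join(
            f"{as_hhmmss(allowed_downtime_seconds(uptime, length)):<10}"
            for length in PERIODS.values()
        )
        print(f"{uptime + '%':<10}{cells}")


if __name__ == "__main__":
    main()
```

For “five-nines” over a year, this gives 315.36 seconds, which formats as 00:05:15.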

Impossible? Probably not. But is it practical to think that your system is truly available for all of that time? For example, when it comes to database servers, I could leave the server up and running for a year, never reboot the thing, but how do I know that the instance has been accessible throughout? How exactly do you measure periods of uptime and downtime in your shop?

Operations Manager Availability Reporting

Assuming you have been able to define what it means for your servers to be “up” or “down”, you must be able to report those metrics periodically. When I give presentations on Operations Manager I am often asked about reporting. I am usually embarrassed to tell people that I do not use the reporting functionality in Operations Manager. In my present position I have never been asked to provide regular reports. This is quite a relief as I have always found the reports in Operations Manager to be difficult to work with.

I decided to use Operations Manager to report Uptime so as to become familiar with the reporting function and to get an idea of how well the systems are performing.

To access the Operations Manager reports, you need to open up the Operations Manager console and navigate to the Reporting tab. Select the ‘Microsoft Generic Reporting Library’ on the left, then on the right double-click to open the ‘Availability’ report (Figure 2).


Figure 2


You will then be presented with the report parameters screen (Figure 3). Here is the first problem I always encounter whenever running a report in Operations Manager: what object is my target? If you select the ‘Add Object’ button you will be presented with a myriad of objects, but in this particular example we are looking to focus on the database engine. It could, however, be the case that you want both the database engine and the SQL Server Agent to be represented in your availability report. You are given a lot of options to choose from, and there are advantages and drawbacks to this. Once you become familiar with the reports it becomes easier to understand, but until then, it can be daunting.

I found that, when running reports from this library, I either need to manually select the database engine target(s), or select a group of SQL Servers that I want to run the report against. Note that I am focusing on the database engine and nothing else, as I am only concerned with issues that affect the engine. Against a single engine, the report generally takes a few seconds to run. If you decide to run it against several servers, expect it to take longer.


Figure 3


Figure 4

If you do not want to select objects or groups manually for this type of report, then go back and start the process from the Database State view on the Monitoring tab. Select the server name you want and, on the right-hand side, you should see a list of available reports that can be run against the database engine (Figure 4). When you run the ‘Availability’ report you will notice that the object target is selected for you. At this point you can add or remove objects or groups if you need to. I prefer this method for executing the reports because it requires fewer clicks to be up and running against an individual server.


Figure 5

The time parameter selections shown in Figure 5 are sufficient, as many quick selections are available. But the results can be deceiving. For example, the total number of minutes shown as ‘Uptime’ might list the amount of time possible rather than the actual amount of uptime. If your server was built one month ago, but you run a default report covering the past year, you will see the total number of available minutes for the year as your possible uptime (Figure 7).

The options listed on the far right in Figure 3 are interesting. They are:

  • Warning
  • Monitoring unavailable
  • Planned maintenance
  • Unplanned maintenance
  • Monitor disabled
  • Unmonitored

Be careful here, because you are specifying your criteria for Uptime or Downtime. The default choice is to enable only the option for ‘Unplanned Maintenance’, which may not be your only criterion for Downtime. My own choice would be to select every option except for ‘Warning’ as my set of criteria for Downtime.
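To see how much these checkboxes can matter, here is a small sketch that computes an Uptime percentage from a list of state intervals, given a chosen set of Downtime states. It is only an illustration: the `Interval` type, the `uptime_percent` function and the sample minutes are all hypothetical, and this is not how Operations Manager itself calculates the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    state: str    # one of the states offered in the report parameters
    minutes: int  # how long the object spent in that state


# The report default: only unplanned maintenance counts as Downtime.
DEFAULT_DOWNTIME = {"Unplanned maintenance"}

# The stricter choice described above: everything except 'Warning'.
STRICT_DOWNTIME = {
    "Monitoring unavailable",
    "Planned maintenance",
    "Unplanned maintenance",
    "Monitor disabled",
    "Unmonitored",
}


def uptime_percent(intervals, downtime_states):
    """Percentage of the total time not spent in a Downtime state."""
    total = sum(i.minutes for i in intervals)
    if total == 0:
        raise ValueError("no time recorded in the reporting window")
    down = sum(i.minutes for i in intervals if i.state in downtime_states)
    return 100 * (total - down) / total


# Hypothetical history for one database engine (10,000 minutes in total).
sample = [
    Interval("Healthy", 9000),
    Interval("Warning", 300),
    Interval("Unplanned maintenance", 100),
    Interval("Planned maintenance", 200),
    Interval("Unmonitored", 400),
]

if __name__ == "__main__":
    print(f"Default criteria: {uptime_percent(sample, DEFAULT_DOWNTIME):.1f}%")
    print(f"Strict criteria:  {uptime_percent(sample, STRICT_DOWNTIME):.1f}%")
```

With this made-up history, the default criteria report 99.0% Uptime, while the stricter criteria report 93.0% for exactly the same period.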

And how does Operations Manager determine whether maintenance is ‘Planned’ or ‘Unplanned’? Well, you specify that whenever you set a server to be in ‘Maintenance Mode’. There is a checkbox for you to enable (Figure 6), and this is how Operations Manager can later report whether the maintenance was planned or unplanned.


Figure 6

The way I define Downtime may be stricter than most. I understand that the Operations Manager agent may have been unavailable, but the database instance was available to handle requests. Nonetheless, I expect to have more than just the instance available; I expect to have the Operations Manager agent up and running as well. Some people tell me that planned maintenance should not count as downtime, but I disagree.


Figure 7

The above report was run against one database engine, for a period of time including the entire previous year, and with the default ‘Unplanned Maintenance’ checkbox enabled. These settings could lead to the conclusion that the instance was up 100% of the time, a span of 9,528 minutes. In reality, out of the 9,528 minutes, the server was only running for a fraction of the time, which can be seen from the little green bar all the way to the right of what is labeled ‘Availability Tracker’.

You need to click on the Availability Tracker and drill into the report to understand fully what the report is trying to tell you. In this case the report will tell me that the server has only been running since last month. Now, look at the difference if I select all options other than ‘Warning’ (Figure 8).


Figure 8

That is a big difference in uptime, isn’t it? The Availability Tracker tells me that the server has not been monitored for most of 2008, so if I decide to include that as downtime then my report output is much different. Essentially, Operations Manager thinks the server was up for 2008, but the agent was disabled. It would be nice if there were a way for Operations Manager to know when the agent was installed, in order to avoid this reporting issue.
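One way to reason about this issue outside the tool is to clip the reporting window to the date that monitoring actually began, and only measure Uptime from that point. The sketch below shows the idea. The function name and the dates are hypothetical, and you would need to supply the monitoring start date yourself, since the report does not know it.

```python
from datetime import datetime


def effective_window_minutes(report_start, report_end, monitoring_start):
    """Minutes in the report window during which the object could be monitored.

    The window is clipped so that it never begins before monitoring_start.
    """
    start = max(report_start, monitoring_start)
    if start >= report_end:
        return 0
    return int((report_end - start).total_seconds() // 60)


if __name__ == "__main__":
    # Hypothetical dates: a report over all of 2008 for a server whose
    # agent was only installed on 1 December 2008.
    report_start = datetime(2008, 1, 1)
    report_end = datetime(2009, 1, 1)
    agent_installed = datetime(2008, 12, 1)

    full = effective_window_minutes(report_start, report_end, report_start)
    clipped = effective_window_minutes(report_start, report_end, agent_installed)
    print(f"Minutes in the full report window: {full:,}")
    print(f"Minutes since the agent was installed: {clipped:,}")
```

Measuring Downtime against the clipped window avoids both extremes: the unmonitored months are neither credited as Uptime nor charged as Downtime.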

Figure 4 also shows a few of the other reports available and there are many more to choose from. I would encourage you to experiment with the reports and play with a lot of the options. One report that I want to point out is the Health report, as it ties into the Availability report.


Figure 9

The Health report is intended to display the ‘Entity Health’. What’s that you say? Is that not part of ‘Availability’ for your instance? Yes it is. Actually the ‘Availability’ report rolls up into the ‘Entity Health’, so be mindful of this (Figure 9). When you are running the ‘Availability’ report, you are running a report that only checks the entities that are currently a part of the Availability aggregate monitor for the database engine. The ‘Entity Health’ encompasses a handful of aggregate monitors, which means you could have very different results.

So, how best to decide which one to use? Well, it all depends on how you have defined Uptime (and, conversely, Downtime). If you want to include everything that could possibly affect your instance then it may be better for you to examine the Health report. You might, for example, even consider your server to be ‘down’ if it was not compliant with the service pack compliance monitor.
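The difference between the two reports is easier to see with a toy model of the rollup. The sketch below assumes that ‘Entity Health’ takes the worst state of its aggregate monitors, and that those monitors are Availability, Configuration, Performance and Security. Treat both the rollup policy and the monitor states as assumptions made for illustration, and check them against your own management packs.

```python
from enum import IntEnum


class Health(IntEnum):
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


def worst_of(states):
    """Roll up child monitor states by taking the worst one (assumed policy)."""
    return max(states, default=Health.HEALTHY)


# Hypothetical states for one database engine. The engine itself is
# available, but a configuration monitor (for example, service pack
# compliance) is unhealthy.
aggregate_monitors = {
    "Availability": Health.HEALTHY,
    "Configuration": Health.CRITICAL,
    "Performance": Health.HEALTHY,
    "Security": Health.HEALTHY,
}

availability = aggregate_monitors["Availability"]
entity_health = worst_of(aggregate_monitors.values())

if __name__ == "__main__":
    print(f"Availability:  {availability.name}")
    print(f"Entity Health: {entity_health.name}")
```

In this made-up case an Availability report would count the period as Uptime, while a Health report over the same period would not.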


If you go looking to find your true Uptime, you should:

  1. Ensure that you have properly defined what Uptime (or Downtime) means in your enterprise.
  2. Examine the options that are available in the Operations Manager reports and decide what works best for you.
  3. Let the reports run and see how your shop fares against the mythical “five-nines”.

I don’t think that you should be discouraged if you come up short, as most servers need more than five minutes of love per year.

Operations Manager offers a myriad of reports. The information is there, but reporting can be difficult until you understand all the different options, the possible parameters, and the way the Operations Manager health model is structured. It takes time to run reports with various options in order to understand exactly what the report is telling you. To make things easier on yourself, start your reports from the Database State view, or any of the other views in the Monitoring tab. This way you only see the reports that are relevant for the objects you are viewing, and the object selection in the report parameters is filled in for you. From there it is easier to add or remove objects because you have an example to work with.
