Step up to Red Alert!

“Step up to Red Alert!”
“Sir, are you absolutely sure? It does mean changing the bulb…”
                                                                — Red Dwarf 

Computers are stupid. This is a well-established fact, which has been en-harped on by much greater and more perspicacious authorities than I. They, that is, computers, tend to blindly and relentlessly do what they’re told. Given conflicting, asymptotic or just plain dumb instructions the machine will blindly and relentlessly attempt to execute said instructions with blindingly relentless blindness, and without relenting.

But despite this we have come to rely on them utterly. Many centuries ago, a lone imbecile might work away in his hut for many, many years, perchance to dream of such levels of productive, innovative dullardry. Even the Industrial Revolution didn’t come close, even having conceived the idea that if one were to shepherd right-thinking imbeciles together from a large area into a sort of pen, or congregation, then their productivity would greatly increase. Certainly the noise level and the smell increased, and these were signs that the humours were ticking over nicely, thereby giving the industrialist a warm feeling of success and busy-ness, née business. But, alack, no cluster of imbeciles savant, however skilled, can replace a computer’s ability for quickly and unthinkingly doing as instructed, even if it takes said machine into the valley, so to speak, of the shadow of death. Wherein lie monstrous hazards known by obscure, demonic names: O’exate’oofor’ooofyve, Ess’teegee’eee’filenotfound, Bee’ess’o’dy.

Which almost brings me to my point, and although I hesitate to arrive so alacritously having only taken a more or less single and relatively brief tangent, this may be universally judged to be a good thing. Computers are stupid, and given the chance will rush headlong into the abyss without so much as a by-your-leave. And since they’re capable of making mistakes at such high velocity (bing-bing-bing-bing, 3.5 billion mistakes per second without touching the sides) it can be a hard task to keep them on the straight and narrow. To shepherd them, as it were, through the valley. To catch the cogs flying from the difference engine before they decapitate an innocent passer-by.

There are many professionals whose job it is to do just that, for SQL Servers. And it can be a difficult and thankless task. DBAs, and other professions sharing similar responsibilities, have to make sure that the SQL Servers in their charge are rattling along in maximum Maytag mode without killing anyone and without too many bits falling off in the process. And this, usually, “at the same time” as they are performing their other duties, which already consume a full working day’s worth of effort.

SQL Server itself isn’t much help. It’s quite happy to answer questions as to how well it’s doing in, say, executing overnight jobs which were backing up important databases. But, for preference, it would rather play 20 questions with you (“Did this job go well?”, “No”, “Was it a backup?”, “Yes”, “Animal, vegetable or mineral?”) than give you a quick summary of what went well, or more importantly what went badly, in the last N hours. And should you be faced with a small room of these things, let alone an enterprise, then your chances of extracting some sense from the ensemble each morning, just to make sure that no servers caught fire overnight and that all your backups actually took place, are drastically reduced.

Enter SQL Response. A new tool from Red Gate, and a continued foray into the area of tools focused towards DBAs, SQL Response is part watchman, part messenger and part coroner; part missile warning system and part CSI investigation team.

Installed in a convenient nook from whence it can monitor nearby SQL Servers, SQL Response keeps a watchful eye on your SQL Servers looking for problems. Problems, as far as SQL Response is concerned, are things which are really likely to spoil your day as a DBA. A SQL Server which appears to be down is one obvious candidate. Jobs which have failed are another. Yet another is a disk drive looking very full. A blocking or deadlocked process. A job which worked, but took longer than expected. And so on.

Being a Red Gate tool, SQL Response doesn’t just cover the obvious cases. How about a job which missed its run because the SQL Agent is disabled? Or which should have run, but couldn’t because it’s scheduled to run every half hour and the previous run took 31 minutes? If a disk drive is full, then is it the system (Windows) drive, or a drive used by a SQL Server database? Is that SQL Server down, or did the login just fail? And is the Windows box itself down?
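To make the half-hour example concrete, here is a minimal sketch of how an overlapping schedule swallows runs. It is an illustration only, not how SQL Response or the SQL Agent works internally: the function name `missed_runs` and its parameters are hypothetical, and it assumes a run that falls due while the previous one is still going is skipped rather than queued.

```python
from datetime import datetime, timedelta


def missed_runs(first_start, interval, run_duration, horizon):
    """Return the scheduled start times that never get to run.

    Assumption (hypothetical model): a run that falls due while the
    previous run is still in progress is skipped, not queued.
    """
    missed = []
    busy_until = first_start      # moment the run in progress finishes
    due = first_start             # next scheduled start time
    end = first_start + horizon   # stop looking after this point
    while due < end:
        if due < busy_until:
            missed.append(due)                # previous run still going
        else:
            busy_until = due + run_duration   # this run starts on time
        due += interval
    return missed


if __name__ == "__main__":
    midnight = datetime(2008, 1, 1, 0, 0)
    skipped = missed_runs(
        first_start=midnight,
        interval=timedelta(minutes=30),
        run_duration=timedelta(minutes=31),
        horizon=timedelta(hours=2),
    )
    for slot in skipped:
        print("missed:", slot.strftime("%H:%M"))
```

Under that assumption, a single extra minute of run time costs every other slot: over two hours, the 00:30 and 01:30 runs are the ones that go missing.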

More than just alerting you to the fact that the problem has occurred, SQL Response tries to help you find out why the problem occurred by looking at the “scene of the crime”. So whilst it’s firing off an e-mail to you telling you about the problem, it’s simultaneously gathering up information on SQL performance, SQL connections and running Windows processes. Even, if requested, the SQL code running around the time the problem occurred. SQL Response captures as much of this information as seems appropriate, together with as much information as possible on the problem itself, and files this bundle of fun as an “incident report”. These incidents are available from within the SQL Response client application. If you have the client and your e-mail application installed on the same machine, you can even click the link inside all SQL Response e-mail alerts to jump straight into the application and view the incident report.

Once inside SQL Response you can set about diagnosing the problem. Some problems have a simple remedy inside SQL Response itself – a failed job is easily restarted. Most problems, of course, require the seasoned eye of a DBA or other professional to diagnose. SQL Response provides all the information it can, like any good investigator, leaving the professional to work their Holmesian powers of deduction with data at their fingertips.

SQL Response can also make recommendations. If your databases are in need of backing up, reindexing, or otherwise maintaining in key ways, SQL Response will do its best to point this out.

Of course, it’s all very well having a sturdy watchman nearby who will sound alarm bells when things go wrong. But there is such a thing as crying wolf. Computers being what they are, things go wrong all the time, and one doesn’t necessarily want some town crier throwback heckling one constantly about every little niggle. If you have a job which runs every 5 minutes, and which starts failing at 2am whilst you’re happily convalescing, do you really want 72 “Job failed” e-mails sitting primly in your inbox when you arrive in the morning? And continuing to arrive, bing-bing-bing-bing, every 5 minutes until you fix the problem?

So, naturally, SQL Response will allow you to ignore future incidents of a particular type, either on a particular server or just generally. But more importantly, even when its monitoring is full steam ahead, it doesn’t spam e-mails at you constantly. In any 24-hour period, SQL Response will send one e-mail when a particular problem first arises (say when job A on server B goes wrong), and another if it happens again. Since deathly silence isn’t necessarily a vast improvement over crying wolf, SQL Response will continue to raise incidents in its own client UI, for every failure, collapsed down into a single row to avoid visual clutter. It just stops heckling you about them. Should you care about the 56th job failure more than the first or last, you can easily drill down and inspect the gritty details.
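For the curious, the "two e-mails, then quiet" behaviour can be sketched in a few lines. This is a guess at the general shape, not SQL Response's actual implementation: the class `AlertThrottle`, its `record` method, and the choice to start the 24-hour window at the first occurrence are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class AlertThrottle:
    """Record every incident, but e-mail at most twice per window.

    Hypothetical sketch: the window is assumed to start at the first
    occurrence of a given (server, problem) pair.
    """

    def __init__(self, max_emails=2, window=timedelta(hours=24)):
        self.max_emails = max_emails
        self.window = window
        self._window_start = {}               # key -> start of current window
        self._emails_sent = defaultdict(int)  # key -> e-mails sent this window
        self.incidents = defaultdict(list)    # key -> every occurrence, kept

    def record(self, server, problem, when):
        """Log the incident; return True if an e-mail should go out."""
        key = (server, problem)
        self.incidents[key].append(when)      # always visible in the "UI"
        start = self._window_start.get(key)
        if start is None or when - start >= self.window:
            self._window_start[key] = when    # open a fresh window
            self._emails_sent[key] = 0
        if self._emails_sent[key] < self.max_emails:
            self._emails_sent[key] += 1
            return True
        return False                          # recorded, but no heckling


if __name__ == "__main__":
    throttle = AlertThrottle()
    two_am = datetime(2008, 1, 1, 2, 0)
    # A job that runs every 5 minutes and fails 72 times from 2am onwards.
    emails = sum(
        throttle.record("server B", "job A failed", two_am + timedelta(minutes=5 * i))
        for i in range(72)
    )
    recorded = len(throttle.incidents[("server B", "job A failed")])
    print(emails, "e-mails for", recorded, "recorded failures")
```

Run against the 2am scenario, all 72 failures remain on record for drilling into, while only the first two trouble your inbox.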

However, much as I could easily sit here in a continuing pretence that this unspeakable waffle is a reasonable justification for my paycheck, I suspect the world would be better served through my cessation in doing so. As a closing thought, therefore, if you’re a DBA or other professional who has to worry about the health and general well-being of SQL Servers, why not check out our beta at the link below? We’d love to hear what you have to say.

And perhaps you can avoid having to change the bulb. 

SQL Response 1.0 Beta