The One Way That High Availability Will Help You

High Availability (HA) is a term that is beloved of the marketing people, with its connotations of an unspecific sense of reassurance. Service reliability, however, cannot be bought like bath salts. Still, explains a seasoned and cynical expert in the field, HA can be more than the start of 'HAHA!'. There is a role for HA in keeping your services running, if used as part of a broader solution.

After eviscerating high availability over not one, not two but three articles, I’ve probably left you wondering if there’s anything positive that high availability can do for you. Well, you’ve waited so patiently, so it gives me great pleasure to confirm that, fortunately, there is something good about high availability. While you might be expecting the article to keep with the theme I’ve established thus far, and be titled something like “Seven Things that High Availability Will Help You With”, in truth there really aren’t seven distinct things that HA is good for. Instead, you can read about the one thing that it’s good for, seven times.

I should give fair warning that the startling conclusion to this epic saga may be obvious to many of you. Indeed, I hope it is, because that way I can sleep soundly, knowing that the networks of the world are patrolled by savvy administrators. However, that doesn’t mean that it doesn’t bear pointing out, or that there mightn’t be a twist you hadn’t considered. Read on, intrepid SysAdmins, and have your hunches confirmed.

Definition and Exposition

First, let’s make sure that we all understand what high availability actually is. HA is any implementation of a set of hardware or software features that protects a specific service, allowing for some part of the system that the service is running on to fail, while still allowing access to that service. Perhaps that description, in spite of being a run-on sentence, is an oversimplification. There are plenty of nuances, caveats and exceptions that will muddy the waters in some HA implementations. However, generally speaking, that’s a fair summation of high availability.

With that in mind, the one thing that HA systems are good for should be self-evident: keeping a very specific service protected from unavailability as a result of a very specific set of system failures. Yes, there is a lot of emphasis on specificity. If it sounds to you like HA is only good for a virtual knife edge of scenarios, then you’re thinking correctly. Indeed, there are many different methods of achieving high availability on the market, including application HA, hardware HA and site HA among the more popular kinds. Regardless, each kind of HA has a minuscule list of disasters that it can protect against.

The concept of “what HA will protect you from” is probably not much different from an emergency plan that you may have in case of a natural (or unnatural) disaster. Where I grew up, earthquakes and volcanic eruptions were significant concerns, and my family had an emergency plan in case of a sudden disaster. We did our best to practice it and keep it ready, but we obviously all hoped we’d never have to use it. Of course, the plans we had would have proved virtually useless in the event of a different disaster, like a flood or house fire. In the same way, HA is merely a first line of defense against a specific set of worst-case scenarios, and will prove wholly inadequate for a separate set of disasters.

And Now For An Example

Just recently I had an up-close-and-personal encounter with the narrow scope of systems protection. I’m designing a small datacenter deployment that needs a highly available firewall solution, and one of my better design proposals involves a pair of Active/Passive ASA 5510s. I began to evaluate exactly what disasters that design protected me from, and was saddened at the shockingly small list – I’m only protected from a hardware fault in the primary ASA. No upstream network redundancy. No site redundancy. Just plain, ol’ lemon protection in case an ASA bites the dust.

It might be obvious to many (I certainly hope it is), and indeed I wasn’t surprised, but it is a good reality check to explicitly list out all of the potential disasters that lurk in the shadows, followed by a list of what (if anything) you’re protected from with your current HA systems. The disparity between the number of items on the first list and the number of items on the second will determine your resiliency (as well as your ability to sleep soundly at night).
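The two-list reality check can be sketched in a few lines of Python. This is only an illustration, not a real risk model: the disaster names are hypothetical placeholders loosely based on the firewall example above, and the helper functions `coverage_gap` and `coverage_ratio` are invented for this sketch.

```python
# Hypothetical inventory: these failure names are illustrative, not exhaustive.
potential_disasters = {
    "primary firewall hardware fault",
    "upstream network outage",
    "site loss",
    "bad configuration change",
}

# What an Active/Passive firewall pair covers in the example above (assumed).
covered_by_ha = {"primary firewall hardware fault"}


def coverage_gap(disasters, covered):
    """Return the disasters that no HA mechanism protects against, sorted."""
    return sorted(disasters - covered)


def coverage_ratio(disasters, covered):
    """Fraction of listed disasters that are covered (0.0 for an empty list)."""
    if not disasters:
        return 0.0
    return len(disasters & covered) / len(disasters)


gap = coverage_gap(potential_disasters, covered_by_ha)
ratio = coverage_ratio(potential_disasters, covered_by_ha)
print(f"Covered {ratio:.0%} of listed disasters; unprotected: {gap}")
```

The value of the exercise is in the first list rather than the arithmetic: the ratio is only as honest as the inventory of disasters you were willing to write down.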

Grudging Admittance

“Is that it?!” you might be thinking. “Surely there must be more that high availability is good for!” Okay, maybe there are a few other things that I’m willing to admit HA is good for. First, it’s a good time-buyer. Remember that if one component dies and the failover steps in, you haven’t exactly avoided a disaster; you are currently living in one. You will not be out of that disaster until that failed node is repaired and brought back into the cluster. However, the failover component has bought you some valuable time to bring the situation back to normal. Many a grievous mistake has been made when a service is completely unavailable and a replacement needs to be ordered and implemented in mere hours. HA allows you some calmer moments to make more perspicacious decisions.

High availability could also be a useful marketing tool. HA looks good to customers and management. Before I go any further, know that I’m most certainly not advocating false bravado. However, if you’ve implemented an impressive high availability system, then you should tell people about it. If your HA system faces inwards relative to your company, then make sure that upper management knows about it. Executives love to brag to other executives about the gadgetry that makes their company run, so you can play a small part in developing the positive image of your company (just make sure that the image is warranted).

Equally, if the HA system in question directly faces your customers, then tell them about it! It can increase customer confidence and potentially increase sales. Every little bit, as long as it’s truthful, can count in the mystical art of customer conversion rates.

In Conclusion

If your reading speed is in the average range of an adult (250 to 300 WPM) then it’s only taken you four or five minutes to power through this article. Twenty minutes if you’re multitasking with salesmen on the phone and a flapping Nagios monitor victimizing your cell phone. Hopefully nothing I said was terribly groundbreaking for you. If it was, not to worry! We all learn new things every day. If it wasn’t new, then consider yourself refreshed and ready to re-assess your HA efforts.

At the end of the matter, HA is just the tip of a very large technological iceberg. More like a single ice crystal in a snowflake in a drift on the south end of B-15. HA in its most proper form is built on top of excellent change management, stellar lifecycle management and a magnificent supporting infrastructure (meaning not having an old DSL line as your web connection for a monster Hadoop cluster). However, if kept in proper perspective, high availability can be a very effective tool in keeping your services running and your business profitable.

See? It’s not all blood and gore! I managed to (somehow) end this series on a positive note, which, given the amount of marketing hyperbole I’ve been bombarded with in my day job, is something I’m rather proud of. High availability does have a valuable role. A very tightly scoped role, but a valuable one nonetheless. Just make sure that you’re intimately acquainted with what few scenarios are covered by that scope. After all, the daemon is in the details.