Questions to help you choose the right availability protection for your applications.
Prevention of unplanned downtime is a growing concern in today’s always-on world. You know you need a way to keep critical applications up and running, but with so many options on the market, how can you determine which availability solution is right for your organization? This guide presents a series of questions you should ask vendors when evaluating solutions to protect your applications against costly downtime.
It highlights key considerations and provides valuable insights into the strengths and limitations of various availability approaches. Vendors’ responses to these questions will enable you to compare solutions and identify those that best meet your availability requirements, recovery time objectives, IT management capabilities, and return on investment goals while integrating seamlessly within your existing IT infrastructure.
What level of uninterrupted application processing can your solution guarantee?
There are a variety of availability solutions on the market today, each of which delivers a different level of application uptime. When evaluating solutions, it is helpful to ask vendors how many “nines” of availability their offerings provide — a figure that represents the average amount of uptime their customers should expect per year. This is an important first step in determining which solution best meets your organization’s specific requirements.
If your availability requirements are relatively low, you may be able to get by using a standard server with duplicate internal components. These servers typically deliver two nines — 99% — or more of availability for the applications running on them, which can result in as much as 87.6 hours of unplanned downtime per year. Continuous data replication delivers three nines — 99.9% availability — which equates to 8 hours and 45 minutes of downtime annually.
For those with more rigorous availability requirements, traditional high-availability clusters, which link two or more physical servers in a single, fault-resilient network, get you to 99.95% availability or 4.38 hours of downtime per year. Virtualized high availability software solutions deliver four nines of availability — 99.99% — which reduces unplanned downtime to 53 minutes per annum. Fault-tolerant solutions are often described as providing continuous availability because they are designed to prevent downtime from happening in the first place.
Fault-tolerant software and hardware solutions provide at least five nines of availability — 99.999+% — limiting unplanned downtime to between roughly two and a half and five and a quarter minutes per year. While fault-tolerant hardware and software solutions both provide extremely high levels of availability, there is a trade-off: fault-tolerant servers achieve high availability with minimal system overhead, delivering superior performance, while fault-tolerant software can run on industry-standard servers your organization may already have in place.
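The downtime figures quoted for each tier above follow directly from the availability percentage. As a rough illustration (the function and tier labels below are hypothetical, not part of any vendor's tooling), the annual downtime implied by a given number of nines can be computed like this:

```python
# Annual unplanned downtime implied by an availability percentage ("nines").
# Illustrative sketch only; tier labels mirror the categories in the text.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year

def annual_downtime_hours(availability_pct: float) -> float:
    """Hours of unplanned downtime per year for a given availability %."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

tiers = {
    "Standalone server (two nines)":        99.0,
    "Continuous replication (three nines)": 99.9,
    "High-availability cluster":            99.95,
    "Virtualized HA software (four nines)": 99.99,
    "Fault tolerant (five nines)":          99.999,
}

for name, pct in tiers.items():
    hours = annual_downtime_hours(pct)
    if hours >= 1:
        print(f"{name}: {pct}% -> {hours:.2f} hours/year")
    else:
        print(f"{name}: {pct}% -> {hours * 60:.1f} minutes/year")
```

Running this reproduces the figures in the text: 99% yields 87.6 hours per year, 99.9% about 8.76 hours (8 hours 45 minutes), 99.95% about 4.38 hours, 99.99% about 53 minutes, and 99.999% about 5.3 minutes.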
In the event of a server failure, what is the process to restore applications to normal processing operation and how long does it take?
With most availability solutions, there will be some level of system interruption in the event of a server outage. Therefore, when evaluating solutions, it is important to understand what is involved in restoring applications to normal operations and how long the process takes. If you rely on standalone servers, your recovery time could range from minutes to days given the high level of human interaction required to restore the applications and data from backup — provided you’ve been backing up your system on a regular basis. With high availability clusters, processing is interrupted during a server outage and recovery can take from minutes to hours depending on how long it takes to check file integrity, roll back databases, and replay transaction logs once availability is restored.
If the cluster was sized correctly during the initial planning stages, users should not experience slower application performance while the faulty server is out of operation; they may, however, need to rerun some transactions using a journal file once normal processing resumes.
Fault-tolerant solutions proactively prevent downtime with fully replicated components that eliminate any single point of failure. Some platforms automatically manage their replicated components, executing all processing in lockstep. Because replicated components perform the same instructions at the same time, there is zero interruption in processing — even if a component fails. This means that, unlike a standalone server or high availability cluster, the fault-tolerant solution keeps on functioning while any issue is being resolved.
How does your solution protect against loss of in-flight data?
When a system outage occurs, all data and transactions not yet written to disk are at risk of being lost or corrupted. In the case of some applications, this risk may be tolerable. But when you consider systems that automate functions like building automation and security, public safety, financial transactions, or manufacturing processes, the loss of in-flight data can have serious consequences ranging from a scrapped batch or lost revenue to compliance issues or even loss of life.
Many availability solutions are not designed to ensure transaction and data integrity in the event of a system failure. Depending on how the hardware is configured, standalone servers and high availability clusters can typically preserve the integrity of database transactions, but any in-memory data not yet written to disk will be lost upon failure. Fault-tolerant solutions are built from the ground up to provide higher levels of data integrity. Fully replicated hardware components and mirrored memory ensure that all in-flight transactions are preserved — even when a hardware component fails.
Written by Dionaro (Dion) Orcullo
For more information, please contact firstname.lastname@example.org or call 02-679-2233