Two of Your Nines Don’t Need Five Nines

There’s a pretty good chance that you have a number of environments tagged with a five nines availability requirement that don’t actually need it. We often get confused about availability and the true cost of keeping application environments up at the ever-elusive 99.999% uptime.

Breaking Down the Nines

It is eye-opening to break down the downtime each level of nines actually allows, in minutes per year, as listed on Wikipedia (https://en.wikipedia.org/wiki/High_availability):

[Image: table of availability levels ("nines") and the downtime each allows per year]

So, when we think about the cost of maintaining an uptime level for the application, it becomes important to see the real numbers in relation to availability. The cost of achieving 99.999% versus 99% is significant.
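To make those numbers concrete, here is a minimal sketch of the arithmetic behind each level of nines. It assumes a 365-day year (no leap day), and the function name is made up for illustration.

```python
# Minutes in a non-leap year: 365 days * 24 hours * 60 minutes = 525,600.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes per year a system may be down at this availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Walk from two nines up to five nines.
for level in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(level)
    print(f"{level}% -> {minutes:,.1f} minutes/year ({minutes / 60:,.1f} hours)")
```

Two nines works out to about 87.6 hours a year, while five nines leaves a little over 5 minutes.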

Five Nines, Nine to Five

This is the real situation that many of us need to deal with. While we talk about the criticality of an application environment, it’s often critical only when it’s in active use. Most folks would look at a straight 99 percent uptime, with its roughly 87 hours of allowed downtime a year, and balk at it as a suggested availability. Here’s the catch, though. What we are really looking for is five nines availability, but only during access hours. Many, if not most, of our internal business applications are only accessed during the day, inside office hours.

Even if we span time zones, the reality is that we aren’t using the applications for a decent part of the day. Assuming that your application needs to cover time zones that span a continent, you probably need to cover a 10 hour day with a maximum of a 5 hour variance, which leaves 9 hours a day when it is not needed for primary use. At 9 hours a day across 365 days, that means you can effectively sustain 3,285 hours…yes hours…of downtime a year. That means 197,100 minutes.
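The off-hours arithmetic can be sketched the same way. This is a rough illustration, not a sizing tool. It assumes the 10 hour day plus 5 hour variance described above and treats every day of the year as a working day. Counting only working days, or adding whole weekends, changes the total. The function names are made up for illustration.

```python
HOURS_PER_DAY = 24


def off_hours_per_year(primary_hours_per_day: float, days_per_year: int = 365) -> float:
    """Hours per year that fall outside the primary use window."""
    return (HOURS_PER_DAY - primary_hours_per_day) * days_per_year


def in_window_budget_minutes(
    primary_hours_per_day: float,
    days_per_year: int = 365,
    availability_percent: float = 99.999,
) -> float:
    """Downtime minutes allowed per year when availability is measured
    only against the primary use window."""
    primary_minutes = primary_hours_per_day * days_per_year * 60
    return primary_minutes * (1 - availability_percent / 100)


# Assumption from the text: a 10 hour day plus up to 5 hours of time zone variance.
primary = 10 + 5
idle_hours = off_hours_per_year(primary)
print(f"Off-hours per year: {idle_hours:,.0f} hours ({idle_hours * 60:,.0f} minutes)")
print(f"Five nines budget inside the window: {in_window_budget_minutes(primary):.2f} minutes")
```

The second function is the other half of the picture: the five nines target still applies, but only to the hours that people are actually using the application.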

Does this mean we can shut them off? Well, not quite. Let’s talk about why.

The Highly Used Unused Application

Applications are considered active during a certain window, which I refer to as primary use. There is also a set of non-primary use processes that run outside that window.

Backing up the environment is a good example of this. Backups tend to run off hours so as not to collide with primary use inside business hours. Security scans and other safety practices also take place outside of core hours to help with keeping application performance more stable during primary use hours.

Because of this, we still have requirements to keep the application platforms available to run these other operational processes.

Scale-back Versus Shut Down

As your application environments are being architected or refactored, it is good to think about the importance of a microservices approach and why it can help with this issue. I know that there are assumptions around the fact that we are choosing when system availability occurs, but the important part of this discussion is that you may be paying for a surprising amount of guaranteed availability on systems that don’t need it.

We can’t simply power off servers much of the time because of the backups, security scans, and other non-primary use access. What we can do is use a scale-out and scale-back approach.

Web applications may need the back-end to be continuously available but at different levels of usage. During off hours, why not have fewer front-end servers? Data layers can stay up, but can also be scaled down.
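As a minimal sketch of what scale-back (rather than shut down) could look like, the function below picks a front-end server count from the time of day. The window hours and server counts are placeholders, not recommendations, and the function is hypothetical rather than part of any particular autoscaling product.

```python
from datetime import datetime

# Hypothetical values for illustration only; tune these to your own environment.
PRIMARY_START_HOUR = 7     # primary use window opens (local time)
PRIMARY_END_HOUR = 22      # primary use window closes (7 + 15 hours)
PEAK_FRONT_ENDS = 6        # front-end servers during primary use
OFF_HOURS_FRONT_ENDS = 2   # never zero: backups and scans still need the platform


def desired_front_ends(now: datetime) -> int:
    """Return how many front-end servers should run at the given time."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    in_primary_window = PRIMARY_START_HOUR <= now.hour < PRIMARY_END_HOUR
    if is_weekday and in_primary_window:
        return PEAK_FRONT_ENDS
    return OFF_HOURS_FRONT_ENDS


print(desired_front_ends(datetime.now()))
```

A scheduler or autoscaler could call something like this every few minutes. The point is that the floor stays above zero so non-primary use processes keep working.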

Some applications, like file servers and others with less variable use, will not do well in scale-up/scale-down scenarios. That’s OK. We have to accept that hybrid approaches are needed across all areas of IT.

Why is This Important?

Think ahead. When the architecture is being evaluated for production and disaster recovery, we should be thinking about primary use and availability as well as the non-primary use functions like data protection.

All of a sudden, those buzzwords like microservices and containers with infrastructure as code seem to make some sense. Should you be racing to refactor all of your apps? No. Should you be continuously evaluating the environment? Yes.

Most importantly, be aware of the true cost of the five nines and whether you really need it for all of your applications.
