Availability Concepts for Networks and Systems

Understand these important factors for computer networks

In computer hardware and software, three qualities are crucial to ensuring that everything works as it should (and continues to do so): availability, reliability, and serviceability. Maximizing these qualities in your system will help you avoid unforeseen problems. Doing so won't prevent problems completely, but when issues do arise, they'll be easier to fix.

What Are Availability, Reliability, and Serviceability?

Availability refers to the overall "uptime" of the system or its specific features. For example, a personal computer is "available" for use if its operating system is booted and running.

While related to availability, the concept of reliability means something different. Reliability refers to the general likelihood of a failure occurring in a running system. A perfectly reliable system will also enjoy 100% availability, but when failures do occur, they can affect availability in different ways depending on the nature of the problem.

Serviceability affects availability as well. In a serviceable system, you can detect and repair failures more quickly than in an unserviceable one, meaning you'll have less downtime per incident on average.
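
The interplay between reliability and serviceability can be sketched with the commonly used steady-state formula, availability = uptime / (uptime + repair time). The terms "mean uptime between failures" and "mean time to repair" and the function name below are assumptions introduced for illustration; the article itself does not define them.

```python
def steady_state_availability(mean_uptime_hours: float, mean_repair_hours: float) -> float:
    """Long-run fraction of time a system is up.

    mean_uptime_hours: average running time between failures (reliability).
    mean_repair_hours: average time to detect and repair a failure (serviceability).
    Both names are illustrative assumptions, not terms from the article.
    """
    if mean_uptime_hours <= 0 or mean_repair_hours < 0:
        raise ValueError("uptime must be positive and repair time non-negative")
    return mean_uptime_hours / (mean_uptime_hours + mean_repair_hours)


# Same reliability (one failure per 1,000 running hours), different serviceability.
slow_repair = steady_state_availability(1000, 10)  # 10 hours to fix each failure
fast_repair = steady_state_availability(1000, 1)   # 1 hour to fix each failure

print(f"slow repair: {slow_repair:.2%}")
print(f"fast repair: {fast_repair:.2%}")
```

With these made-up numbers, cutting the repair time from ten hours to one moves the system from roughly two nines to roughly three nines without any change in how often it fails.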

Availability Levels

The standard way to define levels or classes of availability in a computer network or system is a "scale of nines." For example, 99% uptime translates to two nines of availability, 99.9% uptime to three nines, and so on. The table below illustrates the meaning of this scale. It expresses each level in terms of the maximum amount of downtime per (nonleap) year that could be tolerated while still meeting the uptime requirement. It also lists a few examples of the types of systems that commonly meet these requirements.

Network and System High Availability Levels. Lifewire / Bradley Mitchell
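
The downtime figures behind the scale of nines follow from simple arithmetic: N nines allows a fraction 10^-N of a nonleap year (8,760 hours) to be downtime. The short sketch below does that calculation; the function name is an illustrative assumption.

```python
HOURS_PER_YEAR = 365 * 24  # nonleap year, as in the article


def max_downtime_hours_per_year(nines: int) -> float:
    """Maximum yearly downtime, in hours, allowed at a given number of nines."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return HOURS_PER_YEAR * 10.0 ** -nines


for nines in range(2, 6):
    hours = max_downtime_hours_per_year(nines)
    print(f"{nines} nines: {hours:.4f} hours ({hours * 60:.2f} minutes) per year")
```

Two nines works out to 3.65 days of downtime per year, three nines to 8.76 hours, four nines to 52.56 minutes, and five nines to about 5.26 minutes.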

When talking about availability levels, specify the overall time frame involved (weeks, months, years, etc.); without it, the figure means little. A product that achieves 99.9% uptime over a period of one or more years has proven itself to a much greater degree than one whose availability has only been measured for a few weeks.

Network Availability: An Example

Availability has always been an important characteristic of systems but becomes an even more critical and complex issue on networks. By their nature, network services are commonly distributed across several computers and can depend on various other auxiliary devices as well.

Take the Domain Name System (DNS), for example, which is used on the Internet and many private intranets to map computer names to their network addresses. DNS keeps its index of names and addresses on a server called the primary DNS server. When only a single DNS server exists in a system, a server crash takes down all DNS capability on that network. DNS, however, offers support for distributed servers. Besides the primary server, an administrator can also install secondary and tertiary DNS servers on the network. Now, a failure in any one of the three servers is much less likely to cause a complete loss of DNS service.
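
To see why extra servers help, here is a minimal sketch that estimates the availability of a redundant service. It assumes that server failures are independent of one another, which is a simplification, and the function name and the 99% figure are made up for illustration.

```python
from typing import Iterable


def combined_availability(server_availabilities: Iterable[float]) -> float:
    """Availability of a service that stays up while at least one server is up.

    Assumes independent failures: the service is down only when every
    server is down at the same time.
    """
    probability_all_down = 1.0
    for availability in server_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("each availability must be between 0 and 1")
        probability_all_down *= 1.0 - availability
    return 1.0 - probability_all_down


single = combined_availability([0.99])              # one server at two nines
triple = combined_availability([0.99, 0.99, 0.99])  # primary, secondary, tertiary

print(f"one server:    {single:.6f}")
print(f"three servers: {triple:.6f}")
```

Under the independence assumption, three two-nines servers together reach about six nines. Real deployments usually fall short of that figure because failures are often correlated, for example when all servers share a power source or network link.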

Server crashes aside, other types of network outages also affect DNS availability. Link failures, for example, can effectively take down DNS by making it impossible for clients to communicate with a DNS server. It's not uncommon in these scenarios for some people (depending on their physical location on the network) to lose DNS access while others remain unaffected. Configuring multiple DNS servers also helps to deal with these indirect failures that can impact availability.
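
From the client's side, multiple DNS servers help only if the client moves on to the next server when one is unreachable. The sketch below shows that fallback logic in isolation. The `query` argument and `fake_query` are hypothetical stand-ins for a real lookup, not any actual resolver API, and the addresses come from ranges reserved for documentation.

```python
from typing import Callable, Sequence


def resolve_with_fallback(
    name: str,
    servers: Sequence[str],
    query: Callable[[str, str], str],
) -> str:
    """Ask each DNS server in order and return the first successful answer."""
    last_error = None
    for server in servers:
        try:
            return query(server, name)
        except OSError as error:  # covers timeouts and unreachable servers
            last_error = error
    raise OSError(f"no DNS server could resolve {name!r}") from last_error


# Hypothetical lookup: the first server sits behind a failed link.
def fake_query(server: str, name: str) -> str:
    if server == "192.0.2.1":
        raise TimeoutError("link down")
    return "198.51.100.7"


address = resolve_with_fallback(
    "host.example.com", ["192.0.2.1", "192.0.2.2"], fake_query
)
print(address)
```

A client that can still reach the second server gets an answer despite the link failure. A client cut off from every server still sees an outage.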

Perceived Availability vs. High Availability

Not all outages are created equal: The timing of failures also plays a big role in the perceived availability of a network. A business system that suffers frequent weekend outages, for example, may show relatively low availability numbers, but this downtime may not even be noticed by the regular workforce.

The networking industry uses the term "high availability" to refer to systems and technologies specially engineered for reliability, availability, and serviceability. Such systems typically include redundant hardware, like disks and power supplies, and intelligent software, like load-balancing and failover functionality. The difficulty in achieving high availability increases dramatically at the four- and five-nines levels, so vendors can charge a premium for these features.
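
The weekend-outage example can be made concrete by measuring availability twice: once over every hour and once over working hours only. This is a rough sketch at whole-hour granularity. The Monday-to-Friday, 09:00-to-17:00 schedule, the function names, and the sample outage are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def is_business_hour(moment: datetime) -> bool:
    # Assumed schedule: Monday to Friday, 09:00 to 17:00.
    return moment.weekday() < 5 and 9 <= moment.hour < 17


def availability(window_start, window_end, outages, counts=lambda moment: True):
    """Fraction of counted hours in [window_start, window_end) with no outage.

    outages is a list of (start, end) datetime pairs; counts decides which
    hours matter (every hour by default).
    """
    total_hours = down_hours = 0
    hour = window_start
    while hour < window_end:
        if counts(hour):
            total_hours += 1
            if any(start <= hour < end for start, end in outages):
                down_hours += 1
        hour += timedelta(hours=1)
    return 1.0 - down_hours / total_hours if total_hours else 1.0


week_start = datetime(2024, 1, 1)  # a Monday
week_end = week_start + timedelta(days=7)
saturday_outage = [(datetime(2024, 1, 6, 0), datetime(2024, 1, 6, 12))]

measured = availability(week_start, week_end, saturday_outage)
perceived = availability(week_start, week_end, saturday_outage, is_business_hour)

print(f"measured:  {measured:.2%}")
print(f"perceived: {perceived:.2%}")
```

In this made-up week, a 12-hour Saturday outage pulls the measured figure below 93%, yet the availability seen during working hours is still 100%.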