EE Times-Asia

Ensuring QoS in oversubscribed networks

Posted: 23 Sep 2005

Keywords: network infrastructure, voice over IP, VoIP

By Jim McKeon
Cortina Systems

Enterprise network applications are rapidly evolving once again as businesses discover productive new uses for their network infrastructures. Whereas enterprise networks once enabled mainly client-server computing architectures, today they deliver a myriad of functions, including voice over IP (VoIP), video-conferencing, wireless access, storage management, application hosting, e-mail/web access and document sharing. As this evolution continues, more new applications will be developed, traffic types will diversify, and enterprise networks will become more useful and more critical to the enterprise.

At the same time, the philosophy of technology investment has clearly switched from performance to return-on-investment. Network managers work diligently to achieve the widest and richest connectivity experience for each dollar they invest in their enterprise networks and they carefully scrutinize the payback period for everything they purchase for the network. New functionality needs to deliver an identifiable, quick return to justify its deployment.

Within this environment, equipment vendors must balance the service guarantees their customers require with the price points they demand. One historically popular means to achieve this is through oversubscription. By offering more customer-facing interfaces than the packet processing and switch fabric capacity behind them can accommodate at once, the switch vendor can more than halve the cost per network port. However, oversubscription requires a trade-off. If and when all the interfaces are transmitting simultaneously at the line-rate, packets must, and will, be dropped. Which raises the question: Should some packets be dropped to ensure that others will be delivered? Or, stated more simply: Are all packets created equal?

In older networks supporting fewer applications, there was little need to distinguish critical traffic. Oversubscription could be offered without regard to the nature of the packets flowing through the network, and any resulting packet loss did not disrupt application performance. But today's networks, particularly those supporting real-time communications such as voice, require a new philosophy towards oversubscription. For real-time applications to function properly, network equipment must ensure that the most critical traffic has the highest probability of being delivered as well as the lowest latency. Indeed, in today's networks, not all packets are created equal.

Gettin' real
The leading driver for quality of service (QoS) in the enterprise is real-time communications. Applications such as voice and video-conferencing have specific and stringent QoS requirements that have traditionally been achieved by delivering these services on separate networks.

There are three key performance parameters governing real-time communications: latency, jitter, and packet loss. For voice, an end-to-end network delay of less than 300 ms is defined by the ITU as the acceptable maximum, although a delay of more than 200 ms reduces VoIP to a quality that is not competitive with traditional TDM solutions. Jitter, or the variance in latency, must also be controlled for voice calls to be intelligible.

When it comes to video, packet loss is the paramount concern. This is because video streams are rendered by computing the differences between the current frame and a reference frame. Because of the limits of this computation, if a video service sustains a packet loss rate of 3% to 5%, the video feed is lost, and regaining it means going through a painful resynchronization process. Voice quality also depends on low packet loss, as a loss of only 1% to 2% can compromise the service.

Other protocols also are sensitive to excessive packet loss. Most modern connection-oriented applications run over TCP, which is a form of a positive acknowledgement protocol. All TCP transmissions are acknowledged. If a packet is lost, the TCP algorithm will initiate retransmissions until receipt is acknowledged. However, the performance penalty for this type of delivery guarantee is severe. The TCP transmission window cannot advance until the earliest transmission is acknowledged and the congestion window will be reset to the maximum segment size.

These effects can degrade performance of the underlying application dramatically. With a random packet loss of constant probability, TCP throughput may be estimated by the following equation [1] :

$$\text{Throughput} \approx \frac{MSS}{RTT \cdot \sqrt{P}}$$

where:

MSS = maximum segment size

RTT = round trip time

P = packet loss rate

By this estimation, a reduction in the packet loss rate from 1% to 0.1% will improve TCP throughput by a factor of three.
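The arithmetic behind that factor can be checked with a short sketch. It assumes the estimate from [1] takes the form throughput ≈ MSS / (RTT × √P) with the constant of proportionality taken as 1. The function name and the sample values (a 1460-byte MSS and a 50 ms round trip) are illustrative assumptions, not figures from this article.

```python
import math


def tcp_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Estimate steady-state TCP throughput in bits/s as MSS / (RTT * sqrt(P)).

    Assumes random loss with constant probability and a proportionality
    constant of 1, so the result is an order-of-magnitude guide only.
    """
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    if rtt_s <= 0.0:
        raise ValueError("rtt_s must be positive")
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))


# Illustrative (assumed) link parameters: 1460-byte MSS, 50 ms round trip.
at_1_percent = tcp_throughput_bps(1460, 0.050, 0.01)
at_01_percent = tcp_throughput_bps(1460, 0.050, 0.001)

# The ratio depends only on the loss rates: sqrt(0.01 / 0.001) = sqrt(10).
improvement = at_01_percent / at_1_percent
```

Because MSS and RTT cancel in the ratio, the improvement is √10, or a little over three, whatever link parameters are chosen.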

Ethernet creates an additional issue because it mixes data packets and control packets. Unlike protocols such as SONET, Ethernet has no out-of-band operations, administration, and maintenance (OAM) method. Therefore, all control information must share a single communications channel with the data. Although control protocols are typically low-bandwidth services, they can be disproportionately sensitive to packet loss. The ability to isolate this traffic from congestion and ensure delivery improves the stability of the network and reduces the time to analyze faults.

In short, delivery of critical packets is the key to effective network performance.

Egress vs. Ingress
Network equipment vendors have long understood the relevance of preserving critical traffic at the egress port of the switch. Any packet switching device has a natural oversubscription point at an egress port: if multiple ingress ports send line-rate traffic to a single egress port, that port is oversubscribed. To deal with this reality, equipment vendors have developed a methodology:

  • Classify the traffic by importance of delivery

  • Segregate it into multiple queues

  • Buffer as much of it as is feasible

  • Schedule it out according to priority

A sketch of this solution is shown in Figure 1 below:

Figure 1: Egress Datapath

Oversubscription of the ingress port creates precisely the same problem. If traffic is not properly classified and separately queued, then critical real-time communications and control packets will be dropped as frequently as unimportant web traffic. If there is not a substantial amount of buffering, even a small degree of oversubscription will quickly cause tail-drop of both critical and non-critical traffic. And without a scheduling policy that guarantees bandwidth to critical flows, latency will suffer. In short, to provide equivalent QoS, the entire infrastructure that manages egress oversubscription must be duplicated at the ingress oversubscription point, as shown in Figure 2.

The following sections of this article will describe these architectural features in detail and how they may be applied to the ingress oversubscription problem.

Attributes of interest
Classification is the process of identifying the ingress packet stream by a series of pre-defined attributes. In most networking applications, the attributes of interest occur at layers 2-4 of the OSI reference model. Those that need to be checked are dependent on the applications being serviced by the network, and what type of QoS is provided in the networking equipment. A review of the relevant protocols is included in Table 1 below:

Table 1

Both Ethernet and IP provide facilities to mark packets with QoS levels that downstream switches and routers should honor. For Ethernet this occurs in the 802.1p bits of the VLAN tag, which allow up to 8 QoS values. IPv4 provides the TOS field, which originally specified 8 QoS levels but was redefined by the DiffServ RFCs to support up to 64 levels. The newer IPv6 preserves this capability via its Traffic Class field which also supports up to 64 levels. Enterprise networking equipment being delivered today should be capable of identifying as well as marking all of these QoS options.
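As a rough illustration of where these markings live in a packet, the sketch below reads the 802.1p priority bits from a VLAN tag and the 6-bit DiffServ code point from an IPv4 TOS byte or an IPv6 Traffic Class field. The function name is hypothetical, only a single VLAN tag is handled, and a real classifier would run in hardware against many more fields.

```python
import struct
from typing import Optional, Tuple

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def read_qos_markings(frame: bytes) -> Tuple[Optional[int], Optional[int]]:
    """Return (pcp, dscp) for an Ethernet frame; None where a marking is absent."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    pcp: Optional[int] = None
    dscp: Optional[int] = None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    offset = 14  # start of the layer-3 header in an untagged frame
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < 18:
            raise ValueError("truncated VLAN tag")
        # The tag control field is followed by the real EtherType.
        tci, ethertype = struct.unpack_from("!HH", frame, 14)
        pcp = tci >> 13  # top 3 bits: 802.1p priority, 0 to 7
        offset = 18
    if len(frame) >= offset + 2:
        if ethertype == ETHERTYPE_IPV4:
            # Second header byte is the TOS field; DSCP is its upper 6 bits.
            dscp = frame[offset + 1] >> 2
        elif ethertype == ETHERTYPE_IPV6:
            # Traffic Class straddles the first two header bytes.
            traffic_class = ((frame[offset] & 0x0F) << 4) | (frame[offset + 1] >> 4)
            dscp = traffic_class >> 2
    return pcp, dscp
```

The three priority bits give the 8 Ethernet levels and the six DSCP bits give the 64 IP levels mentioned above.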

Queuing up
Queuing is the adjunct of classification; without sufficient categories to group the traffic, there is no value in classifying it. As shown in Table 1, it is clear that we can define a great many classifying attributes, but fortunately many of these share common QoS requirements. Real-time communications form one such class: to achieve minimal latency and loss these packets need to be delivered with the highest priority. Network control forms another class: by avoiding drops of control packets, network stability and resiliency increase dramatically.

Multicast/broadcast services are another useful class, often related to the distribution of application data among a population that requires uniform access to it. And the remainder of traffic without sensitivity to delivery or latency can be grouped as a fourth, best-effort class. While it is certainly possible to create even more classes given the range of QoS options, these are sufficient for most enterprise environments today and in the foreseeable future.
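One way to picture the four-class grouping is a small mapping function from packet attributes to a queue. The policy below is only an assumed example: it treats DSCP EF (46) as real-time traffic, CS6 and CS7 (48 and 56) as network control, and any frame with the group bit set in its destination MAC address as multicast/broadcast. An actual deployment would choose its own rules and names.

```python
from enum import IntEnum
from typing import Optional


class TrafficClass(IntEnum):
    REAL_TIME = 0
    NETWORK_CONTROL = 1
    MULTICAST = 2
    BEST_EFFORT = 3


# Standard DiffServ code points used by this illustrative policy.
DSCP_EF = 46   # Expedited Forwarding, commonly used for voice
DSCP_CS6 = 48  # Class Selector 6, commonly used for routing protocols
DSCP_CS7 = 56  # Class Selector 7, reserved for network control


def select_queue(dst_mac: bytes, dscp: Optional[int]) -> TrafficClass:
    """Pick one of four queues for a packet (assumed example policy)."""
    if len(dst_mac) != 6:
        raise ValueError("dst_mac must be 6 bytes")
    if dscp == DSCP_EF:
        return TrafficClass.REAL_TIME
    if dscp in (DSCP_CS6, DSCP_CS7):
        return TrafficClass.NETWORK_CONTROL
    # The least significant bit of the first octet marks a group address,
    # which covers both multicast and broadcast destinations.
    if dst_mac[0] & 0x01:
        return TrafficClass.MULTICAST
    return TrafficClass.BEST_EFFORT
```

Checking the real-time and control markings first means a multicast video stream marked EF still lands in the low-latency queue; a different ordering would be an equally valid design choice.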

Buffering and WRED policy
The amount of buffering is the most difficult parameter in the oversubscription architecture because application diversity makes it difficult to quantify the requirement. Some applications require low latency, while others require minimal packet loss. How should the system decide how many packets to drop, and how long to hold them?

No oversubscribed system can buffer traffic indefinitely. But the burstiness of the Ethernet traffic will be a function of the utilization of the underlying links, and here we can make certain assumptions. Virtually all enterprise networks today are designed hierarchically, so that the closer one gets to the users of the network, the more lightly loaded the individual links will be. As link utilization decreases, the permissible degree of oversubscription increases; therefore, the edge of the network presents the greatest opportunity for oversubscription. Here ratios of 2:1 or 4:1 are common, with desktop switches commonly configured with 24 or 48 Fast Ethernet ports serviced by 2 Gigabit Ethernet uplinks. As we get closer to the network core, link utilization increases, and oversubscription beyond a 2:1 ratio is generally inappropriate.
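For a given switch configuration, the oversubscription ratio is simply the aggregate customer-facing bandwidth divided by the aggregate uplink bandwidth. The helper below is a hypothetical sketch of that calculation; the result depends on assumptions such as whether both uplinks carry traffic or one is held in reserve for redundancy.

```python
def oversubscription_ratio(
    port_count: int, port_mbps: float, uplink_count: int, uplink_mbps: float
) -> float:
    """Aggregate access bandwidth divided by aggregate uplink bandwidth."""
    if uplink_count <= 0 or uplink_mbps <= 0:
        raise ValueError("at least one uplink with positive bandwidth is required")
    return (port_count * port_mbps) / (uplink_count * uplink_mbps)


# 48 Fast Ethernet ports with both Gigabit Ethernet uplinks active.
both_uplinks = oversubscription_ratio(48, 100, 2, 1000)

# The same switch if only one uplink forwards traffic (assumed redundant pair).
single_uplink = oversubscription_ratio(48, 100, 1, 1000)
```

With these assumed inputs the ratios come to 2.4:1 and 4.8:1 respectively.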

Another way to quantify buffering for ingress oversubscription is in comparison with egress oversubscription. For chassis-based networking equipment, most vendors supply several megabytes per Gigabit Ethernet port of egress buffering, which supports an oversubscription ratio that ranges from 1 to the number of ports in the switch. This choice is a mix of the architectural and practical: it reflects field experience of the tolerable oversubscription ratios, and the prevailing cost of dynamic memory. The ingress problem is more deterministic, in that the oversubscription ratio is not dependent on the variable destinations of the traffic.

Given that we would like to support at least four queues, one last parameter to consider is the Maximum Transmission Unit (MTU). Standard Ethernet carries a payload of up to 1500 bytes in a frame of at most 1518 bytes (1522 bytes with a VLAN tag), but non-standard Jumbo packets up to 9600 bytes are commonly provided for by Ethernet equipment. We can assume that we would like to preserve several MTUs per queue to minimize packet loss, particularly for high-bandwidth file transfer applications like NFS.

Considering these together, ingress oversubscription buffering in excess of 1 MB per Gigabit Ethernet queue is appropriate for most networks. This will preserve over 100 Jumbo packet MTUs per port, is consistent with what is provided for egress oversubscription, and is economically provisioned with modern memory technology.
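A quick calculation shows what a buffer of this size means in practice. The sketch assumes 1 MB is 1,048,576 bytes per queue and that the queue drains at the full Gigabit Ethernet rate; both the helper names and those assumptions are illustrative.

```python
def frames_held(buffer_bytes: int, frame_bytes: int) -> int:
    """Whole frames of a given size that fit in a buffer."""
    return buffer_bytes // frame_bytes


def drain_time_ms(buffer_bytes: int, link_bps: float) -> float:
    """Time to empty a full buffer at the given line rate, in milliseconds."""
    return buffer_bytes * 8 / link_bps * 1000


QUEUE_BUFFER_BYTES = 1024 * 1024   # assumed: 1 MB per queue
JUMBO_FRAME_BYTES = 9600           # largest Jumbo packet cited above
GIGABIT_BPS = 1_000_000_000

jumbo_frames_per_queue = frames_held(QUEUE_BUFFER_BYTES, JUMBO_FRAME_BYTES)
worst_case_delay_ms = drain_time_ms(QUEUE_BUFFER_BYTES, GIGABIT_BPS)
```

Under these assumptions a single queue holds 109 Jumbo packets, and a completely full queue adds a little over 8 ms of delay, which illustrates the trade-off between loss protection and latency.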

Weighted Random Early Detection (WRED) is a means of gracefully scaling the size of network queues in the presence of congestion. As traffic increases and queues become longer, the queue manager will begin to drop packets according to a probability curve, rather than waiting until the queue is full and dropping all new packets. This behavior has the beneficial effect of randomizing which traffic flows lose packets and avoiding the synchronized dropping and retransmission of packets. The WRED drop probability curve can be of any shape, but it typically embodies an increasing probability with queue length as shown in Figure 3:

Figure 3: WRED packet drop probability

The effectiveness of this function is directly proportional to the size of the oversubscription buffer; without sufficient buffering, there is no means to gradually scale the drop probability. But with adequate buffering and a WRED policy, the oversubscription traffic may be elegantly managed, with as little noticeable disruption to the user community as possible.
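A minimal sketch of such a drop policy is shown below, assuming the common linear ramp: no drops below a minimum threshold, a probability rising linearly to a configured maximum, and unconditional drops once the queue reaches its upper threshold. The class and function names are hypothetical, and the sketch uses the instantaneous queue depth, whereas practical RED implementations typically work from a smoothed average.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WredProfile:
    min_threshold: int      # queue depth (bytes) below which nothing is dropped
    max_threshold: int      # queue depth at or above which everything is dropped
    max_probability: float  # drop probability reached just below max_threshold


def drop_probability(profile: WredProfile, queue_depth: int) -> float:
    """Linear-ramp drop probability for the current queue depth."""
    if queue_depth < profile.min_threshold:
        return 0.0
    if queue_depth >= profile.max_threshold:
        return 1.0
    span = profile.max_threshold - profile.min_threshold
    return profile.max_probability * (queue_depth - profile.min_threshold) / span


def should_drop(profile: WredProfile, queue_depth: int, rng=random) -> bool:
    """Randomly decide whether to drop an arriving packet."""
    return rng.random() < drop_probability(profile, queue_depth)


# The "weighted" part: each traffic class gets its own (assumed) profile,
# so best-effort traffic starts dropping earlier than control traffic.
best_effort = WredProfile(min_threshold=200_000, max_threshold=800_000, max_probability=0.10)
control = WredProfile(min_threshold=700_000, max_threshold=1_000_000, max_probability=0.02)
```

The gap between the two thresholds is the room the queue manager has to ramp the drop rate gradually, which is why a small buffer leaves little for the policy to work with.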

Scheduling packets
Scheduling is the process of determining the order of packets to transmit through the oversubscribed interface. The key issue is defining the scheduling policy: how does the equipment decide which queue to service next, and in what quantity? To ensure QoS, this policy must be as fine-grained as the queuing architecture, so that the distinct traffic classes are delivered appropriately. There are two common scheduling policies: strict priority and round-robin. The strict priority method is as it sounds: the order of the priorities is defined, and a higher priority trumps lower priorities. In a simple two-queue architecture, the higher priority queue would always be serviced if it had packets available, and the lower priority queue would wait indefinitely for the high priority queue to empty.

While this ensures minimal latency for high-priority packets, it is not always desirable that other queues should wait indefinitely for service. Most networking applications have some sensitivity to delivery time: TCP connections, for example, will resynchronize and throughput will suffer. Here the round-robin scheme becomes attractive: each queue is allocated bandwidth, and the different queues are scheduled in turn. With this mechanism all the queues will be serviced regularly, and none will be starved.

To connect these tools with the prevailing applications running over the network, it is common to combine both methods. One queue is defined as strict priority, with the caveat that only application traffic with specific low-latency requirements, such as real-time communications, is placed in this queue. It is reasonable to assume that the overall bandwidth consumed by these applications won't exceed the oversubscribed interface bandwidth, so there will always be excess bandwidth available for the remaining queues. These queues then have a round-robin scheme defined among them, which may allocate more bandwidth to some priorities than others to tailor the system to the application environment. This strategy provides the best of both worlds: low latency for the applications that need it, and fair delivery for what is left.
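The combined policy can be sketched in a few lines. The class below is a hypothetical illustration, not a description of any vendor's scheduler: it services one strict-priority queue first and shares the remaining transmit opportunities among the other queues by packet-count weighted round-robin. Hardware schedulers usually weight by bytes rather than packets and interleave the queues more evenly.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class HybridScheduler:
    """One strict-priority queue plus weighted round-robin over the rest."""

    def __init__(self, weights: Dict[str, int]) -> None:
        if any(weight < 1 for weight in weights.values()):
            raise ValueError("weights must be positive integers")
        self.priority_queue: Deque[str] = deque()
        self.wrr_queues: Dict[str, Deque[str]] = {name: deque() for name in weights}
        # One round of service: each queue name appears `weight` times.
        self._round: List[str] = [
            name for name, weight in weights.items() for _ in range(weight)
        ]
        self._position = 0

    def enqueue(self, packet: str, queue: Optional[str] = None) -> None:
        """Queue a packet; queue=None selects the strict-priority queue."""
        if queue is None:
            self.priority_queue.append(packet)
        else:
            self.wrr_queues[queue].append(packet)

    def dequeue(self) -> Optional[str]:
        """Return the next packet to transmit, or None if all queues are empty."""
        if self.priority_queue:
            return self.priority_queue.popleft()
        # Walk at most one full round, skipping slots whose queue is empty.
        for _ in range(len(self._round)):
            name = self._round[self._position]
            self._position = (self._position + 1) % len(self._round)
            if self.wrr_queues[name]:
                return self.wrr_queues[name].popleft()
        return None


scheduler = HybridScheduler({"control": 2, "best_effort": 1})
for packet in ("c1", "c2", "c3"):
    scheduler.enqueue(packet, "control")
for packet in ("b1", "b2"):
    scheduler.enqueue(packet, "best_effort")
scheduler.enqueue("v1")  # voice packet goes to the strict-priority queue

order = [scheduler.dequeue() for _ in range(6)]
```

In this example the voice packet leaves first even though it arrived last, and the control queue then receives two transmit slots for every one given to best effort, so neither round-robin queue is starved.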

This article has reviewed the techniques networking equipment manufacturers may use to provide ingress oversubscription with high QoS. Cost reduction often requires significant performance compromises, but with careful system design oversubscription can yield dramatic economies with minimal service degradation. With new applications blossoming, a flexible approach that can grow with the evolution of the network is critical to realizing value from the infrastructure investment.

About the author
Jim McKeon is a product manager at Cortina Systems, covering Ethernet devices. Prior to working in marketing, Jim was an ASIC designer for several years, working primarily on networking devices.

[1] Mathis, M., Semke, J., Mahdavi, J., and Ott, T., "The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm," Computer Communication Review, Vol. 27, No. 3, July 1997.
