EE Times-Asia

Getting max packet processing throughput per app flow

Posted: 01 Aug 2003

Keywords: packet processing, internet protocol, IP, VPN, QoS

The demand for advanced services - such as Internet Protocol virtual private networks (IP VPNs), application-level QoS, data encryption, managed firewalls and address translation - has not subsided. Many customers have expressed willingness to pay premiums for such special treatment of their important network traffic.

Unfortunately, trials of service-enabling systems have not gone as well as hoped. Most cannot handle the concurrent service mix and high-touch packet processing throughput required by large numbers of higher-bandwidth business users, who need to be sure their time-sensitive and mission-critical applications always receive appropriate precedence and security.

To address this problem, a new architecture for the next-generation IP service edge switch is emerging, which uses high-speed ASICs in combination with programmable network processors to yield maximum packet processing performance and flexibility. Interestingly, it is not focused on switching bandwidth or interface speeds, but on what really matters - maximum packet processing throughput per application flow, without any compromises.

By distinctly identifying subscriber traffic and classifying application flows at the network boundary, the IP service edge switch is able to apply QoS and encryption algorithms correctly and aggregate flows prior to sending the traffic on to the shared IP backbone. Attempting to initiate QoS or security techniques elsewhere in the network could be futile, since the traffic would first have to cross the backbone in the clear where it might be subject to congestion and security breaches.
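As a rough illustration of identifying subscribers and classifying application flows at the boundary, the sketch below keys each packet on its 5-tuple and maps it to a subscriber and a service class. The field names, the port-to-class table and the class names are all hypothetical and are not taken from any particular switch.

```python
from collections import namedtuple

# Hypothetical packet header fields used as the flow key (the classic 5-tuple).
FlowKey = namedtuple("FlowKey", "src_ip dst_ip protocol src_port dst_port")

# Assumed policy table: destination port -> service class. Purely illustrative.
PORT_CLASS = {5060: "voice", 443: "business", 80: "best_effort"}


def classify(key, subscriber_of):
    """Return (subscriber, service_class) for a packet's flow key."""
    subscriber = subscriber_of.get(key.src_ip, "unknown")
    service_class = PORT_CLASS.get(key.dst_port, "best_effort")
    return subscriber, service_class


subscribers = {"10.0.0.5": "acme-corp"}
pkt = FlowKey("10.0.0.5", "192.0.2.10", "udp", 40000, 5060)
print(classify(pkt, subscribers))  # ('acme-corp', 'voice')
```

Once a packet carries a subscriber and a class, QoS and encryption policy can be selected per flow before the traffic reaches the shared backbone.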

The real challenge for the IP service edge switch is the ability to apply all required services concurrently to even the largest application flows or class-based aggregate flows, without introducing performance-impacting latency or jitter. Additionally, since these services must be managed and ultimately billed for, detailed statistics collection is required. And because the switch will be widely deployed in local communities, it must deliver all these capabilities at a reasonable cost.

Such uncompromising performance is only possible if the IP service edge switch is built upon an architecture that is up to the challenge. Service providers looking to attract and retain lucrative business customers using advanced IP services will want to understand what makes these essential capabilities possible.

In general, an IP service edge switch receives data on its input interfaces, processes it and sends it out to appropriate output interfaces. Since the switch is designed for the edge of the network, this data-handling flow is not usually symmetrical.

On ingress to the network, many thousands of subscribers' data flows are aggregated onto a few backbone links. While the backbone links are quite large, it is still possible for them to be oversubscribed, so it is important that subscriber traffic be aggregated according to the relative priority of each traffic type.

In the opposite direction, data arriving from the network backbone has to be channeled into small subscriber access links. As diverse traffic flows are transmitted onto individual subscriber links that are much smaller in throughput than the core-side links, they have to be properly prioritized and controlled.
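One simple way to picture prioritizing diverse flows onto a constrained link is a strict-priority scheduler with one FIFO queue per class. This is only a sketch: the class names and their ordering are assumptions, and strict priority is just one of several possible disciplines (it can starve lower classes, so weighted schemes are also common).

```python
from collections import deque

# Assumed priority order, highest first. Class names are illustrative.
PRIORITIES = ["voice", "business", "best_effort"]


class StrictPriorityScheduler:
    """Per-class FIFO queues drained in strict priority order."""

    def __init__(self):
        self.queues = {name: deque() for name in PRIORITIES}

    def enqueue(self, service_class, packet):
        self.queues[service_class].append(packet)

    def dequeue(self):
        """Return the next packet to transmit, or None if all queues are empty."""
        for name in PRIORITIES:
            if self.queues[name]:
                return self.queues[name].popleft()
        return None


sched = StrictPriorityScheduler()
sched.enqueue("best_effort", "web-1")
sched.enqueue("voice", "rtp-1")
sched.enqueue("business", "erp-1")
sched.enqueue("voice", "rtp-2")
order = [sched.dequeue() for _ in range(4)]
print(order)  # ['rtp-1', 'rtp-2', 'erp-1', 'web-1']
```

Packets within a class leave in arrival order, so per-flow sequencing is preserved while higher-priority classes go first.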

This aggregation of data flows requires a switch and packet processing architecture that handles individual flows just as well as aggregated flows. The architecture must also differentiate between traffic types, properly manage traffic aggregation and account for all data passing through the switch. Since the switch is deployed at a critical point of the network, it must also provide reliable operation even when failures occur.

Many architectural choices must be made around how I/O, packet processing and traffic manager modules are connected together. For reference, the most basic configuration would comprise one input, one packet processor, one traffic manager and one output. A more flexible and interesting system would have several of each of these components.

But before such a system can be built, two issues must be settled. First, the size of each component must be decided. Each I/O module has to be sized in accordance with the particular media it services. Each packet processing module has to be able, at a minimum, to handle all the processing requirements of the largest single traffic flow. Since the largest traffic flow can be as large as the largest I/O port in the system, packet-processing bandwidth has to be sized accordingly.
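The sizing rule can be written as a one-line check: the packet processor needs at least the bandwidth of the largest port, because a single flow can fill that port. The port mix below is a hypothetical example using nominal line rates; none of these figures come from the article.

```python
# Hypothetical port mix with nominal line rates in Mbps (sample values only).
PORT_RATES_MBPS = {"t1": 1.544, "ds3": 44.736, "oc3": 155.52, "gige": 1000.0}


def min_processor_bandwidth(port_rates):
    """A single flow can fill the largest port, so that rate is the floor."""
    return max(port_rates.values())


def can_handle_largest_flow(processor_mbps, port_rates):
    """True if one packet processor can carry the largest possible flow."""
    return processor_mbps >= min_processor_bandwidth(port_rates)


print(min_processor_bandwidth(PORT_RATES_MBPS))          # 1000.0
print(can_handle_largest_flow(100.0, PORT_RATES_MBPS))   # False
print(can_handle_largest_flow(1000.0, PORT_RATES_MBPS))  # True
```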

The second issue to be decided is how a traffic flow should be aggregated relative to all other flows. This aggregation decision must be made after packet processing has been completed. It is the foundation of the QoS treatment applied to a traffic flow. The traffic-manager module is responsible for this.

In a perfect system, all data destined to an output port would be received in the output port's traffic manager where QoS and flow aggregation decisions are made. The basic reference architecture therefore places the packet processor on the input side of the switch and the traffic manager on the output side. Physically locating packet processing and traffic management together with the I/O module may seem like a good idea until the cost ramifications are examined.

This is because there is a huge difference in the bandwidth requirements of I/O modules for an IP service edge switch, which range from low-speed access to high-speed core-side interfaces. While one could use the same packet processing and traffic management hardware for all I/O modules, the cost to support lower-speed interfaces would prove to be prohibitively high. Conversely, optimizing the packet-processing and traffic-management designs for each I/O module type is not practical in terms of engineering resources and support.

Sharing resources

A better solution to the design of an IP service edge switch is to make packet processing and traffic management separate resources, which may be shared among multiple I/O modules. This has the advantage of making efficient use of development and system resources, improving scalability and facilitating system redundancy. Additionally, packet processing and traffic management are inherently separate processes. As such, placing a switch fabric between them makes sense as well.

While the I/O, switch-fabric, and traffic-management modules are important components of the IP service edge switch, the real heart and brains of the system is the packet-processing module. It is here that careful attention to design and optimization yields the greatest benefit.

The first-generation IP services routers employed general-purpose processors to handle traffic. Each of these processors is limited to around 30 Mbps to 100 Mbps of packet throughput, depending on the services performed, and data flow striping cannot be used to increase the throughput per flow because of packet sequencing issues.
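The sequencing problem with striping can be seen in a toy simulation. The sketch below stripes one flow round-robin across processors, with one packet arriving per time unit and a per-packet processing time that varies; the timing model and the numbers are assumptions chosen only to make the effect visible.

```python
def stripe_round_robin(packets, num_processors):
    """Stripe one flow across processors; return packet ids in completion order.

    packets: list of (packet_id, processing_time) in arrival order, with one
    packet arriving per time unit. Each processor handles its packets serially.
    """
    free_at = [0] * num_processors
    finished = []
    for arrival, (packet_id, work) in enumerate(packets):
        cpu = arrival % num_processors
        start = max(arrival, free_at[cpu])
        free_at[cpu] = start + work
        finished.append((free_at[cpu], packet_id))
    # Sort by completion time; ties are broken by packet id.
    return [packet_id for _, packet_id in sorted(finished)]


# Packet 0 needs heavy processing (say, encryption of a large packet).
flow = [(0, 5), (1, 1), (2, 1), (3, 1)]
print(stripe_round_robin(flow, 2))  # [1, 3, 0, 2]
print(stripe_round_robin(flow, 1))  # [0, 1, 2, 3]
```

With two processors the flow finishes sooner but out of order; with one processor the order holds, but throughput is capped at that processor's rate.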

Two major problems with this approach should be immediately obvious: each processor is limited to much less than the potential bandwidth of an individual traffic flow; and even this meager throughput is dependent on the types and quantities of services that need to be performed. Advanced IP services can only be effectively applied if neither of those restrictions is present. As a result, a packet-processing module must handle the full bandwidth of the maximum size flow regardless of the processing complexity.

The best way to build a packet-processing module is to use a pipeline of network processors and assist them in the execution of special functions using ASICs and other specialty chips. This way, the IP service edge switch is able to provide unmatched speed and flexibility while also avoiding sequencing issues of parallel processing systems. Multiple pipelines can then be interconnected via the switch fabric to provide redundancy, load sharing and system scalability.
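A pipeline avoids reordering because every packet visits the same stages in the same sequence. The following minimal sketch models stages as chained Python generators; the stage names, packet fields and the rule that only voice traffic is encrypted are hypothetical and stand in for whatever functions a real pipeline would perform.

```python
def classify_stage(packets):
    for pkt in packets:
        pkt["class"] = "voice" if pkt["dst_port"] == 5060 else "best_effort"
        yield pkt


def encrypt_stage(packets):
    for pkt in packets:
        # Stand-in for a hand-off to an encryption co-processor.
        pkt["encrypted"] = pkt["class"] == "voice"
        yield pkt


def count_stage(packets, stats):
    for pkt in packets:
        # Per-class counters, as a stand-in for billing statistics.
        stats[pkt["class"]] = stats.get(pkt["class"], 0) + 1
        yield pkt


def run_pipeline(packets, stats):
    """Chain the stages; every packet visits them in the same order."""
    return list(count_stage(encrypt_stage(classify_stage(packets)), stats))


stats = {}
inbound = [{"seq": i, "dst_port": port} for i, port in enumerate([5060, 80, 5060])]
out = run_pipeline(inbound, stats)
print([p["seq"] for p in out], stats)  # [0, 1, 2] {'voice': 2, 'best_effort': 1}
```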

The network processors can receive and transmit the full bandwidth of the largest port in the system, then distribute this traffic to ASICs and off-the-shelf co-processors. The network processor pipeline thus provides bandwidth to and from these ASICs and co-processors that far exceeds the actual data throughput.

The ASICs and co-processors provide lookup, database, search and encryption engine functions to the network processors. All packet manipulation decisions lie with the network processors. This division of labor improves system flexibility and allows for future growth and evolution of services.
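The division of labor might be sketched as follows: a lookup engine only answers queries, while the network processor code decides what happens to the packet. The class and function names, the route table and the drop-on-miss policy are all assumptions, and the linear scan is used only for readability; hardware lookup engines use specialized structures.

```python
import ipaddress


class LookupCoprocessor:
    """Stand-in for a lookup engine: answers queries, makes no decisions."""

    def __init__(self, routes):
        self.routes = [(ipaddress.ip_network(p), port) for p, port in routes.items()]

    def longest_prefix_match(self, address):
        addr = ipaddress.ip_address(address)
        matches = [(net.prefixlen, port) for net, port in self.routes if addr in net]
        return max(matches)[1] if matches else None


def network_processor_forward(packet, lookup):
    """The network processor owns the decision; the co-processor only looks up."""
    port = lookup.longest_prefix_match(packet["dst_ip"])
    return ("drop", None) if port is None else ("forward", port)


lookup = LookupCoprocessor({"10.0.0.0/8": "core-1", "10.1.0.0/16": "access-7"})
print(network_processor_forward({"dst_ip": "10.1.2.3"}, lookup))   # ('forward', 'access-7')
print(network_processor_forward({"dst_ip": "192.0.2.1"}, lookup))  # ('drop', None)
```

Because the policy lives in `network_processor_forward`, a new service changes only that code, while the lookup engine stays as it is.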

- Josh Cochin

Hardware Engineering Manager

- Steve Kohalmi

Chief Systems Architect

Quarry Technologies Inc.
