EE Times-Asia

Scale Routing for Next-Gen IPv6 Networks

Posted: 01 Jun 2005


By Anand Rangarajan and Veena Kondapalli

Cypress Semiconductor

Internet Protocol Version 6 (IPv6) and virtual private LAN service (VPLS) networks use forwarding information bases (FIBs) to hold the information that determines how incoming packets are routed. This memory architecture poses challenges for next-generation networks, however, including performance scalability and density. Yet with the right organizational approach, FIBs can achieve both greater scalability and greater density.

Traditionally, routing information has been stored in SRAMs or DRAMs. Prefixes representing routes are either stored in ternary content addressable memories (TCAMs), which search all entries in a single operation, or stored in DRAMs with an algorithm that examines several bits of the address at a time. In either case, the resulting index points to associated data indicating the action to take: forward to the next-hop address, update statistics, replicate onto another port, and so on.

Unfortunately, neither traditional approach offers a clear advantage over the other. DRAMs scale nicely in density at low cost, but have limited performance scalability and come with update complexity. TCAMs, on the other hand, provide high performance and high update rates, but are expensive and power-hungry. Currently, an 18Mb TCAM costs approximately $200, while a 288Mb RLDRAM-II device costs about $15. This drastically different cost per megabit drives the choice of memory architecture for a FIB implementation.
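The cost gap can be made concrete with a quick back-of-the-envelope calculation using the figures above (2005-era prices):

```python
# Rough cost-per-megabit comparison using the figures cited above:
# an 18Mb TCAM at ~$200 versus a 288Mb RLDRAM-II at ~$15 (2005 prices).
tcam_cost_per_mb = 200 / 18       # ~11.1 $/Mb
rldram_cost_per_mb = 15 / 288     # ~0.05 $/Mb
ratio = tcam_cost_per_mb / rldram_cost_per_mb

print(f"TCAM:      {tcam_cost_per_mb:.2f} $/Mb")
print(f"RLDRAM-II: {rldram_cost_per_mb:.3f} $/Mb")
print(f"TCAM is roughly {ratio:.0f}x more expensive per bit")
```

At more than 200 times the cost per bit, TCAM capacity is the scarce resource; this is what makes RAM-based FIB implementations attractive despite their algorithmic complexity.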

Significant research has been done on algorithmic FIB implementations, in contrast to packet classification, where TCAMs are increasingly becoming the de facto standard for deployable, scalable implementations. Two trends, though, pose significant challenges for RAM-based FIB algorithms. First, entry widths are increasing as IPv6 gains support. Second, routing tables are growing because of the greater adoption of virtual routing and forwarding (VRF) tables and VPLS.

In the case of VRF tables, similar routing tables are stored per virtual LAN (VLAN), and this increases the number of entries. VPLS is a new paradigm by which highly hierarchical networks are flattened, leading to more entries. As the width increases, RAM-based algorithms need to construct deeper multi-bit trees/tries, which increases the number of RAM accesses needed for a search.
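The effect of wider keys on search cost can be sketched simply: a multi-bit trie needs roughly one memory access per level, that is, one per stride's worth of key bits. The stride values below are illustrative, not taken from the article:

```python
import math

# Memory accesses per lookup for a multi-bit trie: roughly one access
# per trie level, i.e. ceil(key_width / stride). Strides are illustrative.
def trie_accesses(key_width_bits: int, stride_bits: int) -> int:
    return math.ceil(key_width_bits / stride_bits)

for width, label in [(32, "IPv4"), (128, "IPv6")]:
    print(f"{label}: {trie_accesses(width, 4)} accesses at a 4-bit stride")
```

Going from 32-bit to 128-bit keys quadruples the number of accesses at a fixed stride; widening the stride cuts the depth back but inflates each node, which is the density/bandwidth tension the rest of the article explores.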

Two algorithmic/architectural approaches are analyzed that can be used in FIB implementations: Trie-algorithms with External Memories (TEM) and Pipelined Organization with Embedded Memories (POEM).

Metrics of a FIB implementation

The strength of a FIB implementation depends on several metrics such as density (total route capacity), range of entry widths supported, search throughput/performance, route update throughput, search latency, power consumption, pin count and the board space used by the implementation. Figure 1 shows where each of the metrics comes into play in a sample line card.

Figure 1: Typical search subsystem of a router line card, showing the key metrics of a FIB implementation

In addition to the need for wider entries and larger tables, the performance requirements are also increasing with the link speeds and the number of lookups per packet. Combined, these factors put an enormous demand on the memory bandwidth that is required for searches. Capacity and speed can be traded off such that if speed is a significant concern, a higher search rate can be accomplished with an increase in the amount of DRAM required. There is a need for algorithms that can dynamically provide these tradeoffs.
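One way to picture the speed-for-DRAM tradeoff is table replication: striping searches round-robin across N independent copies of the table multiplies the search rate by roughly N at N times the memory cost. The figures below are assumed, purely for illustration:

```python
# Speed/capacity tradeoff sketch: replicate the FIB across N independent
# DRAM channels and stripe searches round-robin. N copies give roughly
# N times the search rate at N times the DRAM. Figures are assumed.
def scaled_search_rate(single_channel_msps: float, table_copies: int):
    throughput = single_channel_msps * table_copies  # Msearches/s
    memory_multiplier = table_copies                 # relative DRAM cost
    return throughput, memory_multiplier

tput, mem = scaled_search_rate(single_channel_msps=62.5, table_copies=4)
print(f"{tput} Msearches/s using {mem}x the DRAM")
```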

Packet processors want to have simple logic interfacing with FIB implementations. This means that a good FIB implementation must have uniform, deterministic data path latency for all the various packet types and widths. Because of the need to support changing table mixes, it is important that the implementation have some reasonable worst-case performance and density characteristics, instead of working well only for today's real-life routing databases.

Memory bandwidth and table compression are the fundamental impediments to scaling. If the number of pins needed to provide the memory bandwidth increases, board space and cost increase proportionally. Compressing the tables trades greater density for lower update throughput: if the tables are packed very tightly, even small changes may require large parts of the table to be modified. The key to a good FIB implementation, then, is to meet all of these metrics simultaneously: density, search throughput, update throughput, and power dissipation, at the lowest cost.

Trie-algorithms with External Memories (TEM)

Over the last decade, there has been significant interest in performing exact match searches and longest prefix match searches by storing lookup tables (IP unicast and multicast routes, MAC entries, flows, etc.) in tree-based (or trie-based) data structures stored in external RAMs. When entries are added to the table, the data structures are constructed and modified to form nodes that are mapped to memory locations. This is shown in Figure 2.

Figure 2: Generic trie-based algorithmic search system

When a search key is issued, a logical block in the packet processor traverses the nodes and does partial matching of the key with the partial entries represented by the node. An aggregate matching index is returned if all the nodes of the tree matched. This is shown in Figure 3.

Figure 3: Generic algorithm demonstrating a search operation
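A minimal sketch of the insert and search operations of Figures 2 and 3, using a one-bit-per-level binary trie; real TEM engines use multi-bit strides and pack nodes to match the external memory's burst size, and the prefixes and port names below are invented for illustration:

```python
# Minimal binary-trie FIB sketch: insert prefixes, then walk the trie
# bit by bit, remembering the last node that carried a next-hop
# (longest-prefix match). Real TEM engines use multi-bit strides.
class TrieNode:
    __slots__ = ("children", "next_hop")
    def __init__(self):
        self.children = [None, None]
        self.next_hop = None

class Fib:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix_bits, next_hop):
        # Build nodes along the prefix path (Figure 2's table construction).
        node = self.root
        for b in prefix_bits:
            if node.children[b] is None:
                node.children[b] = TrieNode()
            node = node.children[b]
        node.next_hop = next_hop

    def lookup(self, addr_bits):
        # Traverse, tracking the longest matching prefix (Figure 3's search).
        node, best = self.root, None
        for b in addr_bits:
            if node.next_hop is not None:
                best = node.next_hop
            node = node.children[b]
            if node is None:
                break
        else:
            if node.next_hop is not None:
                best = node.next_hop
        return best

fib = Fib()
fib.insert([1, 0], "port1")        # prefix 10/2
fib.insert([1, 0, 1, 1], "port2")  # prefix 1011/4
print(fib.lookup([1, 0, 1, 1, 0, 0]))  # longest match -> port2
print(fib.lookup([1, 0, 0, 0]))        # falls back to 10/2 -> port1
```

Each level visited corresponds to one external-RAM access, which is why key width and stride directly set the per-search bandwidth demand in a TEM design.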

Pipelined Organization with Embedded Memories (POEM)

In this approach (Figure 4), the algorithms are sophisticated enough that large tables can be compressed and stored in memories embedded inside the chip, giving the search and update logic enormous memory bandwidth. The key challenges are accommodating large tables and supporting rapid updates to them.

Figure 4: POEM architecture

The embedded memories are organized such that the pipelined searches and updates can be performed to the data structures at high speeds. The challenges of a sophisticated algorithm are in keeping compact data structures, mapping the nodes to the available memories (such that the enormous memory bandwidths that are available can actually be tapped), and the ability to update these nodes quickly and coherently.
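A toy model of this organization: map one trie level to each embedded memory bank, so a new search can enter the pipeline every cycle. Throughput then equals the clock rate, while latency grows with pipeline depth. The 250MHz clock and 54-stage depth below are assumptions chosen only to illustrate the arithmetic (54 stages x 4ns per cycle = 216ns):

```python
# Toy model of a POEM-style pipeline: one trie level per embedded memory
# bank, one search entering per cycle. Throughput equals the clock rate;
# latency is pipeline depth times the cycle time. Numbers are assumed.
def pipeline_stats(clock_mhz: float, num_stages: int):
    cycle_ns = 1000.0 / clock_mhz
    throughput_msearches = clock_mhz   # one result per cycle, once full
    latency_ns = num_stages * cycle_ns
    return throughput_msearches, latency_ns

tput, lat = pipeline_stats(clock_mhz=250, num_stages=54)
print(f"{tput} Msearches/s at a fixed {lat:.0f}ns latency")
```

The crucial property is that latency is deterministic and throughput is independent of trie depth, since every stage works on a different in-flight search.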

For example, a network search engine that uses this architectural approach can scale up to 7M to 10M IPv4 routes while simultaneously sustaining a search throughput of 250 million searches/s, a fixed deterministic latency of 216ns, an update throughput of 25K updates/s, and a power dissipation of approximately 5W.

Comparative analysis

Memory bandwidth is one important problem that POEM solves easily as compared to TEM. In TEM, the way to increase memory bandwidth is by adding more pins/interfaces. In POEM, adding more internal memory bandwidth is done by widening internal buses and segmenting memories into more banks. Adding internal wires is almost free, while adding extra pins is very expensive. In contrast, scaling the actual amount of memory is easy for TEM (just add more chips), while adding more internal memory to POEM will increase die size and cost.
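The bandwidth asymmetry can be quantified with a rough model; the interface widths, clock rates, and bank count below are assumed values for illustration, not datasheet figures:

```python
# Raw memory-bandwidth comparison: one external interface limited by
# pins versus many wide internal banks. Widths, clocks, and bank counts
# are assumed values, not datasheet figures.
def bandwidth_gbps(bus_width_bits: int, clock_mhz: float,
                   num_buses: int = 1) -> float:
    return bus_width_bits * clock_mhz * num_buses / 1000.0

# TEM: a single 36-bit external RLDRAM interface at 400MHz
print(f"external: {bandwidth_gbps(36, 400):8.1f} Gb/s")
# POEM: sixteen 256-bit internal banks at 250MHz
print(f"embedded: {bandwidth_gbps(256, 250, num_buses=16):8.1f} Gb/s")
```

Under these assumptions the embedded organization delivers nearly two orders of magnitude more raw bandwidth, because bus width and bank count scale with on-die wiring rather than with package pins.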

FIB performance requirements are increasing, and innovative algorithmic approaches along with high-performance architectures are necessary to simultaneously meet all of the requirements. Satisfying the various metrics for a FIB implementation requires tradeoffs in architectural choices. Fortunately, a pipelined organization of embedded memories promises greater scalability in performance and density when compared to a traditional algorithmic approach with external memories.

About the authors

Anand Rangarajan is a systems architect in the Data Communications Division at Cypress Semiconductor. He holds an MS in Electrical Engineering from Stanford University and a B.Tech. in Electrical and Electronics Engineering from the Indian Institute of Technology, Madras. Veena Kondapalli is an applications engineer at Cypress Semiconductor. She holds an MS in Electrical Engineering from NYIT and a BE in Electronics and Communications Engineering from Osmania University.
