EE Times-Asia

How to achieve 200-400GE network buffer speeds

Posted: 27 Nov 2014

Keywords: 400GE, DDR4, ASIC, FPGA, transmission protocols

By contrast, data word transactions take place between two end points: either two peer devices (such as an ASSP, ASIC or FPGA) or a host device and a memory or co-processor. These transactions rely on look-aside data word protocols and eliminate switch-related issues. Data word transactions exhibit these transfer characteristics:
• Data is in fixed length frames of a predetermined size (e.g. 32b, 64b, 72b, ...)
• Rate of n times the packet arrival rate (typically 4 …)
• Synchronous transfer mode
• Reach of less than 8 inches with no connectors

Packet integrity is the only concern, and the error rate is typically low (BER of 10^-15), because signal integrity issues are the source of information loss. Data loss can be managed by design best practices and error checking protocols.
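
To put the BER figure in perspective, the short sketch below estimates the average time between bit errors on a single lane. The 10 Gbit/s line rate is an assumption chosen for illustration only, and the function name is hypothetical.

```python
def mean_seconds_between_errors(ber, line_rate_bps):
    """Average time between bit errors for a given BER and line rate."""
    if ber <= 0 or line_rate_bps <= 0:
        raise ValueError("ber and line_rate_bps must be positive")
    return 1.0 / (ber * line_rate_bps)


# ASSUMPTION: one lane running at 10 Gbit/s (illustrative value only).
ASSUMED_LINE_RATE_BPS = 10e9

seconds = mean_seconds_between_errors(1e-15, ASSUMED_LINE_RATE_BPS)
hours = seconds / 3600
print(f"about {seconds:.0f} s ({hours:.1f} h) between bit errors")
```

Under that assumed rate, a BER of 10^-15 works out to roughly 100,000 seconds, or a little over a day, between bit errors per lane.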

Until GCI was developed, no commonly available protocol using SerDes had been optimised for synchronous fixed length transfers in the look-aside path. Instead, designers have used channelised packet-oriented protocols over SerDes, which incur higher overhead in resources and latency. The GCI protocol specifically streamlines device-to-device data transmissions and overcomes the inefficiencies of existing protocols for the look-aside application.

GCI's three-layer structure
The GCI specification defines three layers: the Physical Medium Attachment (PMA) layer, the Physical Coding Sublayer (PCS), and the Data Link layer.

The PMA layer in the GCI transfers 10bit characters over a serial lane from one device to another. The PMA performs functions similar to those of the Physical Medium Dependent (PMD) and PMA layers in the Ethernet standards. These include electrical and timing functions, equalisation, clock and data recovery (CDR), and serialisation/deserialisation.

Electrical and timing specifications are based on the Common Electrical I/O (CEI) 11G-SR standard. However, the GCI protocol stands alone: it does not itself define electrical and timing requirements or equalisation, so it can be implemented with other electrical standards. Other characteristics and operations that take place in the PMA layer include:
• Both devices on a connection must use the same reference clock source; i.e., the clocking is mesochronous. Because the devices operate at exactly the same frequency, the GCI does not introduce skip symbols to compensate for clock rate differences.
• Each lane's transmitter serialises 10bit characters onto the lane. The least significant bit of the character is sent first.
• The receiver deserialises single bits received at line rate and reassembles them into 10bit groups. The receiver includes a clock and data recovery (CDR) circuit to align the incoming bit stream with the internal bit clock. The CDR circuit continuously tracks the location of the eye in the received signal.
• The training sequence provided by the PCS sublayer includes a pseudo-random bit sequence (PRBS), which can be used for training by the CDR circuit and a decision feedback equaliser (if present).
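
The serialise and deserialise steps in the list above can be sketched in a few lines of Python. This is a minimal software model, not an implementation of the PMA: the function names are hypothetical, and the model assumes the bit stream is already aligned to a character boundary.

```python
def serialise(chars):
    """Flatten 10-bit characters into a bit list, least significant bit first."""
    bits = []
    for ch in chars:
        if not 0 <= ch < (1 << 10):
            raise ValueError("character does not fit in 10 bits")
        # Bit 0 (the LSB) goes onto the lane first.
        bits.extend((ch >> i) & 1 for i in range(10))
    return bits


def deserialise(bits):
    """Regroup an aligned bit stream into 10-bit characters (LSB arrived first)."""
    if len(bits) % 10:
        raise ValueError("bit stream is not a whole number of characters")
    return [
        sum(bit << i for i, bit in enumerate(bits[k:k + 10]))
        for k in range(0, len(bits), 10)
    ]
```

For example, `serialise([1])` yields a 1 followed by nine 0s, and `deserialise(serialise(chars))` returns the original characters.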

How the PCS sublayer works
As shown in figure 2, the PCS encodes and transfers 80bit frames over one or more serial lanes. To transmit, the PCS receives 80bit frames from the Data Link layer and transforms them into 10bit characters for the PMA layer. The PCS sublayer has the following characteristics:
• Before scrambling begins, the Tx and Rx exchange the initial 48bit state of the linear feedback shift registers (LFSR).
• Each lane is then scrambled with a PRBS generated by the LFSR for transmission. This provides sufficient transition density and DC balance for reliable, high-speed serial communications without the overhead of 8b/10b coding. Except in pathological cases, the scrambled stream gives the receiver sufficient transition density for clock recovery, as well as DC balance.
• In Tx, scrambling takes place after striping. In Rx, descrambling takes place after character alignment and deskewing and before lane reordering.
• The number of lanes in each direction in a GigaChip Interface connection does not need to be the same. For example, a memory device designed for write-mostly applications needs fewer transmit lanes than receive lanes.
• Striping breaks each 80bit frame into 10bit characters and distributes them over the available lanes.
• The PCS layer stripes the frames from the Data Link layer over 1, 2, 4, or 8 serial lanes. The Data Link layer does not need to know how many lanes are used in the lower layers. On the Rx side, the PCS identifies character boundaries in each lane by means of a synchronisation pattern of ten 0s followed by ten 1s within each training sequence. To destripe, the receiver for each lane independently searches for the synchronisation pattern, and the PCS reassembles the 10bit characters from the logical lanes back into 80bit frames.
• Following character alignment, the PCS compensates for character-scale (10 UI) skew between lanes.
• Deskewing amounts are established during a training period before any frames are transmitted. Deskewing is similar to the function performed in the PCIe and XAUI protocols, as shown in figure 3.
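
The striping and destriping steps described above can be illustrated with a small Python model. It is only a sketch: the round-robin character-to-lane order and the choice of which 10 bits form the first character are assumptions made for the example, and all names are hypothetical.

```python
CHARS_PER_FRAME = 8            # 80 bits / 10 bits per character
ALLOWED_LANES = (1, 2, 4, 8)   # lane counts named in the text


def frame_to_chars(frame):
    # ASSUMPTION: character 0 carries the least significant 10 bits.
    return [(frame >> (10 * i)) & 0x3FF for i in range(CHARS_PER_FRAME)]


def chars_to_frame(chars):
    return sum(ch << (10 * i) for i, ch in enumerate(chars))


def stripe(frames, n_lanes):
    """Distribute the characters of each 80-bit frame over n_lanes lanes."""
    if n_lanes not in ALLOWED_LANES:
        raise ValueError("lane count must be 1, 2, 4 or 8")
    lanes = [[] for _ in range(n_lanes)]
    for frame in frames:
        for i, ch in enumerate(frame_to_chars(frame)):
            lanes[i % n_lanes].append(ch)   # ASSUMPTION: round-robin order
    return lanes


def destripe(lanes):
    """Rebuild 80-bit frames from aligned, deskewed per-lane characters."""
    n_lanes = len(lanes)
    per_lane = CHARS_PER_FRAME // n_lanes   # characters per frame on each lane
    n_frames = len(lanes[0]) // per_lane
    frames = []
    for f in range(n_frames):
        chars = [lanes[i % n_lanes][f * per_lane + i // n_lanes]
                 for i in range(CHARS_PER_FRAME)]
        frames.append(chars_to_frame(chars))
    return frames
```

Because the caller of `stripe` and `destripe` only ever handles whole 80-bit frames, the model also shows why the Data Link layer does not need to know the lane count.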

Figure 2: PCS layer in the GCI.
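
As a rough illustration of the Rx character-alignment step, the Python sketch below searches one lane's bit stream for the synchronisation pattern of ten 0s followed by ten 1s and uses it to locate character boundaries. It assumes the pattern arrives unscrambled and that characters are sent least significant bit first; the function names are hypothetical.

```python
SYNC = [0] * 10 + [1] * 10   # ten 0s followed by ten 1s


def find_char_offset(bits):
    """Return the bit offset (0..9) of the character boundary, or None."""
    for start in range(len(bits) - len(SYNC) + 1):
        if bits[start:start + len(SYNC)] == SYNC:
            return start % 10
    return None


def align(bits, offset):
    """Drop bits before the boundary, then group into 10-bit characters."""
    usable = bits[offset:]
    usable = usable[:len(usable) - len(usable) % 10]   # discard a partial tail
    return [
        sum(b << i for i, b in enumerate(usable[k:k + 10]))
        for k in range(0, len(usable), 10)
    ]
```

Each lane would run this search independently. Lane-to-lane deskew, which the text describes as a separate step, is not modelled here.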

The GCI data link layer
The Data Link Layer reliably transfers 72bit payloads on behalf of the Transaction layer. This layer applies the CRC coding and the acknowledgment/replay protocol that detects and recovers from errors. It appends error detection and control information to the payload, creating an 80bit frame.
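
The framing arithmetic (72 payload bits plus 8 bits of overhead gives an 80bit frame) can be sketched as follows. This is not the GCI frame format: purely for illustration, all 8 overhead bits are treated as a check field computed with a generic CRC-8 polynomial, whereas the real split between CRC and acknowledgment/control bits is not given in this article. All names and the polynomial are assumptions.

```python
PAYLOAD_BITS = 72
OVERHEAD_BITS = 8    # 80-bit frame minus 72-bit payload
CRC_POLY = 0x07      # ASSUMPTION: generic CRC-8 (x^8 + x^2 + x + 1), not from the GCI spec


def crc8(value, nbits):
    """Bitwise CRC-8 over the low nbits of value, most significant bit first."""
    crc = 0
    for i in reversed(range(nbits)):
        bit = (value >> i) & 1
        top = (crc >> 7) & 1
        crc = (crc << 1) & 0xFF
        if top ^ bit:
            crc ^= CRC_POLY
    return crc


def build_frame(payload):
    """Append the (hypothetical) 8-bit check field to a 72-bit payload."""
    if not 0 <= payload < (1 << PAYLOAD_BITS):
        raise ValueError("payload does not fit in 72 bits")
    return (payload << OVERHEAD_BITS) | crc8(payload, PAYLOAD_BITS)


def check_frame(frame):
    """Return (payload, ok); ok is False when the check field does not match."""
    payload = frame >> OVERHEAD_BITS
    ok = crc8(payload, PAYLOAD_BITS) == (frame & 0xFF)
    return payload, ok
```

In a replay scheme like the one described, a frame for which `check_frame` reports `ok` as `False` would be the trigger for retransmission. With 72 of every 80 bits carrying payload, the framing efficiency is 90%.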
