EE Times-Asia

Programmable chips rev critical algorithms

Posted: 19 Jun 2006

Keywords: AMD64 Opteron, DRC Computer, XtremeData, multisocket processor, FPGA

 XtremeData uses Altera FPGA on a credit card-size board

The first two companies to offer coprocessors that plug into AMD64 Opteron processor sockets, DRC Computer Corp. and XtremeData Inc., are delivering programmable solutions that can accelerate time-critical algorithms. These coprocessors leverage the flexibility of Xilinx and Altera FPGAs, so they can be configured to accelerate graphics, XML, floating-point, video transcoding and other applications.

Although the latest AMD64 processors offer topnotch general-purpose performance, they deliver good, but less than stellar, results on specialized operations such as graphics, XML processing and video transcoding. To improve system performance, Advanced Micro Devices Inc. has opened its processor socket interface as part of the just-released Torrenza platform, allowing companies such as DRC and XtremeData to develop and deploy application-specific coprocessors that work alongside AMD64 CPUs in multisocket processor systems.

The coprocessors plug directly into an empty CPU socket and can be dynamically reconfigured, thus permitting users to change logic configurations to better match the algorithms that need acceleration. Both the DRC and XtremeData solutions are modules that combine an FPGA with static RAM, flash memory (XtremeData only) and interface logic to support 8- or 16bit HyperTransport interfaces.

DRC offers three versions of its module: the DRC100-L60ES and L60, which are based on the LX60 Virtex 4 FPGA (60K logic cells), and the DRC110-L160, which is based on the LX160 FPGA (152K logic cells). The L60ES uses an 8bit, 200MHz HyperTransport channel, while the L60 and the L160 have one and three 400MHz, 16bit channels, respectively. To support the implementation of algorithms in the FPGA logic fabric, DRC offers a Linux-based development system that contains a two-socket motherboard, double-data-rate memory, disk drives, a graphics controller and software development tools.
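The link figures above can be turned into rough peak-bandwidth numbers. The following is a minimal sketch, not a vendor calculation: it assumes HyperTransport transfers data on both clock edges (two transfers per clock) and reports one direction only, ignoring protocol overhead. The function name and the dictionary layout are illustrative and do not come from DRC.

```python
def ht_peak_bandwidth_mb_s(link_width_bits, clock_mhz, transfers_per_clock=2):
    """Peak one-direction bandwidth of a single link, in Mbytes/s.

    Assumption: double-data-rate signaling (two transfers per clock),
    with no allowance for packet or protocol overhead.
    """
    mega_transfers_per_s = clock_mhz * transfers_per_clock
    return mega_transfers_per_s * link_width_bits / 8


# (link width in bits, clock in MHz, number of channels), as quoted in the article
drc_modules = {
    "L60ES": (8, 200, 1),
    "L60": (16, 400, 1),
    "L160": (16, 400, 3),
}

for name, (width, clock, channels) in drc_modules.items():
    per_channel = ht_peak_bandwidth_mb_s(width, clock)
    total = per_channel * channels
    print(f"{name}: {per_channel:.0f} Mbytes/s per channel, {total:.0f} Mbytes/s total")
```

Under these assumptions, the step from the L60ES to the L60 is a fourfold increase per channel: twice the width and twice the clock rate.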

For its part, the XD1000 from XtremeData employs Altera's largest Stratix II FPGA, the EP2S180, on a credit card-size board that fits into the secondary CPU sockets of any 2P or 4P AMD Opteron processor-based motherboard. Able to fit in systems with tight board-height form factors, including 1U servers, server blades and Advanced Telecom Computing Architecture platforms, the XD1000 uses a 200MHz, 16bit-wide HyperTransport interface to achieve low-latency communication with the host AMD Opteron processor, said Ravi Chandran, CEO of XtremeData.

"This means that the traditional latency chain of CPU to north bridge to south bridge [via PCI interface] to FPGA has been reduced to a point-to-point CPU-to-FPGA link," said Chandran. "Compared to competing I/O board systems, the XD1000 offers a more scalable solution. It gives access to more memory [via DIMM modules] and provides higher bandwidth and lower-latency interconnects than north-bridge solutions, at a much lower total cost of ownership."

The module also includes a JTAG port for a download cable connection, which can be used to configure the FPGA and to probe internal FPGA signals using Altera's SignalTap II embedded logic analyzer. Also incorporated on the module are a 128bit-wide DDR333 memory interface, up to 8Mbytes of high-speed SRAM and 32Mbytes of flash memory. Additionally, the company has several enhanced versions of the XD1000 planned for future release.

To develop the hardware-based algorithms, XtremeData leverages Altera's SOPC Builder and C2H (C-language to hardware) tools as well as Altera's soft intellectual-property blocks, such as the Nios processor core. A full development system with a dual-socket motherboard and one XD1000 module sells for about $15,000. In small quantities, the XD1000 module sells for $6,500 apiece.

In one possible scenario, an FPGA-based hardware accelerator used in medical CT imaging might run the overall application 10 times faster when each 3GHz AMD Opteron processor is coupled with an FPGA. The result is significant system-level savings for power, space and cost. "The key to acceleration is parallelism of the algorithm implementation in the FPGA, so that even when the FPGA operates in the subgigahertz range, it can outperform a multigigahertz CPU," said Chandran.
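Chandran's point about parallelism can be illustrated with simple arithmetic. The sketch below is a back-of-the-envelope model only: it treats throughput as clock rate times operations completed per cycle. The 3GHz CPU clock comes from the scenario above, while the 200MHz FPGA clock and the one-operation-per-cycle CPU rate are hypothetical values chosen for illustration, not XtremeData figures. It models a single accelerated kernel, not the speedup of a whole application.

```python
def min_parallel_units(cpu_clock_hz, cpu_ops_per_cycle, fpga_clock_hz, target_speedup):
    """Smallest number of parallel FPGA units, each finishing one operation
    per FPGA clock, needed to reach target_speedup over the CPU.

    All arguments are integers; the result is rounded up.
    """
    required_ops_per_s = cpu_clock_hz * cpu_ops_per_cycle * target_speedup
    # Integer ceiling division: round up to a whole number of units.
    return -(-required_ops_per_s // fpga_clock_hz)


CPU_CLOCK_HZ = 3_000_000_000   # 3GHz Opteron, from the article's scenario
FPGA_CLOCK_HZ = 200_000_000    # hypothetical subgigahertz FPGA clock
CPU_OPS_PER_CYCLE = 1          # hypothetical sustained rate on this kernel

break_even = min_parallel_units(CPU_CLOCK_HZ, CPU_OPS_PER_CYCLE, FPGA_CLOCK_HZ, 1)
tenfold = min_parallel_units(CPU_CLOCK_HZ, CPU_OPS_PER_CYCLE, FPGA_CLOCK_HZ, 10)
print(f"Units needed to match the CPU: {break_even}")
print(f"Units needed for a 10x kernel speedup: {tenfold}")
```

With these illustrative numbers, the FPGA needs 15 parallel units to match the CPU and 150 to run the kernel 10 times faster, which is why the degree of parallelism in the algorithm, rather than clock rate, determines the gain.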

- Dave Bursky
