Broadcom displays range of architectures
Keywords: broadcom, embedded processor, communication, hot chip
Broadcom Corp. seemed to be announcing its arrival as a major embedded processor vendor at this year's Hot Chips conference. The company successively turned the spotlight on three entirely different processing architectures, each from a different design team and each targeted at a different communications application.
At least some of the diversity is the result of acquisitions. The first processor to be described, the FirePath, was originally developed by startup Element 14 before its acquisition by Broadcom. The second of the architectures, Calisto, is based on designs done at Silicon Spice before its absorption into the company.
But the parent company showed itself unafraid to bring to market three quite diverse approaches to computing, each requiring a different approach to software development and different tools, and each employing a different instruction set. The company apparently reasoned that there is so little crossover between communications applications that commonality of tools or libraries would not be as valuable as a good fit between application needs and architectural strengths.
FirePath, for example, is a single-processor design with two symmetric data paths, each capable of handling 64-bit words. Each can also be partitioned for single-instruction, multiple-data (SIMD) parallelism. Rather than using a superscalar instruction dispatch approach to keep the two pipes full, the entire processor is controlled by a single long-instruction word.
Operations in the SIMD data paths are predicated on a byte-by-byte basis, allowing the machine to conditionally execute operations on individual bytes without branching. The two paths share a set of general-purpose registers and each path has a dedicated set of multiply-accumulate registers. Instructions span the usual space, but are weighted toward arithmetic and byte manipulation, as might be expected in a machine designed to perform signal processing on large amounts of data.
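As a rough illustration of what byte-by-byte predication means, the sketch below models one 64-bit data path as eight byte lanes in software. It is not FirePath code: the function name, the choice of an add operation and the 8-bit predicate mask are all assumptions made for the example, since the presentation as reported did not give instruction-level detail.

```python
def predicated_add_bytes(a: int, b: int, pred: int) -> int:
    """Model a byte-wise SIMD add on a 64-bit word with per-byte predication.

    Hypothetical model, not an actual FirePath instruction.
    Bit i of `pred` enables lane i; disabled lanes keep the byte from `a`.
    """
    result = 0
    for lane in range(8):
        shift = 8 * lane
        a_byte = (a >> shift) & 0xFF
        b_byte = (b >> shift) & 0xFF
        # Each lane wraps modulo 256, so no carry leaks into the next lane.
        total = (a_byte + b_byte) & 0xFF
        # Select by masking rather than by a conditional jump:
        # mask is 0xFF for an enabled lane and 0x00 for a disabled one.
        mask = 0xFF * ((pred >> lane) & 1)
        out = (total & mask) | (a_byte & (mask ^ 0xFF))
        result |= out << shift
    return result


# Enable only the four low-order lanes.
word = predicated_add_bytes(0x0102030405060708, 0x1010101010101010, 0b00001111)
print(hex(word))  # 0x102030415161718
```

In hardware all eight lanes would be evaluated at once; the loop here only stands in for that parallelism.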
The architecture includes neither precise exceptions, which would be nearly mandatory in a data-processing environment, nor DSP-style stride addressing or zero-latency branches. Thus it hews to a middle path between being a modest control processor and being a highly capable SIMD number cruncher. Broadcom's first use of the processor is at the heart of a 12-channel digital subscriber line transceiver SoC. No details about the implementation were given, other than to say that the chip has an entirely cell-based design using proprietary libraries.
In contrast, the Calisto processor takes an equally aggressive but entirely different approach to parallelism. Calisto is actually a collection of processors, combining conventional RISC cores for control and supervision with SpiceEngine vector-based DSP cores for number crunching. Those resources are arranged into clusters, each comprising four DSPs, a RISC core and a local memory.
From the viewpoint of throughput, the heart of the design is of course the SpiceEngine, which was first described publicly at the 2001 Microprocessor Forum. The core includes a collection of various execution units, registers and a vector register file, essentially a highly structured scratch pad memory with very fast addressing.
Given the hierarchy of relatively small memories, and the fact that the local memory attached to a cluster is not much larger than the vector memories on the DSPs, bulk memory transfer is a central concern of the architecture. Hence the design emphasizes memory bandwidth, with a 512-bit-wide crossbar switch between the shared program memory banks and the clusters, an elaborate DMA controller scheme and close attention to cache-fill latency.
Such a tuned architecture demands at least equal attention to software tools. Hence Broadcom is using a vectorizing C compiler for the processor, with extensions for aggressive optimization, for managing loads among the processors, and for handling the details of packet processing.
The company will first employ the architecture in a four-cluster configuration intended to serve as a multiprotocol gateway. The chip is controlled by a proprietary OS that assigns transport and signal-processing tasks to channels based first on the protocol and then on the particular channel. It then dynamically assigns those tasks to available DSP cores on a per-packet basis.
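The two-step assignment described above can be sketched in a few lines. Everything here is hypothetical: Broadcom's OS is proprietary and was not detailed, so the protocol names, the task names, the `Dispatcher` class and the first-idle-core policy are assumptions chosen only to show the shape of the idea.

```python
from collections import deque

# Hypothetical protocol-to-task table; the real protocols and tasks were not disclosed.
TASKS_BY_PROTOCOL = {
    "voice": ["echo_cancel", "encode"],
    "fax": ["demodulate"],
}


class Dispatcher:
    """Toy model: tasks are bound to a channel by protocol, then to a core per packet."""

    def __init__(self, num_cores: int):
        self.idle = deque(range(num_cores))  # cores currently free
        self.channel_tasks = {}              # channel id -> task list

    def open_channel(self, channel: int, protocol: str) -> None:
        # Step 1: the protocol decides which tasks the channel needs.
        self.channel_tasks[channel] = TASKS_BY_PROTOCOL[protocol]

    def dispatch(self, channel: int):
        # Step 2: each arriving packet takes whichever core is free right now.
        if not self.idle:
            return None  # all cores busy; a real system would queue the packet
        core = self.idle.popleft()
        return core, self.channel_tasks[channel]

    def complete(self, core: int) -> None:
        # The core returns to the idle pool once the packet is processed.
        self.idle.append(core)


d = Dispatcher(num_cores=2)
d.open_channel(7, "voice")
d.open_channel(9, "fax")
print(d.dispatch(7))  # (0, ['echo_cancel', 'encode'])
print(d.dispatch(9))  # (1, ['demodulate'])
```

Because no channel is tied to a particular core, successive packets from the same channel may land on different DSPs, which is what lets a fixed pool of cores serve many more channels than there are cores.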
Broadcom's ability to handle multiple channels in that way allows the company to implement a 2,016-channel OC-3 gateway on a single blade using ten SoCs, or a total of 160 DSP cores. The 166MHz device is one of the early production chips implemented in a 130nm CMOS process. Significantly for the market it serves, it is rated at 1.2W with a 1.2V internal voltage.
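The core count follows directly from figures quoted earlier in the article, as the short check below shows. The channels-per-core average is derived here rather than quoted by Broadcom, and the OC-3 breakdown uses the standard telephony multiplexing hierarchy, which the article does not spell out.

```python
# Figures quoted in the article.
DSPS_PER_CLUSTER = 4
CLUSTERS_PER_SOC = 4
SOCS_PER_BLADE = 10
CHANNELS_PER_BLADE = 2016

dsps_per_soc = DSPS_PER_CLUSTER * CLUSTERS_PER_SOC
dsps_per_blade = dsps_per_soc * SOCS_PER_BLADE
print(dsps_per_blade)  # 160

# Derived average, not a Broadcom figure: channels each DSP core must serve.
channels_per_dsp = CHANNELS_PER_BLADE / dsps_per_blade
print(f"{channels_per_dsp:.1f} channels per DSP core on average")

# Standard hierarchy (not from the article): OC-3 = 3 DS3 x 28 DS1 x 24 DS0.
oc3_voice_channels = 3 * 28 * 24
print(oc3_voice_channels)  # 2016
```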
For yet another contrast, a third Broadcom design team described the evolution of an integrated SoC for use in Internet Protocol (IP) telephone desk sets and gateways. Departing entirely from the custom processors, specialized instruction sets, and specialized software tools of the previous designs, the BCM1101 IP phone chip integrates a standard MIPS CPU core, a standard LSI Logic Corp. ZSP DSP core, memory, a modest Ethernet switch, peripherals, and analog codecs. The Ethernet switch is provided so that the chip can support both an IP phone and a PC on a single Ethernet port from the switching cabinet.
The device is implemented in a 0.18µm process, and its peak power consumption is, interestingly enough, just under that of the Calisto-based SoC.
Both the MIPS and ZSP cores are designed for programming with conventional C-language tools. The chip has its own OS, however, that is distributed between the MIPS and ZSP cores.
- Ron Wilson, EE Times