EE Times-Asia

Reduce SoC power consumption in the interconnect

Posted: 09 Dec 2014

Keywords: system-on-chip, SoC, interconnect, modular, network-on-chip

Designers have been asked to integrate more features in their SoCs, so the demands on interconnect technology have grown. To keep pace, the following features are in high demand:
• Interfaces to different transaction protocols
• Switches (demux-routers and arbiter-muxes)
• QoS (priority)
• Buffers
• Data path serialisation
• Statistics probes
• Debug tracing
• Firewalls
• Register slices (pipe stages)
• Clock domain crossings
• Voltage domains
• Power domains

These have caused new challenges in interconnect design.

Designers want IP to be reusable and reconfigurable. Supporting the growing feature requirements within the logic of crossbars creates complexity and can slow critical paths. Furthermore, many wires are toggled even for a small volume of traffic, which consumes a disproportionate amount of power. A reusable modular interconnect design, by contrast, avoids the complexities of older bus and crossbar technology and offers advantages in simplicity, speed, area, and power efficiency.

Transaction, transport and physical layers
NoC technology employs a 3-layer protocol with the transaction layer serving as the highest. It performs the reads and writes requested using AMBA, PIF, OCP, or other industry standard protocols. It is also the interface visible to the designers of the IP blocks connected through the interconnect.

The transport layer protocol in NoC is managed by network interface units (NIUs). It creates one or more packets for each transaction. All packets have a header. Read data and write data packets include the data payload after the header. The packet header encodes addresses, transaction parameters, and sideband signals as fields. The NIU controls outstanding transactions and tagged sequences. The header format is minimal, and optimised differently for each NoC. The header is used at each pseudo-switch within the interconnect to route requests from initiators to targets and responses from targets to initiators. The request and response paths are independent, which eliminates logic and architectural dependencies, making deadlocks impossible.
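The packetisation step described above can be sketched in a few lines. This is only an illustration: the field names, field widths, and opcode values are hypothetical, since the header format is optimised differently for each NoC.

```python
# Hypothetical header layout: (field name, width in bits), most significant first.
# Real NoC headers are generated per design; these widths are illustrative only.
HEADER_FIELDS = (
    ("route_id", 6),
    ("opcode", 2),
    ("tag", 4),
    ("length", 4),
    ("address", 16),
)

OP_READ, OP_WRITE = 0, 1  # assumed opcode encoding


def pack_header(**values):
    """Concatenate the header fields into one integer, MSB-first."""
    word = 0
    for name, width in HEADER_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_header(word):
    """Recover the fields from a packed header (inverse of pack_header)."""
    fields = {}
    for name, width in reversed(HEADER_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


def packetize_write(route_id, tag, address, data_words):
    """Build a write request packet: header first, then the data payload."""
    header = pack_header(
        route_id=route_id,
        opcode=OP_WRITE,
        tag=tag,
        length=len(data_words),
        address=address,
    )
    return [header] + list(data_words)
```

In this sketch a switch would only need to read `unpack_header(packet[0])["route_id"]` to forward the packet, while the tag lets the NIU match a response to its outstanding request.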

Figure 7: Multiplexing of address/control signals with data between the transaction interface and the packet transport interface simplifies interconnect design.

The modular design enables transported packets to be transferred on the physical layer using a very simple protocol. The protocol consists of the following signals:
• Data [N bits] (driven by the sender)
• Valid [1 bit] (driven by the sender)
• Ready [1 bit] (driven by the receiver)

"Valid" and "Ready" implement flow control, which enable back-pressure feedback. This simple handshake protocol exists between all units of the NoC. Standardising on a simple interface allows units to be connected interchangeably, in the style of children's plastic interlocking building blocks.

Clock tree gating
With well-known chip design methodologies, it is possible to gate the clock at each flip-flop during cycles in which toggling is not required. This is applicable to the flops in all interconnect technologies; however, it does not address clock tree power consumption.

The clock tree is a single signal and therefore much narrower than data paths. However, to reach all physically distributed flops, the clock tree has a lot more metal than each data path bit. Since clocks, by definition, toggle twice per clock cycle, the clock tree typically consumes significantly more power than data paths.
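The reasoning above follows from the standard switching-power relation P = activity × C × Vdd² × f. The sketch below applies it; the capacitance, voltage, frequency, and activity values in the usage note are made-up placeholders, not measurements from any design.

```python
def dynamic_power(activity, capacitance_f, vdd, freq_hz):
    """Switching power in watts: P = activity * C * Vdd^2 * f.

    activity: average number of 0->1 transitions per clock cycle.
              A free-running clock net rises once every cycle (activity 1.0);
              a data wire changes at most once per cycle, so it can average
              at most 0.5.
    capacitance_f: switched capacitance in farads.
    """
    return activity * capacitance_f * vdd ** 2 * freq_hz
```

With purely illustrative inputs, a clock tree of 20 pF at 1 V and 1 GHz gives `dynamic_power(1.0, 20e-12, 1.0, 1e9)`, about 20 mW, whereas a 64-bit data path with 1 pF per bit and an activity of 0.1 gives `64 * dynamic_power(0.1, 1e-12, 1.0, 1e9)`, about 6.4 mW.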

In a crossbar, every clock net toggles even when and where data is not flowing. While it is theoretically possible to gate the clock to all crossbar logic in cycles when no data is transferred anywhere in the crossbar, it is impractical. It would require a large clock gating mux that combines multiple distant signals to generate enable signals, which must then be routed back to several distant flops.

By contrast, building the interconnect from atomic modules of combinatorial logic allows unit-level clock gating with much finer granularity than is possible within a monolithic crossbar.

Figure 8: Unit Level Clock Gating using combinatorial logic is possible by building the interconnect through a modular approach.

Registers within and between units only toggle when the valid handshake signal is asserted, indicating that data traffic is present. Gating logic is local to each unit, making paths shorter and minimising the muxing required to generate the enable signal. Clock gating is distributed, and each module of the modular interconnect is gated off for idle clock cycles, regardless of the state of the rest of the system. This brings switching power consumption close to the ideal minimum.
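The effect of using Valid as a local clock enable can be sketched by counting how many cycles each unit's registers are actually clocked. The unit names and traces in the usage note are hypothetical, and the model ignores the power of the gating cells themselves.

```python
def clocked_cycles(valid_trace, gated):
    """Count cycles in which a unit's registers receive a clock edge.

    valid_trace: per-cycle Valid values seen by the unit.
    gated: if True, the unit's local clock gate uses Valid as its enable.
    """
    if gated:
        return sum(1 for valid in valid_trace if valid)
    return len(valid_trace)


def interconnect_clock_activity(unit_traces, gated):
    """Total register clock events over all units for the traced window."""
    return sum(clocked_cycles(trace, gated) for trace in unit_traces.values())
```

For a four-cycle window with traces `{"niu0": [1, 0, 0, 1], "switch": [1, 0, 0, 0], "niu1": [0, 0, 0, 0]}`, the ungated total is 12 clock events and the gated total is 3, and the idle `niu1` is not clocked at all regardless of what the other units are doing.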

Other benefits of modularity
Aside from clock gating, other benefits include improved use of mixed threshold voltage (Vt) synthesis, reduced leakage power consumption, improved logic simplicity, and localisation.
