EE Times-Asia

Examining performance in hardware emulators (Part 2)

Posted: 21 Sep 2015

Keywords: hardware emulation, emulators, arithmetic logic units, ALUs, SoC

In my previous article, I discussed the dependency of performance on the type of deployment. In this article, I will examine the relationship between emulation system architectures and their performance.

Emulation performance vs emulator architecture
Not all emulators are created equal. Previously, I pointed out that the design capacity and the compilation process are dependent on the architecture of the emulation system. This is also the case for emulation performance.

Let's remind ourselves that each of the three current hardware emulation suppliers is promoting its own architecture:
- Cadence: Processor-based architecture
- Mentor: Custom emulator-on-chip architecture
- Synopsys: Commercial FPGA-based architecture

While the first two solutions are based on custom chips, the third is built on arrays of commercial FPGAs.

Speed of execution in processor-based emulators
The operation of a processor-based emulator was addressed in an earlier article, "Understanding design compilation in hardware emulators." Here, I wish to quickly recall the principles of operation, exemplified and simplified in the figure and table.

Figure: The logic functions in the DUT are evaluated in a time order controlled by a sequencer.

Table: An emulator's processors are tasked with evaluating functions in the DUT in a time order controlled by a sequencer.

The model of the design under test (DUT) is converted into a data structure stored in memory, and processed by a computing engine consisting of a vast array of Boolean processors; hence the name of this type of emulator. This vast array is typically made up of relatively simple 4-input arithmetic logic units (ALUs), reaching into the multi-millions of elements in fully expanded configurations. All operations are scheduled in time steps and assigned to the processors according to a set of rules that preserve the functional integrity of the DUT.
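To make the principle concrete, here is a minimal Python sketch of a DUT held as a data structure and evaluated step by step. It is an illustration only, not any vendor's implementation: the names `lut4`, `run_cycle`, and `full_adder`, the 16-bit truth-table encoding, and the toy full-adder schedule are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of one scheduled operation:
# (output name, 16-bit truth table, four input names)
Op = Tuple[str, int, Tuple[str, str, str, str]]

XOR2 = 0x6666  # a XOR b, inputs c and d ignored
AND2 = 0x8888  # a AND b, inputs c and d ignored
OR2 = 0xEEEE   # a OR b, inputs c and d ignored


def lut4(truth_table: int, a: int, b: int, c: int, d: int) -> int:
    """Evaluate an arbitrary 4-input Boolean function stored as a 16-bit table."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (truth_table >> index) & 1


def run_cycle(schedule: List[List[Op]], inputs: Dict[str, int]) -> Dict[str, int]:
    """Run one emulation cycle: execute every time step in order."""
    values = dict(inputs)
    values["0"] = 0  # constant used to pad unused processor inputs
    for step in schedule:
        # Every operation in a step reads only values produced in earlier steps,
        # so all operations of one step could run on separate processors at once.
        results = {out: lut4(tt, *(values[name] for name in ins))
                   for out, tt, ins in step}
        values.update(results)
    return values


# Toy DUT: a 1-bit full adder scheduled over three time steps.
FULL_ADDER_SCHEDULE: List[List[Op]] = [
    [("p", XOR2, ("a", "b", "0", "0")), ("g", AND2, ("a", "b", "0", "0"))],
    [("sum", XOR2, ("p", "cin", "0", "0")), ("t", AND2, ("p", "cin", "0", "0"))],
    [("cout", OR2, ("g", "t", "0", "0"))],
]


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """Return (sum, carry-out) by running one emulation cycle of the toy DUT."""
    values = run_cycle(FULL_ADDER_SCHEDULE, {"a": a, "b": b, "cin": cin})
    return values["sum"], values["cout"]
```

In this toy schedule, two processors are busy in the first two steps and one in the third; the ordering of the steps is what preserves the functional integrity of the adder.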

An emulation cycle consists of running all the processor steps for a complete execution of the design. Large designs typically require hundreds of steps. The more operations carried out in each step, the faster the emulation speed.
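The relationship between step count and speed can be put into a back-of-the-envelope formula. The sketch below assumes a simplified model in which every time step takes the same fixed time and there is no other overhead; the function name and the step rate used in the example are hypothetical and are not taken from any datasheet.

```python
def emulation_frequency_hz(step_rate_hz: float, steps_per_cycle: int) -> float:
    """Simplified model: one emulation cycle takes steps_per_cycle time steps."""
    if steps_per_cycle <= 0:
        raise ValueError("steps_per_cycle must be positive")
    return step_rate_hz / steps_per_cycle


# Hypothetical step rate of 100 MHz, chosen only to make the arithmetic easy.
HYPOTHETICAL_STEP_RATE_HZ = 100e6

for steps in (100, 200, 400):
    freq = emulation_frequency_hz(HYPOTHETICAL_STEP_RATE_HZ, steps)
    print(f"{steps} steps per cycle -> {freq / 1e6:.2f} MHz")
```

Under this model, halving the number of steps needed per cycle doubles the emulation speed, which is why packing more operations into each step matters.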

During every time step, each processor is capable of performing any function using, as inputs, the results of any prior calculation of any of the processors and/or any design input and/or any memory contents. The more processors available in the emulator, the more parallelism can be achieved for faster emulation time.
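The effect of processor count on step count can be illustrated with a simple greedy scheduler. This is a sketch under stated assumptions, not the algorithm used by any emulator compiler: `schedule` and `build_tree` are hypothetical helpers, the DUT is a toy reduction tree of 15 two-input operations, and the model ignores real-world constraints such as routing results between processors.

```python
from typing import Dict, List


def schedule(deps: Dict[str, List[str]], num_processors: int) -> List[List[str]]:
    """Greedy list scheduling: pack up to num_processors ready operations per step.

    deps maps each operation to the names it reads. Names that are not keys
    of deps are treated as primary design inputs, which are always available.
    """
    if num_processors < 1:
        raise ValueError("need at least one processor")
    done = set()
    remaining = dict(deps)
    steps: List[List[str]] = []
    while remaining:
        ready = [op for op, srcs in remaining.items()
                 if all(s in done or s not in deps for s in srcs)]
        if not ready:
            raise ValueError("combinational loop: no operation is ready")
        step = ready[:num_processors]
        steps.append(step)
        # Results become visible only after the step completes.
        done.update(step)
        for op in step:
            del remaining[op]
    return steps


def build_tree(leaves: int = 8) -> Dict[str, List[str]]:
    """Toy DUT: a balanced reduction tree of two-input operations."""
    deps: Dict[str, List[str]] = {}
    prev = [f"in_{i}" for i in range(2 * leaves)]
    level = 1
    while len(prev) > 1:
        cur = []
        for i in range(0, len(prev), 2):
            name = f"n{level}_{i // 2}"
            deps[name] = [prev[i], prev[i + 1]]
            cur.append(name)
        prev = cur
        level += 1
    return deps


tree = build_tree()
for processors in (1, 2, 8, 16):
    print(processors, "processors ->", len(schedule(tree, processors)), "steps")
```

With a single processor the 15 operations take 15 steps; with eight processors they take 4 steps, which is the depth of the tree. Adding a further eight processors does not help, because the longest dependency chain then sets the limit rather than the processor count.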

Currently, the maximum speed reported in the vendor datasheet for such an implementation hovers around 2MHz. On real system-on-chip (SoC) designs, several users have claimed to reach 1MHz. This would be the case with in-circuit emulation (ICE), embedded testbench, and embedded software acceleration modes. In transaction-based acceleration, however, Cadence's Palladium-XP2 is rumored to perform at lower speeds.
