EE Times-Asia > Embedded
Embedded software evaluation tries hybrid approach

Posted: 24 Mar 2008

Keywords: researchers, IITK, embedded, software

Although simulation-based performance evaluation of software on a customized processor is accurate, it can be very slow. Researchers at the Indian Institute of Technology, Kharagpur (IITK) instead propose a new hybrid approach to software performance evaluation.

"Evaluation of software performance on a given customized processor is an important step in the design space exploration of embedded system architectures. Such evaluations help system designers in taking early design decisions regarding the hardware architecture most suitable for the target application, but simulation-based performance evaluations, although very accurate, can be prohibitively slow," wrote researchers Soumyajit Dey, Monu Kedia and Anupam Basu.

Their approach consists of a one-time initial simulation run followed by analysis of intermediate-representation (IR) application code by an evaluation engine. With this method, they said, the evaluation engine could estimate the execution cycles of applications, or of application tasks, on a given customized embedded processor with more than 95 percent accuracy, and much more quickly than current methods can.

"Instruction set customized processors are evolving as a viable solution addressing the needs of flexibility and performance in the domain of embedded systems. These customized cores try to deliver the performance close to ASICs while at the same time retaining the flexibility of a General Purpose Processor (GPP)," the researchers said.

Improved performance
It has been shown that by extending a base processor core with intelligently selected instructions, the performance of the processor for an application or application domain can be greatly improved using design tools from vendors such as CoWare, Tensilica and ARC. "But," they said, "keeping in view the shrinking time-to-market window in today's embedded system design and [the] existence of several customized processor cores in the market from silicon vendors, a system designer is often tempted to adopt an off-the-shelf processor core, rather than designing and synthesizing one. However, doing a performance evaluation of the available off-the-shelf processor cores for the target application by cycle-accurate simulations of the application for each of the available processors is tedious and time consuming."

The IIT team's method is a hybrid one made up of an initial simulation run followed by analysis of IR-level application code using an evaluation engine to predict the execution time statistics on any given instruction-set-customized processor. They studied the behavior of their evaluation engine both in terms of the accuracy of the predicted execution time and of how fast it is in comparison to the simulation-based estimations, implementing the methodology in the Tensilica (Xtensa) design platform.
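The core idea of such a hybrid scheme can be illustrated with a toy sketch: one simulation run on the base core yields dynamic operation counts, which an evaluation engine then rescales using each candidate core's instruction latencies. All names, counts and latencies below are illustrative assumptions, not taken from the IITK tool or the Tensilica platform.

```python
# Hypothetical sketch of hybrid performance estimation:
# one base-processor simulation gives per-operation execution counts;
# the evaluation engine rescales them with each candidate core's latencies.

BASE_SIM_COUNTS = {      # op -> dynamic execution count (from the one sim run)
    "add": 120_000,
    "mul": 30_000,
    "load": 45_000,
}

def estimate_cycles(counts, latencies):
    """Predict total cycles on a core from op counts and per-op latencies."""
    return sum(counts[op] * latencies.get(op, 1) for op in counts)

# Two candidate customized cores, differing only in instruction latencies.
base_core = {"add": 1, "mul": 4, "load": 2}
custom_core = {"add": 1, "mul": 2, "load": 2}  # e.g. a faster multiplier

print(estimate_cycles(BASE_SIM_COUNTS, base_core))    # 330000 cycles
print(estimate_cycles(BASE_SIM_COUNTS, custom_core))  # 270000 cycles
```

The point of the sketch is that only one (slow) simulation is ever run; evaluating each additional candidate core is a cheap analytical pass over the recorded statistics.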

"An obvious application of the proposed approach becomes the estimation of task-level execution times which are the inputs to any Design Space Exploration algorithms performing the application-to-architecture mapping in heterogeneous MPSoC architectures. In a multiprocessor platform, identifying the most suitable processor for an application task is a non-trivial task and needs the performance evaluation of each application task on each of the PEs in the platform.

"Along with the prediction of execution cycles, our evaluation engine is also capable of automatically augmenting the application tasks with the Custom Instructions (CI) available in the processor hardware and generating scheduled code. This can be seen as an important step in the system design flow using the Tensilica platform, where the CIs have to be manually embedded in the application code for porting into an architecture that consists of different extensions of the base Xtensa core," they said.

The team claimed that their results showed the evaluation engine to be at least an order of magnitude faster than simulation-based evaluation techniques; moreover, the predicted execution times proved highly accurate in all test cases.

Alternative methods
Other approaches for evaluating processors for applications take a simulation/execution-based approach in which the application task is compiled for the target processor and executed either on an evaluation platform or simulated via a cycle-accurate simulator. In the alternative analytical approach, the application is neither simulated nor executed but is instead analyzed in light of the processor's instruction set architecture.

The IIT team assumed many customized embedded processors, differing in their instruction set architectures, as the architectural design alternatives. "We identify a base processor configuration which consists of instructions for basic arithmetic, logical, comparison and jump operations. Our goal is to evaluate the performance of the given application or application tasks on these different customized processors," they said.

As the first step, execution time statistics for the application, or for each of the application tasks, are obtained through cycle-accurate simulation using the simulator for the base processor core. This is the only simulation run in the proposed approach, and the execution time statistics it produces form the input to the second step, in which an evaluation engine predicts the execution time statistics for each of the customized embedded processors.

The inputs to the evaluation engine are the performance statistics obtained from the simulation on the identified base processor and a parameter file, which captures the architectural differences, in terms of the instruction set, between the customized processor core and the identified base processor.

This file defines all the extra instructions contained in a customized processor. An instruction definition consists of the number of input/output parameters, latency information and dependency relations. Essentially, the evaluation engine tries to estimate how these architectural differences will affect the execution time of the application or application tasks.
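To make the role of such a parameter file concrete, the sketch below models each custom instruction by its operand counts, latency and the base-processor operations it fuses, and estimates the cycles saved when the engine substitutes it. The field names and numbers are assumptions for illustration, not the actual IITK file format.

```python
# Illustrative model of a "parameter file" describing custom instructions (CIs).
PARAMETER_FILE = {
    "mac": {                         # a multiply-accumulate custom instruction
        "inputs": 3, "outputs": 1,
        "latency": 2,                # cycles on the customized core
        "replaces": ["mul", "add"],  # base ops it fuses
    },
}

def cycle_savings(base_latencies, param_file, ci_uses):
    """Estimate cycles saved when each CI replaces its sequence of base ops."""
    saved = 0
    for ci, uses in ci_uses.items():
        spec = param_file[ci]
        base_cost = sum(base_latencies[op] for op in spec["replaces"])
        saved += uses * (base_cost - spec["latency"])
    return saved

base_lat = {"add": 1, "mul": 4}
# Suppose the base-core profile showed 30,000 fusable mul+add pairs:
print(cycle_savings(base_lat, PARAMETER_FILE, {"mac": 30_000}))  # 90000 cycles saved
```

In this toy model the engine never re-simulates; it adjusts the base-core statistics using only the latency and dependency information the parameter file supplies.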

Researchers say
"We have presented a novel hybrid approach consisting of one time simulation of the application on an identified base processor architecture and then prediction of execution time on a customized processor by our evaluation engine. Along with the prediction of execution time, the application tasks are automatically augmented with CIs optimally for execution in the heterogeneous processors. Our approach also finds use in the application to architecture mapping during MPSoC design. Our work turns around the classic instruction selection problem for code generation. However, the presented framework is significant in the context of performance estimation for a given processor architecture and corresponding code-generation.

"The evaluation engine in the proposed method can only predict the execution time for a custom processor, and refinements for improving the prediction accuracy are a natural extension of the present work. Interesting future work could augment the evaluation engine with cache performance estimation due to CIs and [the] effect of other micro-architectural parameters," they concluded.

- K.C. Krishnadas
EE Times
