EE Times-Asia

Developing heterogeneous multi-core SoC for mobiles

Posted: 06 Jun 2014

Keywords: Mobile SoCs, CPU, DSPs, GPUs, processing cores

Edge detection, a more computationally taxing workload, running on a quad-core system with a PowerVR G6200 GPU, shows the GPU maintaining frame rate at half the power consumption of the CPUs, and boosting frame rate by 3x within the power budget envelope set by the device power management software. A final example, a software implementation of a VP9 video decoder, again pairing a quad-core CPU with a PowerVR G6200 GPU, shows the heterogeneous solution maintaining the frame rate of heavily optimised CPU code at significantly, although not dramatically, lower power. The major benefit in this final case is that when the decoder is run from within the browser-based app, user interface responsiveness improves significantly because more CPU cycles are available, at finer granularity.
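One way to put the edge detection results on a common footing is to convert power and frame rate into energy per frame. The sketch below is illustrative arithmetic only: the baseline values are normalised placeholders rather than measurements from the trials, the function name `energy_per_frame` is an assumption of this example, and the second case assumes the GPU draws the same power as the CPU baseline, which the article does not state.

```python
def energy_per_frame(power_watts: float, frames_per_second: float) -> float:
    """Energy spent on each frame, in joules (power divided by frame rate)."""
    if frames_per_second <= 0:
        raise ValueError("frame rate must be positive")
    return power_watts / frames_per_second


# Normalised placeholder baseline for the CPU-only run (not measured data).
cpu = energy_per_frame(power_watts=1.0, frames_per_second=1.0)

# Case 1 from the text: same frame rate at half the power.
gpu_same_rate = energy_per_frame(power_watts=0.5, frames_per_second=1.0)

# Case 2 from the text: 3x the frame rate; equal power is an assumption here.
gpu_same_power = energy_per_frame(power_watts=1.0, frames_per_second=3.0)

print("energy per frame vs CPU, same frame rate:", gpu_same_rate / cpu)
print("energy per frame vs CPU, same power:", gpu_same_power / cpu)
```

Under these assumptions, holding the frame rate constant halves the energy per frame, while tripling the frame rate at equal power cuts it to a third.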

Thus the efficiency improvement available from moving to heterogeneous compute depends on the type of app, the relative floating-point performance of the GPU versus the CPU, and one other significant factor: the overhead associated with partitioning the workload between the compute units. In the image filtering app, removing an image copy (originally imposed by API requirements and not needed in the CPU-only version) resulted in an approximately 25% improvement in performance together with a reduction in power consumption.
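The 25% figure also gives a rough sense of how much of each frame the copy was costing. The sketch below assumes that "performance" means throughput (frames per second), that the copy ran serially with the rest of the frame, and that nothing else changed when it was removed; none of these is confirmed by the article, and the function name is hypothetical.

```python
def copy_share_of_frame_time(speedup: float) -> float:
    """Fraction of the original per-frame time taken by a step whose removal
    yields the given throughput speedup (1.25 means a 25% improvement)."""
    if speedup < 1.0:
        raise ValueError("speedup must be at least 1.0")
    # Original time is T, new time is T / speedup,
    # so the removed step took T * (1 - 1 / speedup).
    return 1.0 - 1.0 / speedup


share = copy_share_of_frame_time(1.25)
print(f"copy share of original frame time: {share:.0%}")
```

Under those assumptions, a 25% throughput gain implies the copy accounted for about a fifth of the original frame time.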

The VP9 decoder case illustrates a different overhead: that of setting up, synchronising, and dispatching the GPU workload. Only a portion of the workload is delegated to the GPU, which divides a naturally homogeneous task in two and lets the overhead dominate performance. Occupancy analysis shows that the GPU is capable of taking the whole task while meeting performance deadlines, leading to the conclusion that greater improvement results from handing over the largest possible workload to the GPU.

As a result of these trials, we can make some basic recommendations for heterogeneous compute:

The type and degree of efficiency improvement is heavily dependent on the type of workload. Careful selection of appropriate workloads is necessary; not all apps can achieve all three benefits simultaneously.

Performance is heavily system-specific and depends especially on the relative capabilities of the CPU and GPU. These vary widely, so specific system knowledge is required to achieve consistent performance. Also, without a zero-copy mechanism, data movement through the system can dominate performance.

Optimal workload partitioning is critically important; in general it is best to divide workloads into the largest possible chunks in order to minimise overhead.
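The partitioning recommendation can be illustrated with a simple cost model in which every dispatch to the GPU pays a fixed setup and synchronisation cost. This is a minimal sketch, not a model taken from the trials: the function names, the 8 ms of work, and the 0.5 ms per-dispatch overhead are all hypothetical values chosen for illustration.

```python
def total_time_ms(work_ms: float, chunks: int, overhead_per_dispatch_ms: float) -> float:
    """Serial time for a fixed amount of GPU work split into `chunks` dispatches,
    each paying a fixed setup/synchronise/dispatch cost."""
    if chunks < 1:
        raise ValueError("need at least one chunk")
    return work_ms + chunks * overhead_per_dispatch_ms


def overhead_share(work_ms: float, chunks: int, overhead_per_dispatch_ms: float) -> float:
    """Fraction of the total time spent on dispatch overhead rather than work."""
    total = total_time_ms(work_ms, chunks, overhead_per_dispatch_ms)
    return chunks * overhead_per_dispatch_ms / total


# Hypothetical figures: 8 ms of useful work, 0.5 ms overhead per dispatch.
for chunks in (1, 4, 16, 64):
    share = overhead_share(8.0, chunks, 0.5)
    print(f"{chunks:3d} chunks: overhead is {share:.0%} of total time")
```

In this model the useful work is constant, so the overhead share grows with every additional dispatch; with these placeholder numbers, overhead equals the useful work at 16 chunks.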

About the author
Peter McGuinness is a director of multimedia technology marketing at Imagination Technologies. He has an extensive background in the architecture and design of integrated circuits and systems for graphics and video, where he holds a number of patents and patent applications. He began his career as a silicon chip designer in 1980 at Plessey Research in England, leaving in 1983 for start-up Inmos, Ltd. (later part of STMicroelectronics).
