Memory-oriented optimisation techniques (Part 1)

Posted: 07 Aug 2013

Keywords: memory, embedded systems, optimisations, loop, compilers

Loop fusion (see example in figure 5) allows references to the same array in different loops to be combined and reused. The layout of array elements can also be modified to change the way that they map into the cache or a parallel memory system.

For example, transposing a matrix is an alternative to loop permutation. Arrays can also be padded to change how the data elements fall into cache lines. The memory wasted on padding may be more than made up for by improved cache performance.
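The effect of padding can be sketched with a small model. The cache geometry below (direct-mapped, 64 sets of 32-byte lines) and the matrix sizes are assumptions chosen for illustration, not parameters from the article. The helper counts how many distinct cache sets are touched when walking down one column of a row-major matrix of 4-byte elements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache geometry, assumed for illustration only. */
enum { LINE_BYTES = 32, NUM_SETS = 64 };

/* Set index that a byte offset maps to in a direct-mapped cache. */
static size_t cache_set(size_t byte_offset) {
    return (byte_offset / LINE_BYTES) % NUM_SETS;
}

/* Count the distinct cache sets touched when reading column 0 of a
 * row-major matrix with `rows` rows and `row_elems` int32_t per row. */
size_t sets_touched_by_column(size_t rows, size_t row_elems) {
    unsigned char seen[NUM_SETS] = {0};
    size_t distinct = 0;
    for (size_t r = 0; r < rows; r++) {
        size_t s = cache_set(r * row_elems * sizeof(int32_t));
        if (!seen[s]) {
            seen[s] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

Under these assumptions, a 64 x 512 matrix has a row stride of exactly 64 cache lines, so every element of a column lands in the same set and each access evicts the previous one. Padding each row to 520 elements (one extra line) spreads the same column over all 64 sets, at the cost of 8 unused elements per row.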

Given that the memory system is a major contributor to system power consumption, we would expect that loop transformations can either hurt or help the energy consumption of a program. Kandemir et al. [9] studied the effects of compiler transformations on energy consumption by simulating different versions of several benchmark programs with SimplePower.

Figure 5: An example of loop fusion.
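As a minimal C sketch of loop fusion (the arrays, sizes and function names here are hypothetical and are not taken from figure 5), two loops that read the same arrays can be merged into one so that each element is loaded once and reused:

```c
#include <stddef.h>

#define N 8

/* Before fusion: a[] and b[] are traversed twice, once per loop. */
void unfused(const int a[N], const int b[N], int x[N], int y[N]) {
    for (size_t i = 0; i < N; i++)
        x[i] = a[i] + b[i];
    for (size_t i = 0; i < N; i++)
        y[i] = a[i] * b[i];
}

/* After fusion: a[i] and b[i] are read once per iteration and reused
 * for both results while they are still in registers or cache. */
void fused(const int a[N], const int b[N], int x[N], int y[N]) {
    for (size_t i = 0; i < N; i++) {
        x[i] = a[i] + b[i];
        y[i] = a[i] * b[i];
    }
}
```

The sketch assumes that the output arrays do not overlap the inputs. Fusion is legal here because the second loop does not depend on values that the first loop produces in later iterations; a compiler has to establish that before merging the loops.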

Figure 6 summarises their results. They experimented with different types of transformations on different benchmarks and measured the energy consumption of unoptimised and optimised code, testing several cache configurations for each program implementation.

One interesting result of these experiments is that, with the exception of loop unrolling, most optimisations increase the amount of energy consumed in the CPU core. Given the Kandemir et al. technology parameters, the increase in core energy consumption was more than offset by the reduced energy consumption in the memory system, but a different technology could result in net energy losses for such transformations.

Any optimisation strategy must balance the energy consumption of the memory system and the core. These experiments also show that increasing the cache size and associativity does come at the cost of increased static and dynamic energy consumption in the cache.

Once again, losses were more than offset by gains in the rest of the memory system for these technology parameters, but a different technology could shift the balance.
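The balance described above can be written compactly. The symbols are introduced here for illustration and do not come from Kandemir et al.: the two delta terms are the changes in core energy and memory-system energy caused by a transformation.

```latex
E_{\text{total}} = E_{\text{core}} + E_{\text{mem}}, \qquad
\Delta E_{\text{total}} = \Delta E_{\text{core}} + \Delta E_{\text{mem}} < 0
\;\iff\; -\Delta E_{\text{mem}} > \Delta E_{\text{core}}
```

A transformation that raises core energy is worthwhile only when the memory-system saving exceeds that increase. The cache-size trade-off has the same form, with the added cache energy in the role of the cost and the savings in the rest of the memory system in the role of the gain.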

Global optimisations
Compilers apply a great many transformations. The order of transformations is important: some transformations enable other transformations, and some make it impossible to apply other transformations later. Optimisation strategies have been developed both for general-purpose compilers and for embedded systems.

The Real-Time Specification for Java (RTSJ) [2] is a standard version of Java that allows programmers to consider the temporal properties of their Java programs to determine whether they can meet deadlines. The designers of the standard identified three major features of Java that limit programmers' ability to determine a program's real-time properties: scheduling, memory management, and synchronisation.

Figure 6: Simulation measurements of the effects of compiler transformations on energy consumption.

Java did not provide detailed specifications about scheduling. RTSJ requires a fixed-priority preemptive scheduler with at least 28 unique priorities. A RealtimeThread class defines threads that can be controlled by this scheduler. Java uses garbage collection to simplify memory management for the general-purpose programmer, but at the price of reduced predictability of the memory system. To improve the predictability of memory management, RTSJ allows programs to allocate objects outside the heap.

A MemoryArea class allows programs to represent a memory area that is not garbage collected. It supports three types of objects: physical memory, which allows for the modelling of non-RAM memory components; immortal memory, which lives for the execution duration of the program; and scoped memory, which allows the program to manage memory objects using the syntactic scope of the object as an aid.
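The idea behind scoped memory can be illustrated with a region allocator. The sketch below is a C analogy, not the RTSJ API: the `Region` type, its 256-byte size, and the function names are all hypothetical. Objects are allocated in constant time from a fixed buffer and are all released together when the scope ends, so no garbage collector is involved.

```c
#include <stddef.h>

/* Hypothetical fixed-size region standing in for a scoped memory area. */
typedef struct {
    unsigned char buf[256];
    size_t used;
} Region;

/* Bump-allocate n bytes, rounded up to a multiple of 8.
 * Returns NULL when the region cannot hold the request. */
void *region_alloc(Region *r, size_t n) {
    size_t aligned = (n + 7u) & ~(size_t)7u;
    if (aligned < n || aligned > sizeof r->buf - r->used)
        return NULL;
    void *p = r->buf + r->used;
    r->used += aligned;
    return p;
}

/* Leaving the scope releases every object in the region at once. */
void region_exit(Region *r) {
    r->used = 0;
}
```

Both operations take a bounded, predictable amount of time, which is the property that garbage-collected heap allocation cannot guarantee.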

The RTSJ does not enforce priority-based synchronisation, but it does provide additional synchronisation mechanisms. The system queues all threads that are waiting for a resource so that they acquire the resource in priority order. Synchronisation must implement a priority inversion avoidance protocol. RTSJ also provides a facility to handle asynchronous events, such as a hardware interrupt.
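The priority-ordered queueing of waiting threads can be sketched as a small sorted queue. This is an illustrative C model, not RTSJ code: the `Waiter` and `WaitQueue` types, the capacity of 8, and the convention that a larger number means higher priority are all assumptions.

```c
#include <stddef.h>

#define MAX_WAITERS 8

typedef struct {
    int thread_id;
    int priority;   /* assumption: larger value means higher priority */
} Waiter;

typedef struct {
    Waiter items[MAX_WAITERS];
    size_t count;
} WaitQueue;

/* Insert a waiter, keeping the queue in descending priority order.
 * Waiters of equal priority stay in arrival (FIFO) order.
 * Returns 0 on success, -1 if the queue is full. */
int wait_enqueue(WaitQueue *q, int thread_id, int priority) {
    if (q->count == MAX_WAITERS)
        return -1;
    size_t pos = q->count;
    while (pos > 0 && q->items[pos - 1].priority < priority) {
        q->items[pos] = q->items[pos - 1];
        pos--;
    }
    q->items[pos].thread_id = thread_id;
    q->items[pos].priority = priority;
    q->count++;
    return 0;
}

/* Remove the highest-priority waiter and return its thread id,
 * or -1 if nobody is waiting. */
int wait_dequeue(WaitQueue *q) {
    if (q->count == 0)
        return -1;
    int id = q->items[0].thread_id;
    for (size_t i = 1; i < q->count; i++)
        q->items[i - 1] = q->items[i];
    q->count--;
    return id;
}
```

When the resource is released, the thread returned by `wait_dequeue` is the one that acquires it next. The model covers only the ordering of waiters; it does not implement a priority inversion avoidance protocol.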
