Memory access ordering in complex embedded designs

Posted: 22 Dec 2014

Keywords: embedded systems, processor, Sequential Execution Model, SEM, compilers

What if the STR was accessing a peripheral register and that access needed to be complete before the LDR is executed? In that case, the program would not function correctly. We need to fix this by inserting a memory barrier between the two instructions. A barrier is an instruction which tells the processor that it needs to ensure, with some degree of strictness, that outstanding memory accesses are complete before it continues. So, we would need to write this:

add r0, r0, #4
mul r2, r2, r3
str r2, [r0]
dmb
ldr r4, [r1]
sub r1, r4, r2
bx lr

The DMB (data memory barrier) instructs the processor to ensure that no further memory accesses take place until all prior accesses have completed. This includes actions such as draining write buffers.

ARM memory barrier instructions
Particular memory barrier instructions implemented in current ARM systems:

DMB: The data memory barrier ensures that no more memory accesses occur until all outstanding accesses have completed. It does not stop the processor from continuing to execute instructions, as long as they don't cause memory accesses.

DSB: The data synchronization barrier causes the processor to stall, without executing any further instructions, until all outstanding memory accesses have completed.

ISB: The instruction synchronization barrier causes the instruction pipeline and any prefetch buffers/queues to be flushed and instructions to be refetched.
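In C code, these barriers are usually reached through small wrappers. The following is a minimal sketch, assuming a GCC or Clang toolchain with inline assembly; the wrapper names, the `BARRIER` macro and `write_then_read` are hypothetical and not taken from any library. When the target is not ARMv7 or later, the macro falls back to a compiler-only barrier so that the sketch still builds on a development host.

```c
#include <stdint.h>

/* Emit the real instruction only on ARMv7-or-later (or AArch64) targets.
   Elsewhere, fall back to a compiler-only barrier (assumption: GCC/Clang). */
#if (defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)) || defined(__aarch64__)
#define BARRIER(insn) __asm__ volatile(insn ::: "memory")
#else
#define BARRIER(insn) __asm__ volatile("" ::: "memory")
#endif

/* Hypothetical wrapper names, one per barrier instruction. */
static inline void data_memory_barrier(void)      { BARRIER("dmb sy"); }
static inline void data_sync_barrier(void)        { BARRIER("dsb sy"); }
static inline void instruction_sync_barrier(void) { BARRIER("isb sy"); }

/* C analogue of the STR / DMB / LDR sequence: the store to the control
   register is ordered before the load from the status register. */
uint32_t write_then_read(volatile uint32_t *ctrl,
                         volatile uint32_t *status,
                         uint32_t value)
{
    *ctrl = value;            /* STR */
    data_memory_barrier();    /* DMB */
    return *status;           /* LDR */
}
```

On real hardware, `ctrl` and `status` would point at register addresses from the device's documentation. The `volatile` qualifier only stops the compiler from removing or reordering the accesses; it is the barrier that constrains the processor and memory system.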

Actually, some of this can be fixed more easily on ARM systems by making use of the architectural memory types. From ARMv6 onwards, the architecture supports a "weakly-ordered memory model". This means that the processor and memory system are free to use all kinds of tricks to hide memory latency. These include:
- Speculative accesses: Speculative accesses are used heavily by the instruction fetcher to fetch ahead of the current execution point and to speculatively fetch multiple possible instruction sequences following a conditional branch. They can also be generated by the data side memory interface to speculatively load data into the cache based on observations of repeating access patterns at runtime. Note that cache line fills are also speculative memory accesses, in the sense that the loaded data may never be used.
- Merging memory accesses: Many write buffers automatically merge multiple accesses to consecutive or overlapping addresses into single or burst transactions.
- Re-ordering memory accesses: Where there are no data dependencies among a group of transactions, the memory system is free to carry out these accesses in the most efficient order.
- Repeating memory accesses: In some circumstances, the system will repeat accesses. In ARM systems, this can occur if an LDM or STM instruction is interrupted (in low latency interrupt mode). When this happens on an ARMv7-A/R processor, the access is restarted on return from the exception handler, which means that some of the accesses may be repeated.
- Changing memory access size and number: If it is more efficient for the memory system to carry out an access of a different size than that specified by the program and then to extract the necessary portion of the loaded value before returning it to the processor, then it is free to do this. It is also free to split large accesses into multiple smaller ones, if this is more efficient for the memory system.
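To make the merging case concrete, here is a toy software model of a one-entry write buffer. It is only an illustration under stated assumptions: the `write_buffer` structure, `wb_write`, `wb_drain` and the `allow_merge` flag are hypothetical and do not describe any real ARM implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BUS_WRITES 8

/* Toy model: one pending write slot in front of a "bus" that logs
   every write that actually arrives. */
struct write_buffer {
    int       pending;                  /* 1 if a write is waiting */
    uintptr_t addr;                     /* address of the waiting write */
    uint32_t  data;                     /* data of the waiting write */
    size_t    bus_count;                /* writes that reached the bus */
    uint32_t  bus_data[MAX_BUS_WRITES]; /* values seen on the bus, in order */
};

/* Push the pending write, if any, out to the bus. */
static void wb_drain(struct write_buffer *wb)
{
    if (wb->pending) {
        assert(wb->bus_count < MAX_BUS_WRITES);
        wb->bus_data[wb->bus_count++] = wb->data;
        wb->pending = 0;
    }
}

/* Queue a write. If merging is allowed and the address matches the pending
   write, the newer data replaces the older data, which never reaches the bus. */
static void wb_write(struct write_buffer *wb, uintptr_t addr,
                     uint32_t data, int allow_merge)
{
    if (allow_merge && wb->pending && wb->addr == addr) {
        wb->data = data;
        return;
    }
    wb_drain(wb);
    wb->pending = 1;
    wb->addr = addr;
    wb->data = data;
}
```

With merging allowed, two back-to-back writes to the same address reach the bus as a single write carrying the second value. That is harmless for ordinary data, but a peripheral that expects to see both writes would miss the first one.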

Clearly, systems are only allowed to do this when the effects are not observable to the executing program. And this is generally only true when memory accesses have no side effects. For most memory accesses, this is true. It is not true for accesses to memory-mapped peripherals. To make this distinction, the ARM architecture defines different types of memory.
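A read with a side effect can be sketched with a toy receive FIFO whose data register hands out the oldest byte on every read. This is a hypothetical model for illustration only; `rx_fifo` and `fifo_read_data_register` are not part of any real peripheral or driver.

```c
#include <stdint.h>

/* Toy receive FIFO: reading the data register removes the oldest byte. */
struct rx_fifo {
    uint8_t  bytes[4];
    unsigned head;    /* index of the next byte to hand out */
    unsigned count;   /* bytes still queued */
};

/* Model of a load from the FIFO data register. Every call changes the
   peripheral's state, so an extra read silently discards a byte. */
static uint8_t fifo_read_data_register(struct rx_fifo *f)
{
    if (f->count == 0) {
        return 0;     /* empty: this model simply returns 0 */
    }
    f->count--;
    return f->bytes[f->head++];
}
```

If the memory system repeated or speculatively issued a read of such a register, the program would receive the second byte in place of the first. The same extra read from ordinary RAM would return the same value and go unnoticed.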

ARM memory types
Memory types as defined in ARMv6 and ARMv7 architectures:

Normal memory is "weakly-ordered". Instruction memory is always Normal (the architecture actually requires this) and it is also used for the vast majority of program data. Normal memory regions may be cached and may use a write buffer.

Device memory obeys a much more strictly ordered memory model. In particular, memory access size, number, and order must be preserved, and accesses may not be repeated. Speculative accesses are not permitted. Device memory is generally used for memory-mapped peripherals and any other addresses where accesses have side effects. It may not be cached but, since write buffers are permitted, a Device access may "complete" before it reaches the addressed device.

Strongly-ordered memory is even stricter than Device memory and is used only to support legacy systems where memory ordering is a particular problem. Write buffers are not permitted, and a Strongly-ordered access "completes" when it reaches the addressed device.
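How a region is given one of these types depends on the processor. As one hedged illustration, an ARMv7-A core with an MMU using short-descriptor section entries and TEX remap disabled selects the type through the TEX, C and B bits. The sketch below assumes exactly that configuration; the macro, enum and function names are hypothetical, and MPU-based cores use different registers.

```c
#include <stdint.h>

/* Attribute bit positions in an ARMv7-A short-descriptor section entry
   (assumption: TEX remap disabled). */
#define SECTION_B       (1u << 2)
#define SECTION_C       (1u << 3)
#define SECTION_TEX(x)  (((uint32_t)(x) & 0x7u) << 12)

/* Hypothetical names for three common choices. */
enum mem_type {
    MEM_STRONGLY_ORDERED,   /* TEX=000, C=0, B=0 */
    MEM_DEVICE_SHAREABLE,   /* TEX=000, C=0, B=1 */
    MEM_NORMAL_WBWA         /* TEX=001, C=1, B=1: write-back, write-allocate */
};

/* Return only the memory-type bits; address, domain and access
   permission fields are left out of this sketch. */
static uint32_t section_attr_bits(enum mem_type t)
{
    switch (t) {
    case MEM_STRONGLY_ORDERED:
        return SECTION_TEX(0);
    case MEM_DEVICE_SHAREABLE:
        return SECTION_TEX(0) | SECTION_B;
    case MEM_NORMAL_WBWA:
        return SECTION_TEX(1) | SECTION_C | SECTION_B;
    }
    return 0;
}
```

A page-table builder would OR these bits into the entry for a peripheral region or a RAM region as appropriate.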
