EE Times-Asia

From the ground up: How ARM built Cortex M7

Posted: 07 Oct 2014

Keywords: Cortex M7, MCU, ARM

Editor's Note: ARM took the spotlight with the launch of Cortex M7, promising to supercharge the MCU market with 5 CoreMark/MHz. Earlier, ARM partners put their heads together to come up with use cases for the recently announced Cortex M7 (see M7 punch: ARM partners brainstorm use cases). EE Times Europe's Nick Flaherty talks to ARM executives for a look at how ARM built a completely new microarchitecture for MCUs from scratch.

It's not too often there's a completely new microarchitecture for an embedded controller, so the launch of the Cortex M7 is particularly interesting. This has been designed from the ground up, rather than extending previous architectures from the M3 to the M4, for the Internet of Things and other applications.

"It's a clean sheet and so it doesn't look like anything we have built before," said Richard York, VP of embedded CPU marketing at ARM. "There is an insatiable access in the connected world and more interactivity, and customers and industry are demanding better user interfaces. That means the performance needs in embedded micros are going up and up. We looked at what the Cortex M4 was doing and the feedback was to double the performance."

Cortex-M7 pipeline. Source: ARM

The execution pipeline has been lengthened to six stages and is dual-issue, with a separate MAC pipeline and an optional double-precision execution unit. But interrupt latency has been tightened up so that it stays at 12 cycles, the same as the M4. "It's a longer pipeline but there's careful engineering to make sure we don't make interrupt performance worse."

The microarchitecture also adds instruction and data caches, which is unusual for a microcontroller, but the higher performance allows the microarchitecture to meet the latency requirements within the frequency limits of the on-chip flash memory. "What we did was look at the other processors to re-engineer them, but we need to match the short pipeline to the flash memory as we don't go mad on frequency."

As ever, it's the infrastructure around the execution unit and the core that makes the main difference to overall performance. Around the 32-bit execution unit, the AXI bus has been widened to 64 bits so that two independent loads can be executed at the same time. All of this allows developers to run Java or Matlab directly.
