
The MCU guy's guide to FPGAs: The hardware

Posted: 07 May 2015

Keywords: microcontroller, MCU, FPGA, multiplexers, digital signal processing

Actually, this really is a surprisingly easy concept to wrap one's brain around (it's a tad harder to implement, of course). The thing is that general-purpose microprocessors and microcontrollers are really horribly inefficient; the only reason they appear to be so powerful is that we can ramp up the frequency of the system clock to make them perform more operations per second. Dynamic power consumption is, of course, a function of clock frequency, so doubling the frequency roughly doubles the dynamic power consumption.

And even if we do increase the clock frequency, this still leaves the processor "thrashing around" when it comes to performing large amounts of data processing and digital signal processing (DSP) functions. As a simple example, suppose we have three 10 x 10 matrices, called a, b, and y, where each element in these matrices is a 32-bit integer. Suppose that we wish to add the contents of matrix a to the contents of matrix b and store the results in matrix y. If we were to do this on a processor, the pseudo code might look something like the following:

Pseudo code for a 10 x 10 element matrix addition.
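
The original figure has not survived on this page, so here is a minimal sketch in C of what such a matrix addition might look like. The function name matrix_add and the constant N are assumptions chosen for illustration, not the author's original pseudo code.

```c
#include <stdint.h>

#define N 10  /* matrix dimension from the 10 x 10 example */

/* Element-by-element addition: y = a + b.
 * Each pass through the inner loop costs the CPU two loads,
 * one add, and one store, plus the loop overhead itself.
 * Overflow of the 32-bit sum is not handled in this sketch. */
void matrix_add(int32_t a[N][N], int32_t b[N][N], int32_t y[N][N])
{
    for (int row = 0; row < N; row++) {
        for (int col = 0; col < N; col++) {
            y[row][col] = a[row][col] + b[row][col];
        }
    }
}
```

The two nested loops visit all 100 elements one after another, which is exactly the "again... and again" behaviour discussed next.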

Let's reflect on how the processor handles this. We start with a read instruction that loads the value of the first element from matrix a into the CPU. Next we read the corresponding value from matrix b and add it to the value currently stored in the CPU. Then we store the result from our calculation to the appropriate element in matrix y somewhere in the system's memory. And now we have to do the whole thing again... and again... and again... for each of the matrix elements.

By comparison, we could create a dedicated hardware accelerator using the FPGA's programmable fabric. This hardware accelerator could comprise one hundred 32-bit adders, which means that the entire matrix addition could be performed in a single clock cycle. In turn, this means that the clock controlling this hardware accelerator could be running at a much lower speed than the CPU clock, thereby consuming significantly less power.
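
To put rough numbers on this comparison, the following C sketch models the cycle counts. It assumes four CPU clock cycles per element (two loads, one add, one store) and ignores loop overhead and data transfer; that figure is an illustrative assumption, not a measurement, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: the CPU spends this many clock cycles per element
 * (two loads, one add, one store). Real figures vary by architecture. */
#define ASSUMED_CPU_CYCLES_PER_ELEMENT 4u

/* Cycles a sequential CPU loop would need for `elements` additions. */
uint32_t cpu_cycles(uint32_t elements)
{
    return elements * ASSUMED_CPU_CYCLES_PER_ELEMENT;
}

/* Cycles for an accelerator with `adders` parallel adders,
 * i.e. ceil(elements / adders). `adders` must be non-zero. */
uint32_t accel_cycles(uint32_t elements, uint32_t adders)
{
    return (elements + adders - 1u) / adders;
}
```

Under these assumptions, cpu_cycles(100) comes to 400 while accel_cycles(100, 100) comes to 1, which is why the accelerator clock could run far slower than the CPU clock for the same throughput.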

Of course, I'm being a little over-simplistic here, because the CPU will have to load the input values into the programmable fabric and then retrieve the results, but this could be achieved efficiently using a DMA-type process. Furthermore, as opposed to simply adding the two matrices, we might wish to perform a significant number of logical and mathematical operations on each element, in which case the programmable fabric option starts to look very, very attractive.

The important point to understand is that processors are wonderful when it comes to performing decision-making control tasks, while hardware accelerators are more suitable when it comes to performing large quantities of repetitive data-processing tasks. Thus, the ideal solution is to achieve the optimal balance between those functions that are implemented in the processor and their compatriots that are implemented in a hardware accelerator.

And one final interesting aspect of all of this is that, after the main processor has provided the hardware accelerator with the appropriate data and instructed it to execute its task, the processor can leave the accelerator to perform its magic while it (the processor) is free to go off and do something else. When the accelerator has completed its mission, it can signal the processor, which will retrieve the data when it is ready and able to do so.
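
The start-and-signal handshake described above can be sketched in C as a pure software model. Everything here is hypothetical: the accel_t structure, the busy and done flags, and the accel_tick function merely stand in for hardware that would really be reached through memory-mapped registers, DMA, and an interrupt or status bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define N 10  /* matrix dimension from the 10 x 10 example */

/* Hypothetical accelerator state, modelled in plain C. */
typedef struct {
    int32_t a[N][N];  /* input matrix a */
    int32_t b[N][N];  /* input matrix b */
    int32_t y[N][N];  /* result matrix y */
    bool busy;        /* set when the processor starts a job */
    bool done;        /* set by the accelerator when results are ready */
} accel_t;

/* Processor side: hand over the job and return immediately. */
void accel_start(accel_t *acc)
{
    acc->done = false;
    acc->busy = true;
}

/* Stand-in for the hardware: one call models one accelerator clock
 * cycle in which all 100 additions complete. The loops below are only
 * how software imitates what the fabric would do in parallel. */
void accel_tick(accel_t *acc)
{
    if (!acc->busy) {
        return;
    }
    for (int row = 0; row < N; row++) {
        for (int col = 0; col < N; col++) {
            acc->y[row][col] = acc->a[row][col] + acc->b[row][col];
        }
    }
    acc->busy = false;
    acc->done = true;  /* the "signal" back to the processor */
}

/* Processor side: non-blocking check, so other work can continue. */
bool accel_done(const accel_t *acc)
{
    return acc->done;
}
```

In use, the processor would fill a and b, call accel_start, carry on with its control tasks, and read y only once accel_done reports true.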

Furthermore, the design team may decide to implement a large number of hardware accelerators in the programmable fabric, each tailored to perform a different task. In some cases, these accelerators will work in isolation, communicating only with the main processor; in other cases, one accelerator may hand its results over to another, and so on and so forth until the final accelerator in the chain hands its results back to the main processor.
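
A chain of accelerators can be pictured, very loosely, as a list of processing stages where each stage's output feeds the next. The C sketch below is only a software analogy; the stage functions stage_scale and stage_offset and the helper run_chain are invented for illustration and do not correspond to any particular FPGA design flow.

```c
#include <stddef.h>
#include <stdint.h>

/* One accelerator stage: takes a value, returns a processed value. */
typedef int32_t (*stage_fn)(int32_t);

/* Two hypothetical stages, each standing in for a separate accelerator. */
int32_t stage_scale(int32_t x)  { return x * 2; }
int32_t stage_offset(int32_t x) { return x + 3; }

/* The processor supplies the input to the first stage; each stage hands
 * its result to the next, and the last result returns to the processor. */
int32_t run_chain(const stage_fn *stages, size_t count, int32_t input)
{
    int32_t value = input;
    for (size_t i = 0; i < count; i++) {
        value = stages[i](value);
    }
    return value;
}
```

In real fabric the stages would run concurrently on a stream of data rather than one after another, which is where much of the benefit comes from.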

I am afraid that this article provides only a very modest overview of what can be quite a complex topic, but I hope that it will provide food for thought and stimulate conversation. What do you think? Does this make sense, or does it raise more questions than it answers?

About the author
Clive "Max" Maxfield is the Editorial Director of Embedded.com and the Embedded Systems Conferences (ESC). Max is also editor of the EE Times Programmable Logic and EE Life Designlines. Max is six feet tall, outrageously handsome, English, and proud of it. In addition to being a hero, trendsetter, and leader of fashion, he is widely regarded as an expert in all aspects of electronics (at least by his mother). Max received his BSc in Control Engineering in 1980 from Sheffield Hallam University in Sheffield, UK. He began his career as a designer of central processing units (CPUs) for mainframe computers. Over the years, Max has designed everything from silicon chips to circuit boards, and from brainwave amplifiers to steampunk "Display-O-Meters." He has also been at the forefront of Electronic Design Automation (EDA) for more than 20 years. Max is the author and/or co-author of a number of books, including Designus Maximus Unleashed (banned in Alabama), Bebop to the Boolean Boogie (An Unconventional Guide to Electronics), EDA: Where Electronics Begins, FPGAs: Instant Access, and How Computers Do Math.

