EE Times-Asia

Guide to hardware troubleshooting (Part 1)

Posted: 30 Jan 2014

Keywords: debug, CAD tools, PCB, FPGA, bug

This conversation is best had away from the action. Go through exactly what the evidence is, one bit at a time, then look for what other experiments or investigations can be carried out. Then go back to the board and carry on.

In my example, we met with the office hardware guru and walked through the system, drawing it all up on a whiteboard. After looking at how the FPGA logic behaved in operation, he suggested there might be a metastability issue in the FPGA.
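For readers unfamiliar with the term, metastability can occur when a flip-flop samples a signal that changes too close to its clock edge, leaving the output undecided for a short time. A widely quoted estimate for the mean time between synchroniser failures is MTBF = exp(t_r / tau) / (T0 * f_clk * f_data), where t_r is the time allowed for the output to settle. The sketch below evaluates that formula. Every constant in it is an illustrative placeholder, not a figure from this project or from any datasheet, and the function name is hypothetical. Adding a second flip-flop is a common general mitigation; the article does not say that this was the modification made here.

```python
import math

def synchronizer_mtbf_s(t_resolve_s, tau_s, t0_s, f_clk_hz, f_data_hz):
    """Estimated mean time between metastability failures, in seconds."""
    return math.exp(t_resolve_s / tau_s) / (t0_s * f_clk_hz * f_data_hz)

# Placeholder device and system constants (assumptions for illustration only).
TAU = 100e-12    # settling time constant of the flip-flop
T0 = 1e-9        # metastability window parameter
F_CLK = 100e6    # sampling clock frequency
F_DATA = 1e6     # rate of asynchronous input transitions

# Little settling time: the sampled value is used almost immediately.
short_settle = synchronizer_mtbf_s(1e-9, TAU, T0, F_CLK, F_DATA)

# One extra clock period of settling time, as a second flip-flop would give.
long_settle = synchronizer_mtbf_s(1e-9 + 1 / F_CLK, TAU, T0, F_CLK, F_DATA)

print(f"short settling time: MTBF ~ {short_settle:.3g} s")
print(f"one extra clock period: MTBF ~ {long_settle:.3g} s")
```

The exponential term is the point: a modest increase in settling time moves the failure interval from something seen on the bench to something never seen in practice, which is also why such faults can show up minutes or hours apart rather than on every event.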

Step 8: Apply the fix
You understand the bug and have come up with a rational solution. You run the code and the problem appears to be solved. However, your job isn't over yet.

Step 9: Try to break it again
Try to break the system again. To be sure the fix has worked, you will need to put the system through an appropriate series of stress tests an order of magnitude beyond what the original implementation was subjected to.

For instance, if a real-time system such as the one above crashed every ten minutes and never lasted longer than an hour, but now runs for ten hours, the bug is almost certainly fixed.
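One way to put a number on "almost certainly" is to assume that crashes arrive at random with a constant average rate. That is a modelling assumption, not something measured in this investigation, and the function names below are hypothetical. Under that assumption, a short sketch can estimate how likely a still-buggy system is to pass a soak test by luck, and how long a soak needs to be for a chosen confidence.

```python
import math

def false_pass_probability(mean_time_to_crash_h, soak_hours):
    """Chance that a still-buggy system survives the whole soak by luck.

    Assumes crashes occur at a constant average rate (exponential model).
    """
    return math.exp(-soak_hours / mean_time_to_crash_h)

def soak_hours_needed(mean_time_to_crash_h, confidence):
    """Soak length after which a clean run is a false pass with
    probability at most 1 - confidence."""
    return -mean_time_to_crash_h * math.log(1.0 - confidence)

# Figures from the example above: a crash about every ten minutes,
# followed by a clean ten-hour run after the fix.
MEAN_TIME_TO_CRASH_H = 10 / 60

p = false_pass_probability(MEAN_TIME_TO_CRASH_H, 10)
hours = soak_hours_needed(MEAN_TIME_TO_CRASH_H, 0.999)

print(f"chance of a lucky ten-hour run: {p:.3g}")
print(f"soak needed for 99.9% confidence: {hours:.2f} h")
```

The model only covers the failure mode whose crash rate you plugged in. A second, slower bug needs a soak sized against its own rate.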

You may find that the system behaves better but still crashes. At this point you may have uncovered a new bug that had been masked by the previous one. Treat it as such and go back to step one, starting a fresh investigation on the "cured" system.

Happily, in my example, our resident guru was correct and a simple modification to the FPGA solved the problem. There had been two bugs with one symptom: one caused crashes roughly every five minutes, the other at intervals of many hours (typically about five). The run of nearly 24 hours in our soak test turned out to be a fluke. We had finally reproduced, analysed, understood and fixed the problem.
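The figures in this story show both why the slow bug stayed hidden and why the long soak run was a fluke. The sketch below is a rough illustration only: it assumes the two bugs were independent and each crashed the system at a constant average rate, which the investigation did not establish, and it rounds the soak run to 24 hours. The helper name is hypothetical.

```python
import math

def combined_mean_time(*mean_times_h):
    """Mean time to crash when independent bugs each fail at a constant rate."""
    return 1.0 / sum(1.0 / m for m in mean_times_h)

FAST_BUG_H = 5 / 60   # about five minutes between crashes (from the text)
SLOW_BUG_H = 5.0      # about five hours between crashes (from the text)

# With both bugs present, the crash interval is dominated by the fast one,
# so the slow bug is effectively invisible.
both = combined_mean_time(FAST_BUG_H, SLOW_BUG_H)
print(f"both bugs present: crash every {both * 60:.1f} min on average")

# Chance that the slow bug alone stays quiet for a 24-hour soak.
fluke = math.exp(-24 / SLOW_BUG_H)
print(f"clean 24 h run with the slow bug still present: {fluke:.1%}")
```

Under these assumptions the combined interval is still just under five minutes, and a clean 24-hour run with the slow bug present has a chance of under 1%: rare, but entirely possible in a long debugging campaign.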

Step 10: Remember 'disappearing' bugs are still there if you haven't fixed them
Sometimes bugs just appear to go away by themselves. This can be frustrating, but you can be sure that you haven't fixed the bug. Either the initial report was incorrect or the bug is still there. These are the sort of bugs that reappear when your boss, his boss, or a customer is present.

It can be tempting to lift up the carpet and sweep these bugs under it, but don't. Document the bug and carry on looking into other issues, as it may return by itself. Ultimately, though, you need to go back and fix it, which means putting more effort into aggravating the problem until the bug reproduces.

Step 11: Celebrate
Remember how bad it felt when the bug was grinding you down? Now, celebrate when you win. It's you: 1, bugs: 0. Now the game can move on and you can be sure you'll fix the next bug too.

About the author
Dunstan Power is a chartered electronics engineer providing design, production and support in electronics to all of ByteSnap Design's clients. Having graduated with a degree in engineering from Cambridge University, Dunstan has been working in the electronics industry since 1992, and in 2004 founded Diglis Design Ltd, an electronic design consultancy, where he developed many successful electronic board and FPGA designs.
