EE Times-Asia
Solve random tests' failure to identify regression

Posted: 31 Oct 2012

Keywords: ASIC, random tests, regression testing

Figure 2 shows how PinDown operates on the flow of random test failures. The stream of random failures is split into regressions and new tests. Regressions are diagnosed down to the exact revision that caused the problem, and a bug report is sent to the person who committed each error. This allows regression errors to be fixed fast, which in turn keeps the quality of the device and testbench high.

The other category is new tests, i.e. tests that have always failed and are consequently covering a new test scenario. These failures are not caused by a sudden regression in quality, which may lead to panic and holding the release. They represent new test coverage, which is overall positive news.

Figure 2: Random testing with PinDown.

This setup solves the problem of using random tests in regression testing. It lets you keep running random tests, with the upside of good coverage, without the downside of not being able to identify regressions.
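The split between regressions and new tests can be sketched in a few lines of Python. This is only an illustration of the idea, not how PinDown is implemented: the function name `classify_failure` and the `passes` callback (which stands for "rerun the same test with the same seed on this revision") are assumptions made for the example.

```python
from typing import Callable, Sequence


def classify_failure(older_revisions: Sequence[str],
                     passes: Callable[[str], bool]) -> str:
    """Classify a test/seed pair that fails on the latest revision.

    older_revisions: revisions to retest, in any order.
    passes: hypothetical callback that reruns the same test with the
            same seed on the given revision and returns True on a pass.
    """
    for rev in older_revisions:
        if passes(rev):
            # The test used to pass, so something broke it later.
            return "regression"
    # The test never passed on any revision we tried: new coverage.
    return "new test"
```

A failure that passes on at least one older revision is reported as a regression. A failure that has never passed is treated as a new test scenario.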

Challenges with backtracking
There are challenges with backtracking; it is not as straightforward (or backward) as it may sound.

The first challenge is random stability, a topic widely discussed as it affects any debugging with random tests. Random stability is the property that the same seed always gives you the same test, even if the testbench has been updated. When you debug a test failure you want to reproduce the same scenario by providing the same seed number, and not get a new scenario where the test may not even fail, just because the testbench was updated. At one end of the spectrum, EDA vendors often claim that they have perfect random stability. At the other end, it is impossible to make such guarantees for major changes to the testbench.

Random stability of the commercial tools has improved in recent years. Some years ago a vendor, who shall not be named, could not handle any changes to the testbench, not even comments, without losing random stability, as the randomness was based on the number of characters in a file. Luckily those days are over. These days limited changes to the testbench do not affect random stability unless you fiddle with the random generation itself or change the structure of the entire testbench, e.g. instantiate more modules with random generators or change the dependencies between modules.
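The effect of instantiating one more module with a random generator can be illustrated with a small Python sketch. It is a toy model, not a description of how any simulator seeds its generators: the module names and both functions are made up for the example. In the first scheme all modules draw from one shared generator in instantiation order; in the second each module derives its own generator from the seed and its name.

```python
import hashlib
import random


def shared_stream(seed, modules):
    """All modules draw from one generator, in instantiation order."""
    rng = random.Random(seed)
    return {name: rng.getrandbits(64) for name in modules}


def per_module_stream(seed, modules):
    """Each module gets its own generator derived from (seed, name)."""
    values = {}
    for name in modules:
        digest = hashlib.sha256(f"{seed}:{name}".encode()).digest()
        rng = random.Random(int.from_bytes(digest[:8], "big"))
        values[name] = rng.getrandbits(64)
    return values


before = ["cpu", "dma"]           # hypothetical testbench modules
after = ["cpu", "uart", "dma"]    # same testbench with one module added

# Shared generator: "dma" now receives a different value for the same seed.
print(shared_stream(1, before)["dma"] == shared_stream(1, after)["dma"])
# Per-module generators: "dma" is unaffected by the new module.
print(per_module_stream(1, before)["dma"] == per_module_stream(1, after)["dma"])
```

With the shared generator, adding the `uart` module shifts every value drawn after it, so the same seed no longer produces the same test. Deriving one generator per module keeps the existing modules stable, at least for this kind of structural change.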

How does random stability affect backtracking? If you encounter a pass in an older revision, this is probably because you have reached a point before the error was introduced (which allows you to point at the faulty revision). But the pass may also be the result of testbench changes that have made the test exercise something else. Capturing the impact of limited testbench changes is as important as capturing design bugs, but there is always the risk that the same seed produced a different test on the earlier revision where it passed, as random stability is not guaranteed for major changes. This problem is bigger when the testbench is undergoing major design changes, and it shrinks in the later stages of the project when the testbench is updated with smaller changes, such as constraint changes.

The fact that backtracking can help you narrow down the problem is still very useful, especially with automatic backtracking such as PinDown, as it can point to the exact revision in the testbench when the test started to fail. If the commit message for the faulty revision says something like "changed constraints to solve an issue", then this revision probably introduced a real error and the debug analysis was correct. If, on the other hand, the commit message says "Changed the random generation for one module", then this revision may not have introduced an actual error; it may just have changed the test to test something completely different.
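Finding the revision where a test started to fail can be sketched as a binary search over the revision history. This is a minimal illustration, not PinDown's algorithm. It assumes that the newest revision fails, that the history contains a single pass-to-fail transition, and that a hypothetical `passes` callback reruns the same test with the same seed on a given revision. Random instability can break the second assumption, in which case the reported revision needs a manual check.

```python
def first_failing_revision(revisions, passes):
    """Return the first revision on which the test fails.

    revisions: list ordered oldest to newest; the newest is known to fail.
    passes: hypothetical callback, True if the test passes on a revision.
    Returns None if the test also fails on the oldest revision, i.e. it
    never passed in this range and looks like a new test.
    """
    if not passes(revisions[0]):
        return None
    lo, hi = 0, len(revisions) - 1  # revisions[lo] passes, revisions[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]
```

For a history of eight revisions this needs four test runs instead of up to seven for a linear walk backward. The commit message of the returned revision is then the first thing to read.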

How often do bigger changes occur? Most changes are minor, like constraint updates, whereas major revamps or new designs come less frequently. Every big change is followed by a number of small fixes. According to one paper, 90% of updates are fewer than 10 lines of code. The better designed the testbench is, the more randomly stable it will be. But in most systems the vast majority of testbench changes will be minor and easily debuggable by backtracking. What happens if the debug goes wrong because the exact same test is not reproducible on older revisions due to a major change? If the automated debug fails, you are back to where you are now: manual debug. Automatic backtracking is about improving productivity, and no damage is done if there are cases where you still have to do manual debug. As long as the vast majority of all issues can be automatically debugged, you get the much sought-after overall productivity improvement.

A second challenge with backtracking is that it consumes time. All debugging takes time, a lot of time, so this is not unique to backtracking. However, the smarter you make the selection of older revisions and tests, the faster you can backtrack through the revision history. PinDown has an algorithm (patent pending of course) which does a very good job at this, but if you do backtracking manually you should use your knowledge of the design to carefully select the fastest test on some good older revisions to get to a conclusion fast.
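For manual backtracking, the selection advice above can be turned into a rough back-of-the-envelope helper. The sketch below is an assumption-laden illustration and has nothing to do with PinDown's own algorithm: it simply picks the failing test with the shortest runtime and estimates the total time of a binary search over the revision history. The test names and runtimes are made up.

```python
import math


def pick_backtrack_test(failing_tests):
    """failing_tests: dict mapping test name to runtime in seconds.

    Returns the name of the fastest failing test.
    """
    return min(failing_tests, key=failing_tests.get)


def estimated_cost(runtime_s, n_revisions):
    """Rough total runtime of a binary search over n_revisions (>= 1)."""
    runs = math.ceil(math.log2(n_revisions)) + 1
    return runs * runtime_s


failing = {"dma_burst": 120.0, "irq_storm": 45.0}  # hypothetical tests
test = pick_backtrack_test(failing)
print(test, estimated_cost(failing[test], 64))
```

Choosing the 45-second test over the 120-second one cuts the estimated search over 64 revisions from 14 minutes to a little over 5, provided both failures are believed to stem from the same regression.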

Random tests are great to use in regression testing to get good coverage, but on their own they cannot distinguish regressions, i.e. dips in quality, from improvements in coverage. This can be solved by backtracking through older revisions and rerunning the failing test with the same seed on those revisions, in order to separate regressions from tests that fail because they contain a new test scenario. This process can be done manually or automated with a tool such as PinDown. The issue of random stability means some updates of the testbench will still need to be manually debugged, but the vast majority of all test failures can be automatically analyzed. Identifying regressions quickly and automatically allows you to maintain high quality, which in the end leads to an earlier release.

About the author
Daniel Hansson, CEO, has 15+ years of experience as an ASIC designer and project manager at Ericsson, ST-Ericsson, Texas Instruments and ARC. Daniel has worked on innovative regression test flows with patents both granted and pending.
