mersenneforum.org (https://www.mersenneforum.org/index.php)

 polyestr 2006-03-27 20:22

prime95, torture test, and stability

Question: Imagine a computer has a hardware defect that is exposable through prime95's torture test. Is the probability of the torture test exposing this defect by failing a test the same in a 20-hour run as it is in two 10-hour runs?

I guess what I'm asking is whether the tests progressively "focus in" in any way, or whether each test is simply an independent event.

 polyestr 2006-03-27 21:28

..or to be more specific..

OK, maybe a better way of phrasing this is as follows: does the torture test avoid redundancy in the parameters of each test, or may each test be redundant to some degree?

 markr 2006-03-28 00:41

Each run of a test is the same, following a fixed sequence which eventually repeats. (You can choose different types of test and parameters, but that's a different story.) Ten hours should be longer than the sequence, so two 10-hour runs are, near enough, equivalent to one 20-hour run, but two 1-hour runs are not the same as one 2-hour run.
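
To make the arithmetic above concrete, here is a toy model of a fixed test sequence that restarts from its first entry on every run. It is only a sketch: the table sizes and the half hour per entry are made-up numbers, and the function names are hypothetical, not anything taken from prime95.

```python
def entries_reached(run_hours, hours_per_entry, table_size):
    """Table entries completed by one run that starts at entry 0 (assumed)."""
    completed = int(run_hours / hours_per_entry)
    return {i % table_size for i in range(completed)}


def coverage(run_lengths, hours_per_entry, table_size):
    """Fraction of the table exercised by a series of separate runs."""
    reached = set()
    for hours in run_lengths:
        reached |= entries_reached(hours, hours_per_entry, table_size)
    return len(reached) / table_size


# Hypothetical short sequence: 16 entries at 0.5 h each, so one cycle is 8 h.
print(coverage([20], 0.5, 16), coverage([10, 10], 0.5, 16))  # both cover everything
print(coverage([2], 0.5, 16), coverage([1, 1], 0.5, 16))     # the split runs cover less

# Hypothetical long sequence: 60 entries at 0.5 h each, so one cycle is 30 h.
print(coverage([20], 0.5, 60), coverage([10, 10], 0.5, 60))
```

In this model, splitting a run costs nothing once each piece is longer than one full cycle. When a piece is shorter than the cycle, every restart repeats the same opening entries and the later ones are never reached.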

[quote=polyestr]Question: Imagine a computer has a hardware defect that is exposable through prime95's torture test. Is the probability of the torture test exposing this defect by failing a test the same in a 20-hour run as it is in two 10-hour runs?[/quote]Well, that may depend on the exact nature of the defect, but I'll guess that, in general, a 20-hour run is slightly more stressing (and thus slightly more likely to expose the defect) than two 10-hour runs. For example, heat buildup -- running it continuously for a whole day, including the period when the system's surroundings are diurnally warmest, is better than a series of runs that miss the heat of the day. (But two runs during the hottest 10 hours of each of two days might be more stressing than one 20-hour run that includes only one complete daily hot period.)

[quote]I guess what I'm asking is whether the tests progressively "focus in" in any way,[/quote]Not really.

[quote]or if each test is simply an independent event.[/quote]Basically, yes.

A torture test runs a certain number of L-L iterations on each of many different exponents (actually, Mersenne numbers with different exponents, of course). The table of test exponents (and the corresponding correct answer for each one) is fairly long - many dozens of different exponents, covering a wide range. I think it may take more than 24 hours to cycle through the complete set of test exponents on at least some systems, so letting the torture test run for long periods is the best way to push as many different bit patterns through as possible, so as to maximize the probability of hitting upon a pattern that is defectively processed.
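
As a rough illustration of "run some L-L iterations on an exponent and compare with a stored correct answer", here is a minimal sketch in plain Python. The table below is a tiny hypothetical one built from small exponents whose residues are easy to check by hand. The real program uses its own much larger table and FFT-based arithmetic, neither of which is reproduced here.

```python
def ll_residue(p, iterations):
    """Run the Lucas-Lehmer recurrence s -> s*s - 2 (mod 2**p - 1) from s = 4."""
    m = (1 << p) - 1
    s = 4
    for _ in range(iterations):
        s = (s * s - 2) % m
    return s


# Hypothetical self-test table: (exponent, iterations, expected residue).
# After p - 2 iterations the residue is 0 exactly when 2**p - 1 is prime.
SELF_TEST_TABLE = [
    (5, 3, 0),
    (7, 3, 42),
    (7, 5, 0),
    (13, 11, 0),
]


def run_self_test(table):
    """Return the table entries whose computed residue differs from the stored one."""
    failures = []
    for p, iterations, expected in table:
        if ll_residue(p, iterations) != expected:
            failures.append((p, iterations))
    return failures


print(run_self_test(SELF_TEST_TABLE))  # an empty list means every entry matched
```

On sound hardware the list of failures stays empty however many times the table is repeated. A defect shows up as an entry whose residue no longer matches.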

 polyestr 2006-03-28 07:43

Thanks

Awesome. Thanks all.

 Unregistered 2006-08-12 05:22

Another question: What parts of the CPU are and aren't tested? I'm using the program to test a heavily overclocked Prescott CPU. When I push it too far, I start getting "rounding errors, expect < 0.4". Upping the voltage makes it good for a few more bus clicks, etc.

My question is: What does this error mean, exactly? What kinds of instructions are failing? Are there other electrical pathways in my CPU that could be faulty that prime isn't testing properly?

 Prime95 2006-08-12 12:45

The error means that some floating point instruction returned a bad result. Prime95 does a great job at testing your floating point unit, L1 and L2 caches, and main memory. If the "weakest" spot in your system is somewhere else, then another program might fail before prime95 does. That is why you should run a variety of test programs before calling your machine stable.
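
For anyone wondering where a "rounding error" can come from at all, the idea can be sketched with a toy floating point FFT squaring. The big multiplication is done in floating point, every output value should land very close to a whole number, and the distance to the nearest whole number is the round-off error. This is a simplified stand-in written for illustration, not prime95's actual code. The function names are hypothetical, and the 0.4 limit is simply the figure from the error message quoted above.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; the length of values must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def square_with_roundoff_check(digits, base=10):
    """Square a number given as little-endian digits.

    Returns (square as int, largest distance of any output from an integer).
    """
    n = 1
    while n < 2 * len(digits):
        n *= 2
    padded = [complex(d) for d in digits] + [0j] * (n - len(digits))
    spectrum = [x * x for x in fft(padded)]
    raw = [x.real / n for x in fft(spectrum, invert=True)]
    max_err = 0.0
    total = 0
    for i, value in enumerate(raw):
        nearest = round(value)
        max_err = max(max_err, abs(value - nearest))
        total += nearest * base**i
    return total, max_err


x = 123456789
digits = [int(c) for c in reversed(str(x))]
result, err = square_with_roundoff_check(digits)
print(result == x * x, err < 0.4)
```

On a healthy machine the error in a toy case like this is tiny. A bad floating point result, a flipped bit in cache or memory, or an overly aggressive overclock can push one of the outputs far from a whole number, which is the kind of event the message reports.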

 All times are UTC.