mersenneforum.org  

Old 2004-12-15, 14:23   #1
halcion
 

4,283 Posts
Complex, but deceptively simple question about Prime95 failures?

Assume the following situation:

1) Prime95 fails on systems specified with components of type X
2) Failures are not consistent (don't always happen)
3) Failures are not with all units of type X
4) Failures are not in the same spot (test type, minutes/calculations run)

If the situation is as explained above, then is it AT ALL possible, that Prime95 is to blame?

IMHO, there's strong reason to suspect a software/hardware incompatibility between Prime95 and a specific piece of hardware IF the failures are consistent, always occur at the same type/spot of calculation, and affect every single unit of that particular hardware.

But what if the reasons are as stated in 1-4?

Paraphrasing: are sporadic, inconsistent and here-and-there happening failures (almost always in Torture test/Blend resulting in critical error due to rounding) ALWAYS due to malfunctioning hardware?

And a continued question: what is the type of implementation that Prime95 uses in its calculation? Are similar test-to-test situations (same test settings) always 100% the same in terms of computation sent to the cpu?

It's a simple question, but think before you answer, as it's not necessarily easy to answer (imho).

regards,
Halcyon

PS I don't claim to know the exact 100% totally foolproof guaranteed answer to this (although I believe I'm right), so I'm asking out of curiosity and willingness to learn.
Old 2004-12-15, 19:07   #2
cheesehead
 
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

2²×3×641 Posts

Quote:
Originally Posted by halcion
2) Failures are not consistent (don't always happen)
That is more likely to be a hardware problem than a software bug.

Quote:
3) Failures are not with all units of type X
Once again, that is more likely to be a hardware problem than a software bug.

Quote:
4) Failures are not in the same spot (test type, minutes/calculations run)
Once again, ...

Quote:
If the situation is as explained above, then is it AT ALL possible, that Prime95 is to blame?
Given a fixed set of parameters [same Mersenne number exponent, same type calculation (FFT multiply, LL test, ...), same memory allocation, same hardware ...], then Prime95 performs exactly the same sequence of arithmetic operations each time. Therefore, if results differ or nonidentical failures occur on different runs with the same parameters, it cannot be due to a Prime95 bug, because the sequence of operations is the same in each case, so properly-performing hardware would always get the same arithmetic result.
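To make that argument concrete, here is a minimal sketch in Python. It is not Prime95 code: `deterministic_workload` and `runs_agree` are hypothetical names, and the arithmetic is just a stand-in for "a fixed sequence of floating-point operations". The idea is that if the same inputs ever produce different digests on the same machine, the program logic is not where the difference came from.

```python
import hashlib
import struct


def deterministic_workload(seed_value: float, iterations: int) -> str:
    """Run a fixed sequence of floating-point operations and hash the result.

    Illustrative stand-in for a stress-test workload; not Prime95's algorithm.
    """
    x = seed_value
    for i in range(1, iterations + 1):
        # Same operations in the same order on every run:
        # no randomness and no dependence on timing.
        x = (x * x + i) % 1_000_003.0
    # Hash the exact bit pattern so even a single flipped bit is visible.
    return hashlib.sha256(struct.pack("<d", x)).hexdigest()


def runs_agree(runs: int = 3) -> bool:
    """Return True if repeated runs with identical inputs give identical digests."""
    digests = {deterministic_workload(3.0, 100_000) for _ in range(runs)}
    return len(digests) == 1


if __name__ == "__main__":
    print("all runs identical:", runs_agree())
```

On healthy hardware `runs_agree()` should always return True. A mismatch between runs with identical inputs points away from the program itself and toward something outside it, such as the hardware.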
Old 2004-12-15, 19:19   #3
cheesehead
 
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

1111000001100₂ Posts

Quote:
Originally Posted by halcion
But what if the reasons are as stated in 1-4?

Paraphrasing: are sporadic, inconsistent and here-and-there happening failures (almost always in Torture test/Blend resulting in critical error due to rounding) ALWAYS due to malfunctioning hardware?
Well, in that case it can't be due to a software bug if the software that is executed is identical each time (uncorrupted Prime95 with same parameters).

Quote:
And a continued question: what is the type of implementation that Prime95 uses in its calculation?
I'm not sure what you mean by "type of implementation". Prime95 was entirely written by George Woltman plus some other contributors. It's not some off-the-shelf commercial software. If you want to, you can download the entire source code (except for a security module that is never used in stress tests) in a zip file.

Quote:
Are similar test-to-test situations (same test settings) always 100% the same in terms of computation sent to the cpu?
Yes.

There are no timing dependencies, or use of random/pseudorandom numbers, in the tests.
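On the "critical error due to rounding" mentioned in the question: as far as I understand it, Prime95 multiplies large numbers with floating-point FFTs, and every output term should land very close to an integer. The sketch below illustrates that kind of check in plain Python. It is a toy and not Prime95's code: the function names, the base-10 digit representation, and the 0.4 limit are all assumptions chosen for illustration.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def int_to_digits(number):
    """Little-endian base-10 digits of a non-negative integer."""
    return [int(ch) for ch in reversed(str(number))]


def digits_to_int(digits):
    """Inverse of int_to_digits."""
    return sum(d * 10 ** i for i, d in enumerate(digits))


def multiply_with_roundoff_check(a, b, limit=0.4):
    """Multiply two non-negative integers via FFT convolution.

    Returns (product, max_roundoff). Raises ArithmeticError if any
    convolution term is farther than `limit` from an integer.
    The 0.4 default is an assumed threshold for this sketch.
    """
    a_digits, b_digits = int_to_digits(a), int_to_digits(b)
    size = 1
    while size < len(a_digits) + len(b_digits):
        size *= 2
    fa = fft([complex(d) for d in a_digits] + [0j] * (size - len(a_digits)))
    fb = fft([complex(d) for d in b_digits] + [0j] * (size - len(b_digits)))
    raw = fft([x * y for x, y in zip(fa, fb)], invert=True)

    max_roundoff = 0.0
    carry = 0
    digits = []
    for term in raw:
        value = term.real / size      # unnormalized inverse FFT, so divide by size
        nearest = round(value)
        max_roundoff = max(max_roundoff, abs(value - nearest))
        if max_roundoff > limit:
            raise ArithmeticError(f"round-off {max_roundoff:.3f} exceeds {limit}")
        carry, digit = divmod(nearest + carry, 10)
        digits.append(digit)
    while carry:
        carry, digit = divmod(carry, 10)
        digits.append(digit)
    return digits_to_int(digits), max_roundoff


if __name__ == "__main__":
    product, roundoff = multiply_with_roundoff_check(12345, 6789)
    print(product, roundoff)
```

With correct arithmetic the measured round-off for inputs this small stays tiny. A value that jumps toward 0.5 on one run but not on another identical run is the sort of symptom that suggests a corrupted intermediate value rather than a flaw in the algorithm.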
Old 2004-12-15, 19:59   #4
Uncwilly
6809 > 6502
 
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2·4,787 Posts

Quote:
Originally Posted by cheesehead
... then Prime95 performs exactly the same sequence of arithmetic operations each time. Therefore, if results differ or nonidentical failures occur on different runs with the same parameters, it cannot be due to a Prime95 bug, because the sequence of operations is the same in each case, so properly-performing hardware would always get the same arithmetic result.
One minor thing to note: This assumes that there are not errant effects from other software, a process gone awry or a very poorly coded app that the OS hasn't kept in check. If you run some app that plays with the memory that P95 is using, the errors are software related, but to do with the other app.
Old 2004-12-15, 23:46   #5
jinacio
 
Nov 2004

38 Posts
hardware...

fast version: hardware problems.

many overclockers use prime95 to test the stability of a computer; although it "seems" stable and every component appears to be working properly (mainly cpu, motherboard, ram), sometimes it takes a "real app" (one that stresses these resources to the maximum) to show whether a computer is indeed rock-solid or not.

sometimes there's a very thin line between 99% working hardware and 100% rock-solid hardware.

as an example: if i overclock my computer to a certain point, most of the apps work fine, but a select few don't.


just my 2 cents...
Old 2004-12-16, 00:42   #6
Xyzzy
 
 
"Mike"
Aug 2002

5·23·71 Posts

Use the mprime floppy to minimize OS/driver variables:

http://www.mersenneforum.org/zip/
Old 2004-12-16, 20:10   #7
cheesehead
 
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

17014₈ Posts

Quote:
Originally Posted by Uncwilly
This assumes that there are not errant effects from other software, a process gone awry or a very poorly coded app that the OS hasn't kept in check. If you run some app that plays with the memory that P95 is using, the errors are software related, but to do with the other app.
Thank you, Uncwilly!

That is an important, not minor, point.

Halcyon, consider all my previous statements about software errors to be prefaced with: "Assuming that all operating system and other non-Prime95 software has no bug that affects Prime95 execution, ..."

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.