mersenneforum.org


nomadicus 2005-11-07 03:50

remote chance of a problem?
 
Is there even the remotest chance that v24.14, running on a dual-core AMD X2 (I have a 4400+) that passed several 12-hour torture tests with ease, can have an ILLEGAL SUMOUT problem?

I switched back to v23.8.1 and cannot reproduce the problem.

I switched back again to v24.14 and can reproduce the problem fairly easily: it takes 3 or 4 start/stops to get it to fail.

Thoughts?


edit: clarification

JHagerson 2005-11-07 05:13

You might have found a bug in Prime95. Could you please post more details? What exponent are you testing? Are you performing an LL test? How many iterations occur before failure? Specifically what chip are you using? We need gory details like code name, speed, size of caches, and all of that.

With some additional information, Dr. Woltman (Prime95) can try to reproduce the problem.

Mystwalker 2005-11-07 11:00

It could also be the case that 24.14 produces significantly more heat and thus increases the chance of a failure in borderline cases.

[QUOTE=JHagerson]Dr. Woltman (Prime95)[/QUOTE]

AFAIK, it's "only" Mr. - but what's in a name? :wink:

garo 2005-11-07 11:04

v24.14 is more efficient and thus stresses your computer more. So it is entirely likely that a borderline stable system works with v23.8 but not with v24.14.

ewmayer 2005-11-07 19:19

What exponent are you testing, Nomadicus?

nomadicus 2005-11-07 21:40

The rig:
DFI LP nF4 SLI-DR
AMD 64 X2 4400+ (2.2GHz dual-core, L2 cache 1MB each core)
2x1GB OCZ 2-3-2-5 Titanium
eVGA 7800GTX
RAID1+0 4x74GB Raptors
Maxtor 250GB (16MB cache)
NEC 3540A CD/DVD burner
Enermax 600Watt Noisetaker
Lian-Li PC-V1000B case

Although this is a benchmarking (hence the RAID1+0) and gaming rig, I don't have it overclocked; I had a mild overclock going at one time but have since returned everything to its stock settings. At first I thought the video drivers might be the problem, but I have since confirmed they are not. All BIOS, drivers, etc. are up to date and are stable as far as reports on the web are concerned.

I do not run prime95 while gaming or benchmarking. After testing hardware/playing, I restart both instances of prime95 and get the error right away. That is how I noticed it.

This is what I see. This happens only when I stop/continue. I allow a few minutes between each stop/continue cycle so the CPU is cool when I continue. I've never seen it happen several minutes/hours after I continue. Once it restarts successfully, it can run for days without an error.

This rig has passed several (three or four, I forget) 12-hour torture tests at both OC and stock settings.

Test=30322213,68,1 has affinity set to CPU 0.
Test=30322363,68,1 has affinity set to CPU 1.
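
For readers unfamiliar with the worktodo format, the two entries above can be split into their fields with a small sketch like the one below. The field meanings (exponent, bits of trial factoring already done, P-1 done flag) reflect the usual reading of Prime95's `Test=` lines and are an assumption here, not something stated in this thread; `Assignment` and `parse_test_line` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Assignment(NamedTuple):
    """One Test= entry; field meanings are assumed, see the note above."""
    exponent: int        # Mersenne exponent p, i.e. the number tested is 2^p - 1
    factored_bits: int   # assumed: trial factoring already done to 2^factored_bits
    pminus1_done: bool   # assumed: 1 means P-1 factoring has already been run


def parse_test_line(line: str) -> Assignment:
    """Parse a line such as 'Test=30322213,68,1' into its three fields."""
    key, _, value = line.strip().partition("=")
    if key != "Test":
        raise ValueError(f"not a Test= line: {line!r}")
    exponent, bits, pm1 = value.split(",")
    return Assignment(int(exponent), int(bits), pm1 == "1")


for entry in ("Test=30322213,68,1", "Test=30322363,68,1"):
    print(parse_test_line(entry))
```

Both exponents are close to each other and in the same state, so under this reading the two instances are doing essentially the same kind of work.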

I am presently checking out the memory more closely and will let you know if I find a problem with it . . . but I have a feeling the memory is okay. We'll see.

nomadicus 2005-11-07 22:19

Results.txt for CPU0

[Sun Nov 06 18:02:29 2005]
Iteration: 2/30322213, ERROR: ILLEGAL SUMOUT
Possible hardware failure, consult the readme.txt file.
Continuing from last save file.
[Sun Nov 06 22:46:54 2005]
Iteration: 53720/30322213, ERROR: ILLEGAL SUMOUT
Possible hardware failure, consult the readme.txt file.
Continuing from last save file.
[Sun Nov 06 22:52:02 2005]
Iteration: 53720/30322213, ERROR: ILLEGAL SUMOUT
Possible hardware failure, consult the readme.txt file.
Continuing from last save file.
[Sun Nov 06 22:57:09 2005]
Iteration: 53720/30322213, ERROR: ILLEGAL SUMOUT
Possible hardware failure, consult the readme.txt file.
Continuing from last save file.



---------------------
Results.txt for CPU1

[Sun Nov 06 18:02:41 2005]
Iteration: 2/30322363, ERROR: ILLEGAL SUMOUT
Possible hardware failure, consult the readme.txt file.
Continuing from last save file.
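
To see at a glance where the errors cluster, the excerpts above can be tallied by iteration with a short script. This is only a sketch: the regular expression is written against the error lines exactly as they appear in these excerpts, and `count_sumout_errors` is a hypothetical helper, not part of Prime95.

```python
import re
from collections import Counter

# Matches error lines in the form shown in the results.txt excerpts above.
ERROR_RE = re.compile(r"Iteration: (\d+)/(\d+), ERROR: ILLEGAL SUMOUT")


def count_sumout_errors(text: str) -> Counter:
    """Count ILLEGAL SUMOUT errors, keyed by (iteration, exponent)."""
    counts = Counter()
    for match in ERROR_RE.finditer(text):
        iteration, exponent = int(match.group(1)), int(match.group(2))
        counts[(iteration, exponent)] += 1
    return counts


# The CPU0 excerpt from this post, with the repeated boilerplate lines omitted.
cpu0_log = """\
[Sun Nov 06 18:02:29 2005]
Iteration: 2/30322213, ERROR: ILLEGAL SUMOUT
[Sun Nov 06 22:46:54 2005]
Iteration: 53720/30322213, ERROR: ILLEGAL SUMOUT
[Sun Nov 06 22:52:02 2005]
Iteration: 53720/30322213, ERROR: ILLEGAL SUMOUT
[Sun Nov 06 22:57:09 2005]
Iteration: 53720/30322213, ERROR: ILLEGAL SUMOUT
"""

for (iteration, exponent), n in sorted(count_sumout_errors(cpu0_log).items()):
    print(f"exponent {exponent}, iteration {iteration}: {n} error(s)")
```

On the CPU0 excerpt this groups the three errors at iteration 53720 together, which matches the description that the failure shows up right after a restart from the save file rather than partway through a long run.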

Mystwalker 2005-11-08 00:13

Could you check the CPU temperature while running both versions?

nomadicus 2005-11-08 04:05

[QUOTE=Mystwalker]Could you check the CPU temperature while running both versions?[/QUOTE]

Idle: 33-35C. With both torture tests running: 43-44C.

nomadicus 2005-11-11 13:46

Update:
It fails immediately roughly 1 out of 15 times when I restart (after stopping a few minutes before).
I am trying 24.13. I'll see if it creates the same symptoms.
I am going to do a 48-hour torture test this weekend using 24.14.
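
A failure rate of about 1 in 15 also bears on how many clean restarts are needed before another version can be called problem-free. The sketch below works that out under a loud assumption that is not established anywhere in this thread: each restart fails independently with the same probability. The function names are hypothetical.

```python
import math


def chance_of_failure(p_fail: float, restarts: int) -> float:
    """Probability of at least one failure in `restarts` independent restarts."""
    return 1.0 - (1.0 - p_fail) ** restarts


def restarts_needed(p_fail: float, confidence: float) -> int:
    """Smallest number of restarts for which the chance of seeing at least
    one failure reaches `confidence` (assumes independent restarts)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


p = 1 / 15  # the rough per-restart failure rate reported above
print(f"chance of a failure within 4 restarts: {chance_of_failure(p, 4):.2f}")
print(f"restarts needed for 95% confidence: {restarts_needed(p, 0.95)}")
```

If the assumption holds, a handful of clean restarts on 23.8.1 or 24.13 says little; it takes about 44 clean restarts in a row before a 1-in-15 problem can be ruled out with 95% confidence.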

Suggestions welcome.

garo 2005-11-11 17:15

Did you run memtest86?

