mersenneforum.org  

2011-02-26, 21:41   #1
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

23·919 Posts
AMD machine fails torture test

A user has been emailing me about an odd torture test failure. I am not an expert in AMD CPUs and motherboards, so any help would be appreciated:

His system:

AMD Athlon II X3 3.2 GHz CPU: http://www.newegg.com/Product/Produc...82E16819103886
MSI 880G-E45 Motherboard: http://www.newegg.com/Product/Produc...82E16813130563
4 GB (2x2GB) G.SKILL DDR3 1600 RAM (faster than necessary for board): http://www.newegg.com/Product/Produc...82E16820231277
Thermaltake TR2 W0070RUC 430W Power Supply: http://www.newegg.com/Product/Produc...82E16817153023


The symptom: repeatable crashes 7.5 hours into a blend test.
I suggested isolating which FFT size is the problem, and experimenting with swapping DIMMs. This is his latest email:




I switched to a hard disk installation of Ubuntu and ran mprime, redirecting its output to a text file. Reviewing the files from several iterations, each with a reboot after ~7.5 hours, I saw that it was always happening with a 96K FFT length.

Selecting blend test, then custom, and changing the min and max FFT length to 96K results in a consistent and repeatable issue (normally a reboot, but it froze instead at least once) in under five minutes.

Swapping the DIMMs between slots one and two made no difference.

I tested each DIMM alone in slot one. No problems.

I tested one DIMM alone in slot two. No problems.

I tested DIMMs in slots one and three (single channel configuration). No problems.

At that point I was thinking there was something wrong with dual channel mode on the motherboard. I don't understand all the rules dictating which RAM slots can be used or why, but I figured I'd see if it would boot with the RAM in slots three and four. Not only did it boot - the manual claimed it wouldn't - but it passed the test. "Ok, it's recognizing all the RAM and working fine, so maybe it's just in single channel mode." But I remembered that one of the packaged AMD monitoring apps was pretty detailed on the hardware, so maybe it could tell me whether it was running in dual or single channel mode. After booting back into Windows, it did, and it said dual channel. I switched to slots one and three, and then it did say single channel, so it wasn't just displaying its capability.

I don't have the knowledge to get any finer-grained than this, but it seems to me that this is confirmation that it's a subtle problem with the memory controller. Would you mind confirming that theory, or letting me know what else might cause that type of problem? I'm asking because I am willing to go through returning / exchanging parts over this, and want to make sure I'm returning the right one. And I don't really know anyone else I can check with who's knowledgeable when it comes to hardware. (Plus I was going to email you anyway to let you know how things turned out.)

Also, and this is just an earlier thought that's probably less important now, do you think it might have something to do with the on-board graphics sharing the RAM?



Do you agree with his conclusion that this is a memory controller issue?
Any other tests he should run?
Anyone care to guess why the problem only occurred with the 96K FFT?
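
For anyone repeating the log review described in the email (checking which FFT length was active each time the machine rebooted), here is a minimal sketch. It assumes each run's mprime output was redirected to its own text file and that the torture-test lines contain the phrase "FFT length NNK"; that wording and the function names are assumptions for illustration, so adjust the pattern to match your own logs.

```python
import re
from collections import Counter

# Assumption: torture-test log lines mention the size as "FFT length 96K".
# Change this pattern if your mprime version words it differently.
FFT_RE = re.compile(r"FFT length (\d+)K")


def last_fft_length(log_text):
    """Return the last FFT length (in K) mentioned in one run's log, or None."""
    matches = FFT_RE.findall(log_text)
    return int(matches[-1]) if matches else None


def tally_last_lengths(logs):
    """Count, across several runs, which FFT length each log ended on."""
    return Counter(last_fft_length(text) for text in logs)


if __name__ == "__main__":
    import sys

    # Usage: python last_fft.py run1.txt run2.txt ...
    texts = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            texts.append(handle.read())
    for length, runs in tally_last_lengths(texts).most_common():
        print(f"{length}K: last size seen in {runs} run(s)")
```

If every log ends on the same size, as the user found with 96K, that size is the one to feed into a custom torture test with identical min and max FFT lengths.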
2011-03-09, 22:03   #2
ChuckPa
 

10010001100111₂ Posts
Memory problems with AMD and P95 FFT

I hope I can help, based on personal past horror stories. I had nearly the same problem except that, since it was an earlier version of P95, it failed on a different FFT size. The problem was finally verified and isolated by running the memtest86+ (bootable CD version) advanced (stress) tests.

I found the problem NOT to be with the motherboard, but with the normal variation that occurs in the manufacturing of DIMMs. Each one is slightly different. When 'caught' at just the right timing, which you obviously did, it fails where it otherwise would not. Your RAM has timing 7-8-7-24, which is fine in single-channel mode, but not dual. That extra cycle (8-8-8-24) is worth it to get all the bits up and stable BEFORE data latch.

I cured my problem by purchasing MATCHED PAIRS of memory (two DIMMs per package) which were 'paired' together by the manufacturer. I use Corsair memory and it has never failed me. It also comes with a lifetime guarantee.

I believe your motherboard is correctly detecting that it can operate in dual-channel mode because the timing is 'close enough' for the BIOS to initialize the memory controller, but over time you hit just the right address and presto... crash. This has happened to me when the AMD is dumping the cache during the FFT tests.

Just because your RAM came two in the package doesn't mean it's a "matched pair". Here is something to look at: a 'matched pair' of Corsair (known as a Dual Channel Kit). Note that it does say "2 matched modules"; that is what you want for full-speed dual-channel mode.

Here are the links:

Memtest86+ distros (ISO) to burn a bootable CD for a 24+ hour stress test: http://memtest.org/#downiso

The Corsair at Newegg is here: http://www.newegg.com/Product/Produc...-296-_-Product

You may also go directly to the Corsair site and look up RAM based on your motherboard make and model.

Hope this helps. Write if I can be of more help.
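
To put the "extra cycle" in the timings above into concrete terms, here is a small worked calculation. It assumes the modules actually run at their rated DDR3-1600 speed; the first post notes the RAM is faster than the board needs, so the real clock may be lower and each cycle correspondingly longer. The function name is just for illustration.

```python
def cycles_to_ns(cycles, data_rate_mt_s):
    """Convert a timing given in memory-clock cycles to nanoseconds.

    DDR memory transfers twice per clock, so the clock in MHz is
    data_rate / 2 and one cycle lasts 2000 / data_rate nanoseconds.
    """
    return cycles * 2000.0 / data_rate_mt_s


if __name__ == "__main__":
    # Assumption: rated DDR3-1600 speed, as listed for the G.SKILL kit.
    rate = 1600
    for label, timings in (("7-8-7-24", (7, 8, 7, 24)),
                           ("8-8-8-24", (8, 8, 8, 24))):
        in_ns = ", ".join(f"{cycles_to_ns(c, rate):.2f}" for c in timings)
        print(f"{label} at DDR3-{rate}: {in_ns} ns")
```

At DDR3-1600 one cycle is 1.25 ns, so relaxing a timing from 7 to 8 cycles gives the modules that much extra time to settle.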
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.