mersenneforum.org  

2020-10-10, 03:10   #1
triangularbasic
 
Oct 2020
Australia

7 Posts
Fatal error when running p95 Large FFT

Hi everyone,
I apologize for the long post. I have been having stability issues with my work/gaming computer and would appreciate some help.


System specs:
CPU: I9-9900k Processor, [non-OC].
Motherboard: ASUS ROG Maximus XI Hero.
RAM: Kingston HyperX Predator RGB HX440C19PB3AK2/16 16GB (2x8GB) DDR4, [XMP-4000], installed in slots A2 & B2 as per manual.
PSU: Corsair AX860 Platinum 860W.
GPU: none, currently monitor is connected to MOBO HDMI.


Issue:
Yesterday, while I was working in AutoCAD, my PC suddenly became slow. Task Manager showed high RAM usage by AutoCAD (95%+), and ending AutoCAD via Task Manager cleared the high RAM usage. However, when I reopened AutoCAD to work on the same project, the same issue occurred. This time I did not end AutoCAD, to see what would happen: after a minute or two of high memory usage, the display went black for a moment and then returned to the Windows 10 login screen. I logged in, and Event Viewer showed no errors. I then opened AutoCAD a third time and the same problem occurred, but this time the PC restarted. Once logged in, I checked Event Viewer and found a critical error: Event ID 41, Kernel-Power.


Diagnostics:
Windows memory diagnostics: No issue detected.
Prime95/Large FFT/AVX-512 Disabled/45-minute run time: Fatal error in worker 10, “rounding was 0.49xx expected 0.4”. (I noticed that 2 CPU threads were not running at 100% during the torture test.)
Prime95/Small FFT/AVX-512 Disabled/36 tests, 1 hour 7 minutes run time: no errors or warnings.
Memtest86: completed 2 passes, 6 errors found on first pass, 59 cumulative errors by end of second pass.
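
For context on the Prime95 rounding message above: Prime95 multiplies very large numbers with floating-point FFTs, and every output value should land very close to an integer. The Python sketch below shows the general idea of such a round-off check. It is an illustration only, not Prime95's code: the function names and sample values are made up, and the 0.4 limit is simply taken from the error message quoted above.

```python
def max_roundoff(values):
    """Largest distance from any value to its nearest integer (0.0 to 0.5)."""
    return max(abs(v - round(v)) for v in values)


def check_roundoff(values, limit=0.4):
    """Return (error, ok). ok is False when the error reaches the limit."""
    err = max_roundoff(values)
    return err, err < limit


# Hypothetical FFT outputs: a healthy run sits very close to integers.
healthy = [12.0000001, 7.9999998, 3.0000002]
# One value corrupted, for example by a bad read from RAM.
corrupted = [12.0000001, 7.5078, 3.0000002]

print(check_roundoff(healthy))
print(check_roundoff(corrupted))
```

An error near 0.5 means the value is about as far from an integer as it can be, so the program can no longer tell which integer was intended and has to stop.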


Question:
I would appreciate advice on what to do next. Is there a way to isolate components to see whether the issue comes from the CPU, the RAM, or the motherboard?
Thank you very much in advance for your help.

Last fiddled with by triangularbasic on 2020-10-10 at 03:38 Reason: adding information.
2020-10-10, 04:40   #2
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·3·1,193 Posts

Failing Large FFT and passing Small FFT usually indicates a memory problem.
Memtest86 confirms this.

Try running your RAM at 3600 instead of 4000.
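
This reasoning can be written down as a rough rule of thumb. Small FFTs mostly stay within the CPU caches, while Large FFTs also exercise RAM and the memory controller. The Python sketch below is a hypothetical helper (the function name, arguments, and result strings are assumptions for illustration), and it encodes a heuristic, not a definitive diagnosis.

```python
def triage(small_fft_ok, large_fft_ok, memtest_errors):
    """Suggest where to look first, given three stress-test results."""
    if not small_fft_ok:
        # Small FFTs mostly stay in cache, so a failure points at the CPU side.
        return "suspect CPU core, cache, voltage, or cooling"
    if not large_fft_ok or memtest_errors > 0:
        # The cache-bound test passes but the memory-heavy tests fail.
        return "suspect memory: RAM, memory settings, or memory controller"
    return "no fault found by these tests"


# Results reported in the first post of this thread.
print(triage(small_fft_ok=True, large_fft_ok=False, memtest_errors=59))
```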
2020-10-10, 06:00   #3
triangularbasic
 
Oct 2020
Australia

7₁₀ Posts

Quote:
Originally Posted by Prime95 View Post
Failing Large FFT and passing Small FFT usually indicates a memory problem.
Memtest86 confirms this.

Try running your RAM at 3600 instead of 4000.
Thank you for your reply,
I have just finished running another set of memory tests, and will try your suggestion next.
I have run memtest86 for 2 passes on each of my two RAM sticks individually. Each stick, tested alone in slot A2, produced no errors over 2 passes. Does this indicate that the memory sticks are not faulty?
2020-10-10, 14:49   #4
Xyzzy
 
"Mike"
Aug 2002

2⁷×61 Posts

Quote:
Originally Posted by triangularbasic View Post
I have run memtest86 for 2 passes on each of my two RAM sticks individually. Each stick, tested alone in slot A2, produced no errors over 2 passes. Does this indicate that the memory sticks are not faulty?
Does your memory pass a torture test if you run with the JEDEC profile? Are you using XMP1 (Asus) or XMP2 (Default)? What are your VCCIO and VCCSA voltages? (These are in the BIOS near the bottom of the "Ai Tweaker" page.)

We would test both sticks together with the JEDEC profile first.

2020-10-10, 16:31   #5
scan80269
 
"Sam"
Jun 2019
California, USA

29 Posts

Getting DDR4 memory to run robustly at 4000 can be tricky. My guess is that your rig will stabilize once you disable XMP profile and run the memory at stock JEDEC 2133 speed, or back the speed down to 3600 like Prime95 suggested.

One of my rigs is quite similar:

System specs:
CPU: i9-9900K Processor, [non-OC].
Motherboard: Gigabyte Z390 Aorus Master.
RAM: Corsair Vengeance LPX CMK16GX4M2E4000C19R 16GB (2x8GB) DDR4, [XMP-4000], installed in slots A2 & B2 as per manual.
PSU: EVGA Supernova 1000T2
GPU: none, currently monitor is connected to MOBO HDMI

I quickly discovered that with the Corsair 2x8GB 4000 memory kit, this system is stable with XMP enabled at 4000, but switching to a Corsair 2x16GB 4000 kit, the system could not even finish POST properly at the 4000 speed! The difference was like day and night.

Another thing I learned is that a system passing Memtest86 repeatedly is not an iron clad guarantee that errors will not occur when running Prime95. A few of my systems dedicated to running Prime95 have memory with XMP settings such as 3333, 3466 and 3600, and every few weeks one system would pick up a Gerbicz error during a PRP run. There had been occasions where the Gerbicz errors became so frequent I had to switch the memory to a slower grade, e.g. from 3466 to 3333, or from 4000 to 3866. I even resorted to reprogramming XMP timings in the SPD of the DDR4 modules (using Thaiphoon Burner) as an alternative to buying different memory kits to try.

So it's best not to push the memory speed too aggressively. Running the memory "fast & dangerous" is often not a good thing, especially with Prime95. Only when a system can pass both Memtest86 and Prime95 torture test for extended periods can it be considered robust.
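
For readers who have not met the Gerbicz error check mentioned above: a PRP test is a long chain of modular squarings, and the check keeps a running product of periodic checkpoints that can be computed in two independent ways. If the two disagree, something went wrong in that block of work. The Python sketch below shows only the idea, on a small modulus; it is not how Prime95 implements it. The function name, the `fault_at` test hook, and the block size are assumptions for illustration.

```python
def prp_with_gerbicz(a, n_squarings, modulus, block, fault_at=None):
    """Square `a` n_squarings times mod `modulus`, checking every `block` steps.

    Returns (result, error_detected). `fault_at` is a hypothetical test hook
    that flips one bit after the given squaring to imitate a memory error.
    """
    x = a % modulus
    d = x  # running product of the checkpoints x_0, x_L, x_2L, ...
    for i in range(1, n_squarings + 1):
        x = (x * x) % modulus
        if i == fault_at:
            x ^= 1  # simulated bit flip
        if i % block == 0:
            # The new product must also equal a * (old product)^(2^block).
            expected = (a * pow(d, 2 ** block, modulus)) % modulus
            d = (d * x) % modulus
            if d != expected:
                return x, True
    return x, False


N = 2 ** 61 - 1  # a small Mersenne prime, used here only as a convenient modulus

print(prp_with_gerbicz(3, 1000, N, block=100))
print(prp_with_gerbicz(3, 1000, N, block=100, fault_at=500))
```

The check costs one extra multiplication per block plus an occasional verification, which is why it can run continuously during real work and catch errors that a separate memory test missed.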

Last fiddled with by scan80269 on 2020-10-10 at 16:34
2020-10-10, 17:02   #6
Xyzzy
 
"Mike"
Aug 2002

17200₈ Posts

We have given up on using memtest to test memory. We actually spent money (!) and bought a package called "Karhu RAM Test". It works and more importantly, it works fast. We already fixed one system in minutes rather than waiting hours, and our time is worth at least $1 an hour, so the software has already paid for itself!

https://www.karhusoftware.com/ramtest/

2020-10-10, 17:51   #7
S485122
 
Sep 2006
Brussels, Belgium

11000111101₂ Posts

Quote:
Originally Posted by triangularbasic View Post
...
I have run memtest86 for 2 passes on each of my two RAM sticks individually. Each stick, tested alone in slot A2, produced no errors over 2 passes. Does this indicate that the memory sticks are not faulty?
Try the sticks in the other slot: it might be the motherboard.

Jacob
2020-10-11, 03:04   #8
triangularbasic
 
Oct 2020
Australia

7 Posts

Quote:
Originally Posted by Xyzzy View Post
Does your memory pass a torture test if you run with the JEDEC profile? Are you using XMP1 (Asus) or XMP2 (Default)? What are your VCCIO and VCCSA voltages? (These are in the BIOS near the bottom of the "Ai Tweaker" page.)

We would test both sticks together with the JEDEC profile first.

Thank you for your reply,
I have not tried running a torture test with the JEDEC profile yet. Previously, when I ran the tests and found errors, I was using XMP 1 in the BIOS. What I have done now is put both RAM sticks back in their original positions, set XMP 2 in the BIOS, and run memtest86 for 2 passes and P95 Large FFT for one hour. Thankfully, no errors were recorded.


BIOS settings under XMP2:

VCCIO is 1.424V
CPU System agent voltage is 1.424V
CPU Core/cache voltage is 0.977V
ASUS MCE is off.
The voltages were set automatically. Do they seem OK?


Also, one more question please: since I did not find any errors when running with XMP 2, could there still be a fault in my CPU's memory controller? Or is that ruled out because P95 Large FFT found no errors, meaning the errors stemmed from XMP 1?

Thank you.

Last fiddled with by triangularbasic on 2020-10-11 at 03:19 Reason: adding information.
2020-10-11, 03:08   #9
triangularbasic
 
Oct 2020
Australia

7 Posts

Quote:
Originally Posted by scan80269 View Post
Getting DDR4 memory to run robustly at 4000 can be tricky. My guess is that your rig will stabilize once you disable XMP profile and run the memory at stock JEDEC 2133 speed, or back the speed down to 3600 like Prime95 suggested.

One of my rigs is quite similar:

System specs:
CPU: i9-9900K Processor, [non-OC].
Motherboard: Gigabyte Z390 Aorus Master.
RAM: Corsair Vengeance LPX CMK16GX4M2E4000C19R 16GB (2x8GB) DDR4, [XMP-4000], installed in slots A2 & B2 as per manual.
PSU: EVGA Supernova 1000T2
GPU: none, currently monitor is connected to MOBO HDMI

I quickly discovered that with the Corsair 2x8GB 4000 memory kit, this system is stable with XMP enabled at 4000, but switching to a Corsair 2x16GB 4000 kit, the system could not even finish POST properly at the 4000 speed! The difference was like day and night.

Another thing I learned is that a system passing Memtest86 repeatedly is not an iron clad guarantee that errors will not occur when running Prime95. A few of my systems dedicated to running Prime95 have memory with XMP settings such as 3333, 3466 and 3600, and every few weeks one system would pick up a Gerbicz error during a PRP run. There had been occasions where the Gerbicz errors became so frequent I had to switch the memory to a slower grade, e.g. from 3466 to 3333, or from 4000 to 3866. I even resorted to reprogramming XMP timings in the SPD of the DDR4 modules (using Thaiphoon Burner) as an alternative to buying different memory kits to try.

So it's best not to push the memory speed too aggressively. Running the memory "fast & dangerous" is often not a good thing, especially with Prime95. Only when a system can pass both Memtest86 and Prime95 torture test for extended periods can it be considered robust.
Thank you for your reply,
What a nice coincidence to find someone with a rig similar to mine, haha! You made a good point about passing both memtest86 and Prime95 before assuming a system is stable. I changed a setting in my BIOS from XMP 1 to XMP 2 (the DIMM's complete default XMP profile) and ran memtest86 for 2 passes and P95 Large FFT for one hour. I think the issue stems from the XMP 1 setting in the BIOS, as thankfully I have not encountered any errors yet running both RAM sticks on XMP 2.
Cheers.
2020-10-11, 03:11   #10
triangularbasic
 
Oct 2020
Australia

7 Posts

Quote:
Originally Posted by S485122 View Post
Try the sticks in the other slot: it might be the motherboard.

Jacob
Thank you for your reply,
I have changed a setting in my BIOS from XMP 1 to XMP 2 and thankfully have not encountered any errors yet. However, in the event that I do, would you recommend installing the RAM sticks in slots A1 and B1 to test the motherboard?
Thank you.
2020-10-11, 07:56   #11
S485122
 
Sep 2006
Brussels, Belgium

1597₁₀ Posts

Quote:
Originally Posted by triangularbasic View Post
Thank you for your reply,
I have changed a setting in my BIOS from XMP 1 to XMP 2 and thankfully have not encountered any errors yet. However, in the event that I do, would you recommend installing the RAM sticks in slots A1 and B1 to test the motherboard?
Thank you.
I was reacting to the fact that the memory sticks gave no errors when you tested them individually in slot A2. I would have tested at least one of the sticks as the only memory, but in slot B2, to eliminate a bad slot as the source of the problem. In the meantime you found a working setting.

Jacob
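
The slot-isolation idea above amounts to a small test matrix: each stick alone in each slot, skipping the combinations already run. A minimal Python sketch, assuming two sticks and the two slots named in this thread (the function name and stick labels are made up for illustration):

```python
from itertools import product


def remaining_runs(sticks, slots, already_tested):
    """List single-stick (stick, slot) runs that have not been done yet."""
    return [run for run in product(sticks, slots) if run not in already_tested]


sticks = ["stick 1", "stick 2"]
slots = ["A2", "B2"]
# Both sticks were already tested alone in slot A2 earlier in the thread.
done = {("stick 1", "A2"), ("stick 2", "A2")}

for stick, slot in remaining_runs(sticks, slots, done):
    print(f"memtest86: {stick} alone in slot {slot}")
```

If a stick that passed in A2 fails in B2, the slot or its wiring becomes the suspect. If every combination passes, the fault is more likely in the settings used when both sticks run together.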
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.