#1 |
Jan 2019
24 Posts |
I am getting the following message from mprime on an AMD Ryzen 9 3950X (16 cores, 32 threads) with 32 GB of memory, running Ubuntu 18.04.5 LTS:
"Hardware errors have occurred during the test! 1 Gerbicz/double-check error. Confidence in final result is excellent." Can anyone help me understand what is happening? I have had this machine since January 2020, so it is relatively new. Do I have an actual "hardware problem"?
#2 |
If I May
"Chris Halsall"
Sep 2002
Barbados
2·3·5·373 Posts |
Possibly.
How hard have you pushed it (read: overclocked)? Is this a new build whose limits you are trying to test? Or is this a machine you've been using for a while, and suddenly it's reporting this? Perspiring minds want to know...
#3 |
Romulan Interpreter
"name field"
Jun 2011
Thailand
1028510 Posts |
One error is not a big deal. Even the most stable hardware has occasional errors (power glitches, cosmic rays, bad luck, etc.). The message repeats periodically until the test finishes, as a reminder, but it refers to the same single error that occurred in the past. Nothing to worry about.
The message will go away once you finish the exponent, report the result, and start a new exponent. More errors would be a reason to worry. If the count starts growing, or errors appear regularly on subsequent tests, then yes, you may have a hardware issue. In the meantime, monitor the temperatures closely. If they rise, reduce the clocks, clean the dust out of the fans, or, in the worst case, think about re-seating the CPU. Right now, do nothing (besides monitoring the system).
Last fiddled with by LaurV on 2020-08-27 at 07:31
#4 |
"Viliam Furík"
Jul 2018
Martin, Slovakia
14408 Posts |
I don't think that could help, since AMD uses a PGA socket, which has the pins on the CPU and the holes in the socket. If something were wrong there, it would have to be one of these:
1. The CPU sits a tiny bit higher than it should. But that would most probably make the whole CPU inoperable.
2. One pin is missing and all the other pins are in place. THAT would be very interesting, but if the CPU is working, it would most probably cause some RAM not to be detected.
#5 |
"Yves"
Jul 2017
Belgium
5416 Posts |
#6 |
Romulan Interpreter
"name field"
Jun 2011
Thailand
1028510 Posts |
Re-seating has nothing to do with the pins side. Or, well, it has
Last fiddled with by LaurV on 2020-08-27 at 18:49 Reason: s/both/bought/g
#7 |
Jan 2019
24 Posts |
I have dusted out the PC and restarted "mprime", but I am still getting the same error message. I am not overclocking the CPU (AMD 3950X), and I have "Throttle = 30" set, so the CPU runs 30% of the time. The CPU temperature is about 50 °C.
I am using 4 workers with 4 threads each, as shown below:
Resuming primality test of M54111917 using FMA3 FFT length 2880K, Pass1=1280, Pass2=2304, clm=2, 4 threads
Resuming Gerbicz error-checking PRP test of M103884359 using FMA3 FFT length 5600K, Pass1=896, Pass2=6400, clm=2, 4 threads
Resuming Gerbicz error-checking PRP test of M103884401 using FMA3 FFT length 5600K, Pass1=896, Pass2=6400, clm=2, 4 threads
Resuming Gerbicz error-checking PRP test of M105836671 using FMA3 FFT length 5600K, Pass1=896, Pass2=6400, clm=2, 4 threads
Note that the first exponent, M54111917, uses an FFT length of 2880K, while the others (where I believe the "hardware error" comes from) use an FFT length of 5600K. Is this a problem? Is there a way to stop this calculation and start completely fresh with a new set of 4 exponents? The "Hardware error" message has appeared only recently. Thanks in advance for any help you can provide.
Last fiddled with by rgirard1 on 2020-08-27 at 18:35
#8 |
Aug 2002
23·29·37 Posts |
What speed are you running your memory? What kind of memory is it? Have you tried a memory test?
https://www.memtest86.com/download.htm
#9 |
Aug 2002
23×29×37 Posts |
Also, have you run the torture test?
./mprime -m
Select the torture test option. The defaults are fine.
#10 | |
P90 years forever!
Aug 2002
Yeehaw, FL
200716 Posts |
Do not use Throttle. Nowadays heat is rarely the cause of hardware problems; usually they are memory related.
Do not restart your calculations. The PRP error-checking has caught and corrected the problem. Your results will be just fine.
Right now you should do nothing. Just keep an eye on things. If you get more errors (do not worry about prime95 whining about the one error that has already occurred), then look at raising the memory voltage or reducing the RAM speed.
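For readers wondering how a PRP test can catch and correct an error on its own, here is a toy Python sketch of the idea behind the Gerbicz check. It is only an illustration under stated assumptions, not prime95's actual code: the function name `prp_with_gerbicz_check`, the tiny block size, and the `flip_at` fault injection are all hypothetical. The test repeatedly squares 3 modulo 2^p - 1, keeps a running product of the values seen at every block boundary, and verifies that product with one extra modular power. If the check fails, it rolls back to the last verified state and redoes the block.

```python
def prp_with_gerbicz_check(p, block=4, flip_at=None):
    """Toy PRP run: compute 3^(2^p) mod (2^p - 1) with a Gerbicz-style check.

    Returns (residue, errors_caught). `block` and `flip_at` are illustrative
    parameters, not prime95 settings. `flip_at` simulates a one-bit hardware
    error right after the given squaring.
    """
    n = (1 << p) - 1
    a = 3
    x = a              # u_0
    d = a              # checksum: product of u_0, u_L, u_2L, ...
    saved = (0, x, d)  # last verified (iteration, x, checksum)
    errors = 0
    i = 0
    while i < p:
        x = x * x % n
        i += 1
        if flip_at is not None and i == flip_at:
            x ^= 1           # inject a single bit flip, once only
            flip_at = None
        if i % block == 0:
            d_new = d * x % n
            # Identity: new checksum == u_0 * (old checksum)^(2^block) mod n
            if d_new != a * pow(d, 1 << block, n) % n:
                errors += 1
                i, x, d = saved   # roll back and redo the block
                continue
            d = d_new
            saved = (i, x, d)
    # Note: squarings after the last full block are unchecked in this sketch.
    return x, errors


if __name__ == "__main__":
    # M13 = 8191 is prime, so the residue should be 9 either way.
    print(prp_with_gerbicz_check(13))
    print(prp_with_gerbicz_check(13, flip_at=8))
```

In the second call the injected error is detected at the block boundary, the block is recomputed from the saved state, and the final residue is unaffected. That is the sense in which one reported Gerbicz error does not spoil the result.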
#11 |
Jan 2019
24 Posts |
I ran the torture test for over 48 hours with default settings, no overclocking, and no Throttle=30; basically the machine's normal state. In "results.txt" I got a very long listing like this:
. . .
[Sun Aug 30 09:36:01 2020]
Self-test 240K passed!
Self-test 256K passed!
Self-test 256K passed!
. . .
Self-test 256K passed!
[Sun Aug 30 09:41:11 2020]
Self-test 280K passed!
That is, all the self-tests passed and there were no error messages. I conclude that hardware problems with my desktop are unlikely. I will stop the torture test and resume the prime95 calculations, and if there are error messages I will let them be until new exponents are assigned after the current calculations complete. I wonder whether the "restart" files are somehow corrupted and causing these error messages. I do stop prime95 when I must do a Software Update for Ubuntu 18.04 and then resume the calculations afterwards. Is there a way to start a "fresh" new calculation with newly assigned exponents?
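Reading a 48-hour "results.txt" by eye is tedious, so a small script can do the scan. The Python sketch below assumes the file contains only the two kinds of lines quoted above (bracketed timestamps and "Self-test NNNK passed!"). The function name `summarize` is hypothetical, and since the wording of a failing line is not shown in this thread, the script simply reports anything it does not recognize.

```python
import re

# Patterns taken from the lines quoted in the post above.
PASSED = re.compile(r"Self-test (\d+)K passed!")
TIMESTAMP = re.compile(
    r"\[[A-Z][a-z]{2} [A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d \d{4}\]"
)


def summarize(text):
    """Return (pass_count, sorted FFT sizes in K, unrecognized lines)."""
    sizes = PASSED.findall(text)
    # Strip everything we recognize; whatever remains deserves a look.
    leftover = TIMESTAMP.sub("", PASSED.sub("", text))
    unrecognized = [ln.strip() for ln in leftover.splitlines() if ln.strip()]
    return len(sizes), sorted({int(s) for s in sizes}), unrecognized
```

To use it, read the file and pass the contents in, for example `summarize(open("results.txt").read())`. An empty third element means every line was either a timestamp or a passed self-test.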