#1 |
"J. Gareth Moreton"
Feb 2015
Nomadic
2×3²×5 Posts |
I'm sure this is everyone's worst nightmare, but here goes...
This morning, I arrived at my workplace to find my workstation had shut down. Not thinking much of that, apart from the minor inconvenience of having lost some prime number search time, I discovered to my horror that it wouldn't boot... the keyboard didn't activate, the monitors didn't get a signal, and after about a minute, one of the cooling fans started to sound like a jet engine.
Initially I thought that Prime95 had somehow caused the CPU to burn out, but after some diagnostics with the in-house support team, we found that the PSU had failed and the fact that I was running a prime number checker was just a coincidence. Everything else in the computer still works, but they can't simply replace the bust PSU due to 'warranty'.
In the meantime I have been given a temporary replacement machine (annoyingly, less powerful than my original workstation). I did mention that I may want to read the hard drive of my old computer (to recover the progress made by Prime95, since one of the tests was a 100-million-digit test that had been running for over 100 days), although I'm not sure if I'll be able to get access to that hard drive again. If the worst comes to the worst, would those tests have to be started again from scratch, or can they be partly recovered (I don't know if partial residues are ever sent to the server)?
#2 |
Aug 2015
2²·17 Posts |
Partial residues are never sent to the server. You'll need to start them from scratch in the worst case scenario.
#3 |
∂²ω=0
Sep 2002
República de California
10110111010000₂ Posts |
For very long runs like that, I suggest making a habit of copying one of the redundant residue files every 10M iterations or so (I like to append the approx. iteration count in M to the filename, e.g. [save].130M, to make it unique) and offloading it to somewhere else. Live and learn.
Good luck with the data recovery, in any event!
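The copy-and-rename habit described above could be scripted along these lines. This is only a minimal sketch: the function name `backup_save_file`, the file name `savefile`, and the backup location are hypothetical placeholders, not anything Prime95 itself provides, and the iteration count has to be supplied by hand.

```python
import shutil
from pathlib import Path


def backup_save_file(save_file: Path, backup_dir: Path, iterations: int) -> Path:
    """Copy save_file into backup_dir, tagging the copy with the
    approximate iteration count in millions (e.g. 'savefile.130M')."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    tag = f"{iterations // 1_000_000}M"          # 130_000_000 -> "130M"
    target = backup_dir / f"{save_file.name}.{tag}"
    shutil.copy2(save_file, target)              # copy2 also keeps timestamps
    return target
```

For example, `backup_save_file(Path("savefile"), Path("/mnt/backup/prime95"), 130_000_000)` would leave a copy named `savefile.130M` in the (hypothetical) backup directory. Copying a file while the program is in the middle of writing it may produce an incomplete copy, which is presumably one reason to pick one of the redundant files rather than the one being updated.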
#4 |
"J. Gareth Moreton"
Feb 2015
Nomadic
132₈ Posts |
And I just realised that partial residues would be p bits long anyway (from 2^p - 1), much too long to send to a server on a periodic basis. I'll see if I can recover the partially completed work.
Oh well, you live and learn!
#5 |
Einyen
Dec 2003
Denmark
3313₁₀ Posts |
The line "InterimFiles=10000000" in prime.txt will save a full backup file every 10M iterations, which for a 332M+ exponent will probably be >40 MB each?
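A quick back-of-the-envelope check of that size estimate, as a sketch only. It assumes an illustrative exponent of 332,200,000 (standing in for "332M+"), a Lucas-Lehmer test of p - 2 iterations, and that each file holds at least the raw p-bit residue; real save files also carry some header data, so the figures are lower bounds.

```python
def residue_bytes(p: int) -> int:
    """Bytes needed to hold a p-bit residue, rounded up."""
    return (p + 7) // 8


def interim_file_count(p: int, interval: int) -> int:
    """How many multiples of `interval` fall within the p - 2
    iterations of a Lucas-Lehmer test."""
    return (p - 2) // interval


p = 332_200_000          # illustrative exponent, not a real assignment
interval = 10_000_000    # the InterimFiles value quoted above

size = residue_bytes(p)
count = interim_file_count(p, interval)
print(f"residue size : {size / 1e6:.1f} MB")
print(f"interim files: {count}")
print(f"total space  : {size * count / 1e9:.2f} GB")
```

Under these assumptions each file is about 41.5 MB, so a full run would leave roughly 33 interim files taking a little under 1.4 GB in total.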
#6 |
Romulan Interpreter
"name field"
Jun 2011
Thailand
3⁵·41 Posts |
@OP:
If you can't get access to the HDD, you will have to do the tests again from scratch. OTOH, P95 and the failure of the PSU may not be coincidental. If the PSU was already near its limit (say "a 500W" or "a 750W" PSU, depending on the other HW you had in the box), then the additional, continuous stress P95 puts on it could blow it (as opposed to "normal work" such as Word, Excel, compiling, etc., which can still draw more power occasionally, but not continuously for a long time, so the MOSFETs in the PSU have some time to "cool down").
Last fiddled with by LaurV on 2015-09-02 at 14:35 Reason: added @OP for clarity
#7 |
"J. Gareth Moreton"
Feb 2015
Nomadic
2·3²·5 Posts |
Hmmm, that's a good point. The PSU was only 240W (the computer normally doesn't need much power; it doesn't have a dedicated graphics card, for example), so Prime95 might have pushed it to breaking point. That's something I'd better investigate, actually.
In the meantime, I've got the old computer back with a replaced PSU. I've also made sure that Prime95's progress files and work list are saved to a network store.