#23 | ||
Sep 2022
Munich, Germany
16₁₀ Posts |
Quote:
I'll try the underclocking test as you mentioned, just to help sort this out. Your theory about cores and extra stress does not explain, though, why two different machines see the exact same limit on memory. Or why it happens only with the 240k size, only with the 30.x versions, and always at the same PC (program counter) with a reference to the same illegal address.
I am a rather experienced programmer myself. If it looks like a race condition, then it probably is one … I have seen race conditions pass undetected for years …
BTW, Prime95 doesn't even stress my machine much. Programs like Y Cruncher or Cinebench cause higher CPU temperatures. I know, that means little when a lot of memory is being accessed. But then, some sort of instability should emerge with v29.6 too, shouldn't it?
I added the option AffinityVerbosityTorture=1 to prime.txt, but v30.8 with FFT size 240k still fails like this and eventually crashes:
Quote:
I have now run the underclocking tests. I almost halved the clock (to 2 GHz, as monitored during execution) and also increased the DRAM voltage a notch. v30.8 still fails in exactly the same manner. I also tried to overclock to 4 GHz (though it actually ran at about 3.5 GHz), and v29.6 ran with no issues. Hope this helps to pin this down.
Last fiddled with by falk on 2022-09-27 at 23:23 |
||
#24 |
If I May
"Chris Halsall"
Sep 2002
Barbados
2²×2,767 Posts |
Join the very large club.
Correlation does not *necessarily* mean causality. This is why experienced programmers carefully observe the problem in situ. |
#25 | |||
P90 years forever!
Aug 2002
Yeehaw, FL
8158₁₀ Posts |
Quote:
Quote:
Quote:
|
|||
#26 | |
Sep 2022
Munich, Germany
20₈ Posts |
Quote:
If you really wanted to help, you would run v30.8 on a machine with 128GB+ of RAM and 10+ cores, ideally Intel.
BTW, correlation does indeed not imply causality (which is why I said "probably"), but according to the scientific method, correlation is all you can measure and causality is always just theorized. In my last few posts above, I used "hypothesis" in my title for a reason.
The nastiest bug I ever chased cost me 3 months of my life, most of it spent convincing others that there was a bug in the first place (it was a fully valid program that, when run, made a supercomputer reboot without leaving any trace that my program had ever existed...). Fortunately, this time I don't depend on a fix.
Last fiddled with by falk on 2022-09-27 at 23:38 |
|
#27 | |
Sep 2022
Munich, Germany
2⁴ Posts |
Quote:
Race conditions often disappear when debug logging is added. So it would be interesting to have some fine-grained control over log verbosity. But maybe let's first find a third machine with the same issue …
Tomorrow, I'll test v30.9. I saw it on your FTP server. If I find the time, I may do a binary search for the exact version 30.x B.y in which the issue first emerged. Would that help?
Last fiddled with by falk on 2022-09-28 at 00:14 |
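The binary search over builds mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the name `first_bad_build`, the `is_bad` callback, and any build labels passed in are hypothetical and not part of Prime95. It also assumes the builds are ordered oldest to newest and that every build after the first failing one fails too. Because a suspected race condition may fail only intermittently, `is_bad` would have to run the torture test long enough (or often enough) before declaring a build good.

```python
from typing import Callable, Optional, Sequence


def first_bad_build(builds: Sequence[str],
                    is_bad: Callable[[str], bool]) -> Optional[str]:
    """Return the earliest build for which is_bad(build) is True.

    Assumes builds are ordered oldest to newest and that once a build
    is bad, every later build is bad too. Returns None if none is bad.
    """
    lo, hi = 0, len(builds)      # the first bad index lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid             # first bad build is mid or earlier
        else:
            lo = mid + 1         # first bad build is after mid
    return builds[lo] if lo < len(builds) else None
```

With n candidate builds this needs about log2(n) test runs instead of n, which matters when each run of the failing torture test takes a while.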
|
#28 |
P90 years forever!
Aug 2002
Yeehaw, FL
2·4,079 Posts |
Please try p95v308b17.win64.zip
|
#29 |
"Mihai Preda"
Apr 2015
2²×19² Posts |
Should be fixed in the new build posted. Was affecting systems with large (>64GB) RAM.
|
#30 | |
Sep 2022
Munich, Germany
2⁴ Posts |
Quote:
@preda It seems to be fixed, thanks for your fix.
Do GIMPS participants need to be concerned about possibly wrong results having been posted?
The issue seems to be resolved. I learned a lot about Mersenne primes along the way, and about how to do integer multiplication right :) |
|
#31 | ||
P90 years forever!
Aug 2002
Yeehaw, FL
2·4,079 Posts |
Quote:
Quote:
|
||
#32 |
"Mihai Preda"
Apr 2015
2²·19² Posts |
Not my fix; George found the issue and fixed it. I was only affected by it, and I did a thorough hardware debug (replaced the CPU, replaced all memory, replaced subsets of DIMMs, etc.) to reach the conclusion that it was unlikely to be a HW issue. Afterwards I tested the candidate build and all was fine, so it looked like the fix was good.
|
#33 |
If I May
"Chris Halsall"
Sep 2002
Barbados
2B3C₁₆ Posts |