mersenneforum.org Prime95 30.8 (big P-1 changes, see post #551)

2021-12-03, 22:36   #89
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

5·7·227 Posts

Quote:
 Originally Posted by R. Gerbicz What is the third number on this line: "D: 1050, 120x403 polynomial multiplication"? Just guessing that the 2nd is eulerphi(1050)/2 = 120, but what is the 403?
D is the traditional step size incrementing from B1 to B2. 120 is eulerphi(1050)/2.
We create a polynomial with 120 coefficients that must be evaluated at multiples of D.

Montgomery/Silverman/Kruppa show how to evaluate the polynomial at multiple points using polynomial multiplication.
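The idea can be illustrated with a toy remainder-tree multipoint evaluation (a sketch over plain Python integers with schoolbook arithmetic; Prime95's real code works modulo 2^p-1 with large FFT-based multiplies):

```python
# Toy illustration of evaluating one polynomial at many points using
# only polynomial multiplication and remainder (the idea behind the
# Montgomery/Silverman/Kruppa approach). Coefficients are stored
# lowest-degree first. NOT Prime95's actual implementation.

def poly_mul(a, b):
    """Schoolbook product; Prime95 uses FFT-based multiplication."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_mod(a, m):
    """Remainder of a divided by monic m."""
    a = list(a)
    while len(a) >= len(m):
        c = a.pop()
        if c:
            for k in range(len(m) - 1):
                a[len(a) - len(m) + 1 + k] -= c * m[k]
    return a

def multipoint_eval(f, points):
    """Evaluate f at all points via a product/remainder tree."""
    if len(points) == 1:
        r = poly_mod(f, [-points[0], 1])   # f mod (x - p) = f(p)
        return [r[0] if r else 0]
    mid = len(points) // 2
    out = []
    for part in (points[:mid], points[mid:]):
        m = [1]
        for p in part:                      # m = product of (x - p)
            m = poly_mul(m, [-p, 1])
        out += multipoint_eval(poly_mod(f, m), part)
    return out

# f(x) = x^3 + 2x + 1 at x = 1, 2, 3
print(multipoint_eval([1, 2, 0, 1], [1, 2, 3]))   # [4, 13, 34]
```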

The 403 is the number of polynomial coefficients I can allocate for the second polynomial. FFT size and available memory dictate this number.

A single polynomial multiply evaluates the first polynomial at 403 - 2*120 + 1 = 164 points, thus advancing toward B2 in steps of 1050 * 164 = 172200.
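The three numbers in the "D: 1050, 120x403" line can be reproduced with standard-library Python (a sketch of the arithmetic only, not Prime95's code):

```python
# Reproduce the numbers in Prime95's "D: 1050, 120x403 polynomial
# multiplication" status line using only the standard library.
from math import gcd

def totient(n):
    """Euler's phi by trial counting -- fine for small n like 1050."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

D = 1050
poly1_len = totient(D) // 2              # eulerphi(1050)/2 = 120 coefficients
poly2_len = 403                          # dictated by FFT size / free memory
points = poly2_len - 2 * poly1_len + 1   # points evaluated per polymult
step = D * points                        # advance toward B2 per multiply

print(poly1_len, points, step)           # 120 164 172200
```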

2021-12-03, 22:39   #90
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

5·7·227 Posts

Quote:
 Originally Posted by axn Something's not quite right here. The 24GB option shows about 20% less transforms, yet sees no significant improvement in elapsed time.
The number of transforms is only part of the stage 2 cost. The other significant cost is the polynomial multiplies. At present, there is no data output on the number of polymults or how expensive they were.
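As a toy illustration of why the transform count alone does not predict elapsed time (invented numbers and a stand-in cost function; not Prime95's actual accounting):

```python
# Toy stage-2 cost model: total work is transforms PLUS polynomial
# multiplies, so ~20% fewer transforms need not mean less wall time.
# The cost function and all plan numbers below are made up for
# illustration only.
from math import log2

def polymult_cost(n):
    """Rough cost of one FFT-based multiply of degree-n polynomials,
    proportional to n log n."""
    return n * log2(2 * n)

def stage2_cost(transforms, polymults, poly_len):
    """Transforms cost 1 unit each; each polymult costs polymult_cost."""
    return transforms + polymults * polymult_cost(poly_len)

# Hypothetical plans: the second does 20% fewer transforms but pays
# more per polynomial multiply because the polynomials are larger.
low_mem  = stage2_cost(transforms=1_000_000, polymults=500, poly_len=120)
high_mem = stage2_cost(transforms=800_000, polymults=300, poly_len=403)
print(low_mem, high_mem)   # the "fewer transforms" plan can cost more overall
```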

2021-12-04, 02:24   #91
axn

Jun 2003

2×2,693 Posts

Quote:
 Originally Posted by petrw1 Would folder 1's prime.txt have UsePrimenet=0? I'd prefer it send in both stages as 1 result.
Sure. In fact, I use that setting in /both/ folders and report the results manually.
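For reference, a minimal sketch of the relevant prime.txt line (with this set, nothing is reported automatically, so you submit results yourself):

```ini
; in prime.txt -- disable automatic PrimeNet reporting
UsePrimenet=0
```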

Quote:
 Originally Posted by Prime95 The number of transforms is only part of the stage 2 cost. The other significant cost is the polynomial multiplies. At present, there is no data output on the number of polymults or how expensive they were.
Gotcha.

2021-12-04, 11:43   #92
SethTro

"Seth"
Apr 2019

19·23 Posts

With MaxHighMemoryWorkers=1, 30.8 build 2 will resume two high-memory workers at the same time.
Code:
$ cat worktodo.txt
[Worker #1]
Pminus1=1,2,50111,-1,3000000,1000000000
[Worker #2]
Pminus1=1,2,50227,-1,6000000,10000000000
[Worker #3]
Pminus1=1,2,50263,-1,9000000,100000000000
Code:
five:~/Downloads/GIMPS/p95$ ./mprimev308b2 -m -d
[Main thread Dec 4 03:39] Mersenne number primality test program version 30.8
[Main thread Dec 4 03:39] Optimizing for CPU architecture: AMD Zen, L2 cache size: 12x512 KB, L3 cache size: 4x16 MB
Your choice: 4
Worker to start, 0=all (0): 0
Your choice:
[Main thread Dec 4 03:39] Starting workers.
[Worker #2 Dec 4 03:39] Waiting 5 seconds to stagger worker starts.
[Worker #3 Dec 4 03:39] Waiting 10 seconds to stagger worker starts.
[Worker #1 Dec 4 03:39] P-1 on M50111 with B1=3000000, B2=1000000000
[Worker #2 Dec 4 03:39] P-1 on M50227 with B1=6000000, B2=10000000000
[Worker #3 Dec 4 03:39] P-1 on M50263 with B1=9000000, B2=100000000000
[Worker #1 Dec 4 03:39] M50111 stage 1 complete. 8656318 transforms. Total time: 22.501 sec.
[Worker #1 Dec 4 03:39] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 0.002 sec.
[Worker #1 Dec 4 03:39] Available memory is 7916MB.
[Worker #1 Dec 4 03:39] Using 7916MB of memory.  D: 510510, 46080x279844 polynomial multiplication.
...
[Worker #2 Dec 4 03:40] M50227 stage 1 complete. 17311478 transforms. Total time: 45.504 sec.
[Worker #2 Dec 4 03:40] Exceeded limit on number of workers that can use lots of memory.
[Worker #2 Dec 4 03:40] Looking for work that uses less memory.
[Worker #2 Dec 4 03:40] No work to do at the present time. Waiting.
...
[Worker #3 Dec 4 03:40] M50263 stage 1 complete. 25971112 transforms. Total time: 68.424 sec.
[Worker #3 Dec 4 03:40] Exceeded limit on number of workers that can use lots of memory.
[Worker #3 Dec 4 03:40] Looking for work that uses less memory.
[Worker #3 Dec 4 03:40] No work to do at the present time. Waiting.
...
[Worker #1 Dec 4 03:41] Stage 2 GCD complete. Time: 0.001 sec.
[Worker #1 Dec 4 03:41] M50111 completed P-1, B1=3000000, B2=95867651880, Wi8: 53020C14
[Worker #1 Dec 4 03:41] No work to do at the present time. Waiting.
[Worker #2 Dec 4 03:41] Restarting worker with new memory settings.
[Worker #3 Dec 4 03:41] Restarting worker with new memory settings.
[Worker #2 Dec 4 03:41] Resuming.
[Worker #3 Dec 4 03:41] Resuming.
...
[Worker #2 Dec 4 03:41] P-1 on M50227 with B1=6000000, B2=10000000000
[Worker #3 Dec 4 03:41] P-1 on M50263 with B1=9000000, B2=100000000000
Segmentation fault (core dumped)
2021-12-05, 02:29   #93
techn1ciaN

Oct 2021
U. S. / Maine

2×73 Posts

Quote:
 Originally Posted by lisanderke 14923 I received 2888 GHzDs of P-1 credit for this workload while it took me probably less than an hour to complete; perhaps the credit given should be recalculated after a full release of 30.8 or later versions!
A counterpoint: Systems with very large RAM allocations are scarce. Since 30.8's wildly impressive / "headline" improvements only seem possible with lots of RAM allocated, leaving the credit formula where it is might offer a good incentive for the owners of RAM-rich systems to run what their hardware would be most valuable for, i.e. P-1.

By the logic of your suggestion, we should also recompute the TF credit formula, since the current one dates from when TF was done on CPUs, even though today's TF runs on GPUs with vastly greater throughput. While superficially reasonable, this would probably be a mistake: the "inflated" credit on offer is exactly what incentivizes GPU owners to run TF, which their hardware is more efficient at, rather than primality testing, which it is less efficient at.

Last fiddled with by techn1ciaN on 2021-12-05 at 02:30 Reason: Clarifying adjective

2021-12-05, 03:02   #94
alpertron

Aug 2002
Buenos Aires, Argentina

145410 Posts

It appears that 30.8 runs P-1 faster than previous versions not only when there are large amounts of RAM, but also on small exponents. In my case (using 8GB of RAM on an i5-3470), Prime95 required 5 days to get the following:
Code:
processing: P-1 no-factor for M9325159 (B1=50,000,000, B2=50,001,265,860)
CPU credit is 1312.7590 GHz-days. Notice that the file worktodo.txt already had the known factors, but no new factors were found. The difference between 1 hour and 5 days (to get half the credit) cannot be explained only by the amount of RAM in the system.
2021-12-05, 04:36   #95
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

1F0916 Posts

Quote:
 Originally Posted by axn This is repeatable. Multiple restarts with build 2 all yielded same behavior - top shows consistently at 200% instead of the expected high 500%
Found it. Somehow I accidentally overwrote the affinity changes that were in build 1.

2021-12-05, 04:49   #96
axn

Jun 2003

2·2,693 Posts

Quote:
 Originally Posted by Prime95 Found it. Somehow I accidentally overwrote the affinity changes that were in build 1

Anyway, whenever you release build 3(?) (with this and other bug fixes), I'll switch over from build 1, which so far seems to be working fine for my use case.

2021-12-05, 07:04   #98
axn

Jun 2003

2·2,693 Posts

Wow! 330s -> 212s
2021-12-05, 07:44   #99
Luminescence

Oct 2021
Germany

1678 Posts

Quote:
Not sure if this is just a visual bug, but when running a 12-core CPU (Ryzen 9 5900X) with all cores on one worker and setting this option to 12, Prime95 (at least visually) claims to assign all 12 extra polymult helper threads to core 1.

With this setting enabled, stage 2 went from 745s to 725s on a 25.6M exponent (B1/B2 = 700k and 450M).
Stage 2 init went from ~90s to ~60s, though.

Last fiddled with by Luminescence on 2021-12-05 at 07:58 Reason: Last line

