mersenneforum.org  

Old 2021-11-24, 21:21   #12
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

2²·1,321 Posts

Quote:
Originally Posted by Luminescence View Post
Are there any diminishing returns? I can run 2 workers with ~50GB each or one with 100-110GB
Depends on how big B2 is and how big the input is. Once it's available, experiment. For inputs from this project, two workers with 50GB each may be better, but for larger inputs a single worker would be. If memory use is like GMP-ECM's, it scales linearly with input size and with the square root of B2.
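
As a rough illustration of that scaling rule, here is a small Python sketch that rescales a known memory figure to a different input size and B2. It assumes the GMP-ECM-like behaviour (memory proportional to input size times the square root of B2) really carries over, which nobody has confirmed for 30.8 yet. The function name and the reference numbers are made up for the example.

```python
import math

def scaled_memory_gb(ref_gb, ref_bits, ref_b2, new_bits, new_b2):
    """Rescale a reference memory figure.

    Assumption (not confirmed for Prime95 30.8): memory grows linearly
    with input size and with the square root of B2.
    """
    return ref_gb * (new_bits / ref_bits) * math.sqrt(new_b2 / ref_b2)

# Hypothetical reference point: 50 GB for a 1,000,000-bit input at B2 = 1e12.
same_input_4x_b2 = scaled_memory_gb(50, 1_000_000, 1e12, 1_000_000, 4e12)
double_input_same_b2 = scaled_memory_gb(50, 1_000_000, 1e12, 2_000_000, 1e12)

print(same_input_4x_b2, double_input_same_b2)
```

Under this model, quadrupling B2 costs as much memory as doubling the input size, which is why the answer to "one big worker or two smaller ones" depends on both.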
Old 2021-11-25, 04:16   #13
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7888₁₀ Posts
Prime95 30.8 (pre-beta) (FOR P-1 USERS ONLY; SMALL EXPONENTS ONLY)

For giggles, I tried P-1 on M80071, B1=200M. It appears that the code that caps B2 at 999*B1 needs to change.
B2 = 76 billion in under 2 minutes!

Code:
[Work thread Nov 24 22:56] M80071 stage 1 complete. 798217228 transforms. Total time: 3795.041 sec.
[Work thread Nov 24 22:56] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 0.004 sec.
[Work thread Nov 24 22:56] Switching to FMA3 FFT length 5K using large pages
[Work thread Nov 24 22:56] With trial factoring done to 2^85, optimal B2 is 293*B1 = 58600000000.
[Work thread Nov 24 22:56] Using 6791MB of memory.  D: 270270, 25920x142152 polynomial multiplication.
[Work thread Nov 24 22:56] Stage 2 init complete. 998106 transforms. Time: 31.144 sec.
[Work thread Nov 24 22:58] M80071 stage 2 complete. 2815495 transforms. Total time: 101.937 sec.
[Work thread Nov 24 22:58] Stage 2 GCD complete. Time: 0.003 sec.
[Work thread Nov 24 22:58] M80071 completed P-1, B1=200000000, B2=76673707110, Wi8: E437AD7F
I'm going to try a few more and see if I can find a new factor.
Old 2021-11-25, 05:47   #14
Luminescence
 
Oct 2021
Germany

2²×5² Posts

Quote:
Originally Posted by Prime95 View Post
For giggles, I tried P-1 on M80071, B1=200M. It appears that the code that caps B2 at 999*B1 needs to change.
B2 = 76 billion in under 2 minutes!

Code:
[Work thread Nov 24 22:56] M80071 stage 1 complete. 798217228 transforms. Total time: 3795.041 sec.
[Work thread Nov 24 22:56] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 0.004 sec.
[Work thread Nov 24 22:56] Switching to FMA3 FFT length 5K using large pages
[Work thread Nov 24 22:56] With trial factoring done to 2^85, optimal B2 is 293*B1 = 58600000000.
[Work thread Nov 24 22:56] Using 6791MB of memory.  D: 270270, 25920x142152 polynomial multiplication.
[Work thread Nov 24 22:56] Stage 2 init complete. 998106 transforms. Time: 31.144 sec.
[Work thread Nov 24 22:58] M80071 stage 2 complete. 2815495 transforms. Total time: 101.937 sec.
[Work thread Nov 24 22:58] Stage 2 GCD complete. Time: 0.003 sec.
[Work thread Nov 24 22:58] M80071 completed P-1, B1=200000000, B2=76673707110, Wi8: E437AD7F
I'm going to try a few more and see if I can find a new factor.
Holy smokes, that’s a massive boost to P-1. You guys are some truly brilliant minds.

Old 2021-11-25, 06:14   #15
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

12075₈ Posts

Quote:
Originally Posted by Prime95 View Post
For giggles, I tried P-1 on M80071, B1=200M. It appears that the code that caps B2 at 999*B1 needs to change.
B2 = 76 billion in under 2 minutes!

Code:
...
[Work thread Nov 24 22:56] With trial factoring done to 2^85, optimal B2 is 293*B1 = 58600000000.
...
Why would it say 2^85?
Does this have something to do with how much ECM has been done?

And...with Stage 2 being so much faster and supporting larger values for B2 ... might there be a chance to use it to find more factors of the smallest unfactored exponents? Maybe those under 20,000?

Last fiddled with by petrw1 on 2021-11-25 at 06:21 Reason: And...
Old 2021-11-25, 06:31   #16
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

1ED0₁₆ Posts

Quote:
Originally Posted by petrw1 View Post
Why would it say 2^85? Does this have something to do with how much ECM has been done?
Yes. That was just my complete-shot-in-the-dark guess as to ECM's equivalent TF.

I upped B1 to 250M and fixed the 999x cap. B2 = 4.45 trillion in an hour and a half.

Code:
[Work thread Nov 24 23:42] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 0.004 sec.
[Work thread Nov 24 23:42] Switching to FMA3 FFT length 5K using large pages
[Work thread Nov 24 23:42] With trial factoring done to 2^90, optimal B2 is 17811*B1 = 4452750000000.
[Work thread Nov 24 23:42] If no prior P-1, chance of a new factor is 6.43%
[Work thread Nov 24 23:42] Using 6791MB of memory.  D: 330330, 31680x136392 polynomial multiplication.
[Work thread Nov 24 23:42] Stage 2 init complete. 1225472 transforms. Time: 37.495 sec.
[Work thread Nov 25 01:17] M80071 stage 2 complete. 145791133 transforms. Total time: 5680.476 sec.
[Work thread Nov 25 01:17] Round off: 0.048828125
[Work thread Nov 25 01:17] Stage 2 GCD complete. Time: 0.003 sec.
[Work thread Nov 25 01:17] M80071 completed P-1, B1=250000000, B2=4459674999780, Wi8: 6A0ECD7D
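
For anyone who wants to cross-check bounds from their own runs, here is a minimal Python sketch that pulls B1, B2 and the "optimal B2" multiplier out of log lines like the ones above. The regular expressions are only my guess at a pattern that matches the lines shown in this thread; the log format of a pre-beta build may well change, and the helper names are made up.

```python
import re

# Patterns modelled on the two log lines quoted in this thread (assumed format).
RESULT_RE = re.compile(r"M(\d+) completed P-1, B1=(\d+), B2=(\d+)")
OPTIMAL_RE = re.compile(r"optimal B2 is (\d+)\*B1 = (\d+)")

def parse_result(line):
    """Return exponent, B1, B2 and the B2/B1 ratio from a 'completed P-1' line."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    exponent, b1, b2 = map(int, m.groups())
    return {"exponent": exponent, "b1": b1, "b2": b2, "ratio": b2 / b1}

def parse_optimal(line):
    """Return (multiplier, target B2) from an 'optimal B2 is' line."""
    m = OPTIMAL_RE.search(line)
    if m is None:
        return None
    multiplier, target_b2 = map(int, m.groups())
    return multiplier, target_b2

optimal_line = "[Work thread Nov 24 23:42] With trial factoring done to 2^90, optimal B2 is 17811*B1 = 4452750000000."
result_line = "[Work thread Nov 25 01:17] M80071 completed P-1, B1=250000000, B2=4459674999780, Wi8: 6A0ECD7D"

multiplier, target_b2 = parse_optimal(optimal_line)
result = parse_result(result_line)

# The target is exactly multiplier * B1; the finished B2 overshoots it.
print(multiplier * result["b1"] == target_b2)
print(result["b2"] - target_b2)
```

The second print shows how far the finished B2 lands above the target B2.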
Quote:
And...with Stage 2 being so much faster and supporting larger values for B2 ... might there be a chance to use it to find more factors of the smallest unfactored? Maybe those under 20,000?
Exponents under approximately 40000 have already benefited from GMP-ECM's generous stage 2. I guess there's a better chance for new factors on exponents from 50K to 1M. We'll see.
Old 2021-11-25, 11:36   #17
Zhangrc
 
"University student"
May 2021
Beijing, China

2×53 Posts

Quote:
Originally Posted by Prime95 View Post
With trial factoring done to 2^90, optimal B2 is 17811*B1 = 4452750000000.
M80071 completed P-1, B1=250000000, B2=4459674999780
With T-level being 30.598, you can assume no factor below 2^102 (30.598/0.301).
Why is the B2 value below inconsistent with the value above?
Also, can Prime95 itself guess the estimated T-level when it's offline?
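
The digits-to-bits step above can be written out as a tiny Python sketch. It assumes, as the estimate does, that a T-level can be read as a factor size in decimal digits; the function name is made up for illustration.

```python
import math

def digits_to_bits(decimal_digits):
    """Convert a size in decimal digits to a size in bits: digits / log10(2)."""
    return decimal_digits / math.log10(2)

t_level = 30.598                 # T-level quoted in this post
bits = digits_to_bits(t_level)   # log10(2) is about 0.30103
print(round(bits, 2), round(bits))
```

Dividing by 0.301 instead of the full log10(2) gives essentially the same figure, which rounds to the 2^102 mentioned above.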

More problems:
How much can wavefront (107-116M) P-1 benefit from v30.8? What bounds does it use?
Does the larger FFT used in stage 2 hurt throughput? Is it larger than necessary?
Can the new algorithm be implemented in ECM and PP1 too?

Last fiddled with by Zhangrc on 2021-11-25 at 11:49
Old 2021-11-25, 15:37   #18
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2⁴×17×29 Posts

Quote:
Originally Posted by Zhangrc View Post
Why is the B2 value below inconsistent with the value above?
The new stage 2 selects a D value (330330 in this case) and then processes batches of D values, each batch with a single polynomial multiplication. The new code always completes the full batch that crosses the target B2, so the reported B2 ends up somewhat larger than the target.
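
A small Python sketch of that rounding behaviour, for illustration only. The D values and B2 values are taken from the two logs in this thread. The `blocks_per_batch` parameter and the idea that batches are counted from zero are assumptions made for the example, not how Prime95 actually lays out its batches.

```python
def is_multiple_of_d(b2, d):
    """True if the reported B2 sits exactly on a multiple of D."""
    return b2 % d == 0

def round_up_to_batch(target_b2, d, blocks_per_batch):
    """Smallest batch boundary at or above target_b2.

    Hypothetical model: one batch covers blocks_per_batch blocks of width d,
    and batches are counted from zero.
    """
    step = d * blocks_per_batch
    return -(-target_b2 // step) * step  # ceiling division, then scale back up

# Values reported in the two runs in this thread.
run1_aligned = is_multiple_of_d(76673707110, 270270)
run2_aligned = is_multiple_of_d(4459674999780, 330330)
print(run1_aligned, run2_aligned)

# Toy numbers, just to show the rounding itself.
print(round_up_to_batch(1000, 30, 4))
```

Both reported B2 values are exact multiples of their D, which fits the picture of stage 2 stopping on a batch boundary past the target rather than at the target itself.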

Quote:
Also, can Prime95 itself guess the estimated T-level when it's offline?
No.

Quote:
How much can wavefront (107-116M) P-1 benefit from v30.8? What bounds does it use?
Does the larger FFT used in stage 2 hurt throughput? Is it larger than necessary?
Can the new algorithm be implemented in ECM and PP1 too?
Sadly, wavefront P-1 will not benefit much. There are only 200 or so temporaries available if given 16GB RAM.
The larger FFT will hurt stage 2 throughput. More study is required to see if Prime95 is switching to a larger FFT sooner than necessary. The new algorithm can be implemented for P+1 and ECM with some difficulty. Reading papers by Montgomery / Silverman / Kruppa / Zimmermann is no easy matter!
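
To get a feel for why the temporaries run out at the wavefront, here is a back-of-the-envelope Python sketch. It assumes each temporary is one FFT-sized array of 8-byte doubles and ignores padding and bookkeeping, so it overestimates. The 6M FFT length for a wavefront exponent is also an assumption, not a figure from this thread.

```python
MIB = 1024 ** 2

def estimate_temporaries(ram_bytes, fft_length):
    """Upper-bound estimate: how many FFT-sized arrays of doubles fit in RAM.

    Assumption: one temporary = fft_length * 8 bytes, no padding or overhead.
    """
    bytes_per_temporary = fft_length * 8
    return ram_bytes // bytes_per_temporary

# M80071 run from this thread: 6791MB of memory, FFT length 5K.
small_exponent = estimate_temporaries(6791 * MIB, 5 * 1024)

# Wavefront exponent with 16GB, assuming an FFT length of about 6M.
wavefront = estimate_temporaries(16 * 1024 * MIB, 6 * 1024 * 1024)

print(small_exponent, wavefront)
```

The small-exponent estimate comes out in the same range as the polynomial sizes in the log (25920 + 142152 = 168072), while the wavefront estimate is only a few hundred. With a larger stage 2 FFT and real overhead, a figure like the "200 or so" quoted above looks plausible.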

Last fiddled with by Prime95 on 2021-11-25 at 15:38
Old 2021-11-25, 16:10   #19
techn1ciaN
 
Oct 2021
U. S. / Maine

2·73 Posts

Quote:
Originally Posted by Prime95 View Post
Sadly, wavefront P-1 will not benefit much. There are only 200 or so temporaries available if given 16GB RAM.
Does this mean that more impressive improvements, like you're seeing with tiny exponents, might be possible even at the P-1 wavefront if someone has massive RAM (say, 128 or 192 GB) and allocates enough of it?
Old 2021-11-25, 16:52   #20
axn
 
Jun 2003

5×29×37 Posts

Quote:
Originally Posted by techn1ciaN View Post
Does this mean that more impressive improvements, like you're seeing with tiny exponents, might be possible even at the P-1 wavefront if someone has massive RAM (say, 128 or 192 GB) and allocates enough of it?
Not to the same extent as tiny ones, but the more memory you throw at it, the better the gains. So, yes, those kinds of very large RAM allocations will be useful.
Old 2021-11-25, 22:31   #21
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2⁴×17×29 Posts

I found a bug in P-1 stage 2 init that may or may not have affected my previous runs. I'm rerunning all my v30.8 stage 2 work. When using 30.8, I recommend saving your completed P-1 save files until we are confident the new code is working.

Should you wish to try 30.8, links are below.
  • Use this version only for P-1 work on Mersenne numbers. This really is pre-beta!
  • Please rerun your last 3 or 4 successful P-1 runs to QA that the new P-1 stage 2 code finds those factors.
  • Use much more aggressive B2 bounds. While the optimal B2 calculations may not be perfect I recommend using them anyway.
  • Turn on roundoff error checking.
  • Give stage 2 as much memory as you can. Only run one worker with high memory. (The default value for MaxHighMemWorkers will be changing).
  • Save files during P-1 stage 2 cannot be created.
  • There is no progress reporting during P-1 stage 2.
  • P-1 stage 2 is untested on 100M+ exponents. I am not sure the code can accurately gauge when the new code is faster than the old code.
  • AVX-512 is untested -- likely to fail (perhaps silently). Pre-AVX is untested but might work. Recommend using only AVX and FMA FFTs.
  • MaxStage0Prime in undoc.txt has changed.

Windows 64-bit: https://mersenne.org/ftp_root/gimps/p95v308b1.win64.zip
Linux 64-bit: https://mersenne.org/ftp_root/gimps/...linux64.tar.gz

Last fiddled with by Prime95 on 2021-11-26 at 01:05
Old 2021-11-25, 22:58   #22
lisanderke
 
"Lisander Viaene"
Oct 2020
Belgium

109 Posts

I'll be using 30.8 for redoing P-1 in ranges where poor P-1 was previously done (the 8.4M range, for example).
Currently running the first four of Kriesel's recommended P-1 'selftest' exponents/bounds. (Though as I understand it, they are intended for self-testing GPU P-1 software. See: https://www.mersenneforum.org/showpo...8&postcount=31 )

All four exponents seem to have returned the correct factors!


(Before editing it out I pointed out in this post that reporting for stage 2 was not working. I now realize reporting wasn't supposed to work, apologies!)
Attached image: Screenshot 2021-11-25 235817.png

Last fiddled with by lisanderke on 2021-11-25 at 23:22
All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
