mersenneforum.org  

2021-12-03, 13:38   #78
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
3×37×47 Posts

Quote:
Originally Posted by petrw1
Stage 1 is 10 Mins with AVX512 and 14 Mins with FMA3
Crazy idea ... can I run Stage 1 with AVX512 and Stage 2 with FMA3?
Probably not worth the effort for anyone.
I'll wait for the AVX512 version of 30.8.

28.0M 20GB RAM
600K/273M (Chosen by Prime95)
Stage 1: 14 Mins
Stage 2: 23 Mins

28.0M 20GB RAM
600K/120M (Chosen by me)
Stage 1: 14 Mins
Stage 2: 10 Mins

28.0M 24GB RAM
600K/120M (Chosen by me)
Stage 1: 14 Mins
Stage 2: 10 Mins ... the extra 4 GB is not a big diff.

2021-12-03, 13:55   #79
axn
Jun 2003
2×2,693 Posts

Quote:
Originally Posted by petrw1
Stage 1 is 10 Mins with AVX512 and 14 Mins with FMA3
Crazy idea ... can I run Stage 1 with AVX512 and Stage 2 with FMA3?
Probably not worth the effort for anyone.

Easily done. Run stage 1 only (B1=B2) in one folder, and stage 2 in another folder after copying over the save files.

2021-12-03, 13:56   #80
axn
Jun 2003
2·2,693 Posts

Quote:
Originally Posted by petrw1
28.0M 20GB RAM
600K/273M (Chosen by Prime95)
Stage 1: 14 Mins
Stage 2: 23 Mins

28.0M 20GB RAM
600K/120M (Chosen by me)
Stage 1: 14 Mins
Stage 2: 10 Mins

28.0M 24GB RAM
600K/120M (Chosen by me)
Stage 1: 14 Mins
Stage 2: 10 Mins ... the extra 4 GB is not a big diff.

What do the parameter selections (D / poly / memory usage / actual B2) look like for these options?

2021-12-03, 17:56   #81
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
5217₁₀ Posts

Quote:
Originally Posted by axn
What do the parameter selections (D / poly / memory usage / actual B2) look like for these options?

I lost the log for the first option ... it scrolled off.
The first log below uses 24GB.
The second uses 20GB.

[Dec 3 00:57]
[Dec 3 00:57] P-1 on M28052377 with B1=600000, B2=120000000
[Dec 3 00:57] Setting affinity to run helper thread 2 on CPU core #3 ... and 6 more of these
[Dec 3 00:57] Using FMA3 FFT length 1536K, Pass1=384, Pass2=4K, clm=2, 8 threads
[Dec 3 01:11] M28052377 stage 1 complete. 1731726 transforms. Total time: 834.494 sec.
[Dec 3 01:11] Round off: 0.06515492749
[Dec 3 01:11] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 8.425 sec.
[Dec 3 01:11] Switching to FMA3 FFT length 1680K, Pass1=448, Pass2=3840, clm=2, 8 threads
[Dec 3 01:11] Setting affinity to run helper thread 1 on CPU core #2 ...
[Dec 3 01:11] Using 24827MB of memory. D: 3570, 384x1474 polynomial multiplication.
[Dec 3 01:12] Round off: 0.006904345816, poly_size: 2, EB: 1.04634, SM: 2.39624
[Dec 3 01:12] Round off: 0.0104635992, poly_size: 4
[Dec 3 01:12] Round off: 0.01654535736, poly_size: 8
[Dec 3 01:12] Round off: 0.02423378668, poly_size: 16
[Dec 3 01:12] Round off: 0.04181992679, poly_size: 32
[Dec 3 01:12] Round off: 0.04973827218, poly_size: 64
[Dec 3 01:12] Round off: 0.06909625713, poly_size: 128
[Dec 3 01:12] Round off: 0.09012404759, poly_size: 256
[Dec 3 01:12] Round off: 0.09836611984, poly_size: 512
[Dec 3 01:13] Stage 2 init complete. 10136 transforms. Time: 67.378 sec.
[Dec 3 01:13] Round off: 0.191300468
[Dec 3 01:22] M28052377 stage 2 complete. 384153 transforms. Total time: 579.687 sec.
[Dec 3 01:22] Round off: 0.1936677887
[Dec 3 01:22] Stage 2 GCD complete. Time: 5.263 sec.
[Dec 3 01:22] M28052377 completed P-1, B1=600000, B2=121965480, Wi4: D2F70E68


[Dec 3 08:58]
[Dec 3 08:58] P-1 on M28040009 with B1=600000, B2=120000000
[Dec 3 08:58] Using FMA3 FFT length 1440K, Pass1=320, Pass2=4608, clm=2, 8 threads
[Dec 3 08:58] Setting affinity to run helper thread 1 on CPU core #2 ...
[Dec 3 09:11] M28040009 stage 1 complete. 1731726 transforms. Total time: 802.705 sec.
[Dec 3 09:11] Round off: 0.34375
[Dec 3 09:11] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 8.493 sec.
[Dec 3 09:11] Setting affinity to run helper thread 1 on CPU core #2 ...
[Dec 3 09:11] Switching to FMA3 FFT length 1680K, Pass1=448, Pass2=3840, clm=2, 8 threads
[Dec 3 09:11] Using 19761MB of memory. D: 2730, 288x1190 polynomial multiplication.
[Dec 3 09:11] Round off: 0.007246532776, poly_size: 2, EB: 1.26823, SM: 2.29248
[Dec 3 09:11] Round off: 0.01052497622, poly_size: 4
[Dec 3 09:12] Round off: 0.01707247617, poly_size: 8
[Dec 3 09:12] Round off: 0.02619342368, poly_size: 16
[Dec 3 09:12] Round off: 0.04025678108, poly_size: 32
[Dec 3 09:12] Round off: 0.04762847341, poly_size: 64
[Dec 3 09:12] Round off: 0.06455924855, poly_size: 128
[Dec 3 09:12] Round off: 0.09652106448, poly_size: 256
[Dec 3 09:12] Round off: 0.05227529114, poly_size: 512
[Dec 3 09:12] Stage 2 init complete. 7648 transforms. Time: 36.623 sec.
[Dec 3 09:12] Round off: 0.1491780477
[Dec 3 09:22] M28040009 stage 2 complete. 469459 transforms. Total time: 591.273 sec.
[Dec 3 09:22] Round off: 0.1693160104
[Dec 3 09:22] Stage 2 GCD complete. Time: 5.255 sec.
[Dec 3 09:22] M28040009 completed P-1, B1=600000, B2=120040830, Wi4: D2BFE942

Last fiddled with by petrw1 on 2021-12-03 at 17:57

2021-12-03, 18:09   #82
axn
Jun 2003
150A₁₆ Posts

Quote:
Originally Posted by petrw1
[Dec 3 01:22] M28052377 stage 2 complete. 384153 transforms. Total time: 579.687 sec.

[Dec 3 09:22] M28040009 stage 2 complete. 469459 transforms. Total time: 591.273 sec.

Something's not quite right here. The 24GB option shows about 20% fewer transforms, yet there is no significant improvement in elapsed time. I wonder whether something interfered with these tests. If you have build 1 sitting around, can you repeat the tests and see if the pattern holds?
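
To put numbers on that comparison, here is a quick sketch that uses only the two stage 2 log lines quoted above. The dictionary layout and the helper name `ms_per_transform` are purely illustrative, and the calculation assumes nothing about what Prime95 includes in "Total time".

```python
# Stage 2 figures copied from the two log lines quoted above.
runs = {
    "24GB (M28052377)": {"transforms": 384153, "seconds": 579.687},
    "20GB (M28040009)": {"transforms": 469459, "seconds": 591.273},
}


def ms_per_transform(run):
    """Average wall-clock milliseconds per stage 2 transform."""
    return 1000.0 * run["seconds"] / run["transforms"]


big = runs["24GB (M28052377)"]
small = runs["20GB (M28040009)"]

# Relative reduction in transform count and in elapsed time (24GB vs. 20GB).
fewer_transforms = 1.0 - big["transforms"] / small["transforms"]
less_time = 1.0 - big["seconds"] / small["seconds"]

for name, run in runs.items():
    print(f"{name}: {ms_per_transform(run):.3f} ms per transform")
print(f"transforms: {fewer_transforms:.1%} fewer, time: {less_time:.1%} less")
```

By this arithmetic the 24GB run does roughly 18% fewer transforms but finishes only about 2% sooner, so each transform costs about 1.5 ms instead of about 1.26 ms.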

2021-12-03, 18:35   #83
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
3·37·47 Posts

Quote:
Originally Posted by axn
Something's not quite right here. The 24GB option shows about 20% fewer transforms, yet there is no significant improvement in elapsed time. I wonder whether something interfered with these tests. If you have build 1 sitting around, can you repeat the tests and see if the pattern holds?

I'm running the 24GB tests in the middle of the night.
No one will be on the computer.
And I am not aware of any overnight processing ... at least nothing that runs all night, because I see these same numbers on all the overnight runs.

2021-12-03, 19:32   #84
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
3×37×47 Posts

I reran B1=600K/B2=TBD
Interestingly when it switched from AVX512 to FMA3 it changed the B2.

[Dec 3 12:41] Worker starting
[Dec 3 12:41] Setting affinity to run worker on CPU core #1
[Dec 3 12:41]
[Dec 3 12:41] P-1 on M28053787 with B1=600000, B2=TBD
[Dec 3 12:41] Setting affinity to run helper thread 2 on CPU core #3 ...
[Dec 3 12:41] Using FMA3 FFT length 1536K, Pass1=256, Pass2=6K, clm=4, 8 threads
[Dec 3 12:41] M28053787 stage 1 is 1.10% complete.
[Dec 3 12:55] M28053787 stage 1 complete. 1712436 transforms. Total time: 818.117 sec.
[Dec 3 12:55] Round off: 0.078125
[Dec 3 12:55] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 8.424 sec.
[Dec 3 12:55] With trial factoring done to 2^75, optimal B2 is 615*B1 = 369000000.
[Dec 3 12:55] If no prior P-1, chance of a new factor is 5.55%
[Dec 3 12:55] Switching to FMA3 FFT length 1680K, Pass1=448, Pass2=3840, clm=2, 8 threads
[Dec 3 12:55] Setting affinity to run helper thread 5 on CPU core #6 ...
[Dec 3 12:55] With trial factoring done to 2^75, optimal B2 is 555*B1 = 333000000.
[Dec 3 12:55] If no prior P-1, chance of a new factor is 5.48%
[Dec 3 12:55] Using 24827MB of memory. D: 3570, 384x1474 polynomial multiplication.
[Dec 3 12:55] Round off: 0.007483116829, poly_size: 2, EB: 1.0447, SM: 2.39624
[Dec 3 12:56] Round off: 0.01062373627, poly_size: 4
[Dec 3 12:56] Round off: 0.01606212542, poly_size: 8
[Dec 3 12:56] Round off: 0.02399381441, poly_size: 16
[Dec 3 12:56] Round off: 0.04149382492, poly_size: 32
[Dec 3 12:56] Round off: 0.04831414652, poly_size: 64
[Dec 3 12:56] Round off: 0.07144245388, poly_size: 128
[Dec 3 12:56] Round off: 0.0934058712, poly_size: 256
[Dec 3 12:56] Round off: 0.09837685356, poly_size: 512
[Dec 3 12:56] Stage 2 init complete. 10138 transforms. Time: 60.797 sec.
[Dec 3 12:56] Round off: 0.1785826166
[Dec 3 13:21] M28053787 stage 2 complete. 1047025 transforms. Total time: 1479.956 sec.
[Dec 3 13:21] Round off: 0.2112125547
[Dec 3 13:21] Stage 2 GCD complete. Time: 5.265 sec.
[Dec 3 13:21] M28053787 completed P-1, B1=600000, B2=333152400, Wi4: C626955D

2021-12-03, 20:55   #85
lycorn
"GIMFS"
Sep 2002
Oeiras, Portugal
11000010110₂ Posts

Why are you saying it switched from AVX-512 to FMA3 FFT?
As far as I can see, upon entering stage 2 it changed the length of the FFT (and as a consequence the value of B2) but not its type.
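
That reading can be checked mechanically against the two FFT lines in the rerun log. A minimal sketch: the regular expression is only my assumption about the line layout, based on the two lines shown in this thread, and is not a documented Prime95 log format.

```python
import re

# Log lines copied from the rerun posted above (timestamps trimmed).
log_lines = [
    "Using FMA3 FFT length 1536K, Pass1=256, Pass2=6K, clm=4, 8 threads",
    "Switching to FMA3 FFT length 1680K, Pass1=448, Pass2=3840, clm=2, 8 threads",
]

# Capture the FFT type and the FFT length as two named groups.
pattern = re.compile(
    r"(?:Using|Switching to) (?P<kind>\S+) FFT length (?P<length>\S+?),"
)

ffts = []
for line in log_lines:
    match = pattern.search(line)
    if match:
        ffts.append((match["kind"], match["length"]))

for kind, length in ffts:
    print(kind, length)
```

Both lines report FMA3; only the length differs (1536K in stage 1, 1680K in stage 2).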

Last fiddled with by lycorn on 2021-12-03 at 20:58

2021-12-03, 22:00   #86
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
3×37×47 Posts

Quote:
Originally Posted by lycorn
Why are you saying it switched from AVX-512 to FMA3 FFT?
As far as I can see, upon entering stage 2 it changed the length of the FFT (and as a consequence the value of B2) but not its type.

I ASS-umed so because my PC supports AVX-512 but I turned it off in local.txt.
I couldn't think of another reason for the change ... not that there isn't one.

2021-12-03, 22:00   #87
R. Gerbicz
"Robert Gerbicz"
Oct 2005
Hungary
11²·13 Posts

Quote:
Originally Posted by Prime95
On my quad core, 8GB machine:

version 30.8:
Code:
[Work thread Nov 24 05:56] P-1 on M26899981 with B1=1000000, B2=30000000
[Work thread Nov 24 05:57] Conversion of stage 1 result complete. 5 transforms, 1 modular inverse. Time: 9.500 sec.
[Work thread Nov 24 05:57] Switching to FMA3 FFT length 1600K, Pass1=640, Pass2=2560, clm=1, 4 threads using large pages
[Work thread Nov 24 05:57] Using 6788MB of memory.  D: 1050, 120x403 polynomial multiplication.
[Work thread Nov 24 05:58] Stage 2 init complete. 2842 transforms. Time: 11.922 sec.
[Work thread Nov 24 06:15] M26899981 stage 2 complete. 360415 transforms. Total time: 1052.009 sec.
[Work thread Nov 24 06:15] Stage 2 GCD complete. Time: 5.965 sec.
[Work thread Nov 24 06:15] M26899981 completed P-1, B1=1000000, B2=30000000, Wi8: B63F15AE

What is the third number on this line: "D: 1050, 120x403 polynomial multiplication."?
I'm just guessing that the 2nd is eulerphi(1050)/2=120, but what is the 403?
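
The guess about the second number is easy to test against the other "D: ..." log lines posted in this thread. A minimal sketch, using a hand-rolled `euler_phi` helper (my own, not anything from Prime95):

```python
def euler_phi(n):
    """Euler's totient, computed by trial-division factorization."""
    result, p = n, 2
    while p * p <= n:
        if n % p == 0:
            # Strip every power of the prime p, then apply the factor (1 - 1/p).
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        # Whatever remains is a single prime factor larger than sqrt(original n).
        result -= result // n
    return result


# (D, first polynomial dimension) pairs taken from the
# "D: ..., AxB polynomial multiplication" log lines in this thread.
logged = [(1050, 120), (3570, 384), (2730, 288)]

for d, first_dim in logged:
    print(d, euler_phi(d) // 2, first_dim)
```

For all three values of D the first dimension equals eulerphi(D)/2, which is consistent with the guess. The check says nothing about the second dimension (403, 1474, 1190).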

Last fiddled with by R. Gerbicz on 2021-12-03 at 22:01

2021-12-03, 22:03   #88
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
3×37×47 Posts

Quote:
Originally Posted by axn
Easily done. Run stage 1 only (B1=B2) in one folder, and stage 2 in another folder after copying over the save files.

Would the prime.txt in folder 1 have UsePrimenet=0?
I'd prefer that it send in both stages as 1 result.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
