mersenneforum.org PFGW 4.0.3 (with gwnum v28.7) Released

2010-10-21, 21:37   #45
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

3×2,083 Posts

At last, the results are in for the comparison of Prime95 v25.11 vs. v26.3 for 289184*5^477336-1 on my Core 2 Duo. We have:

v25.11 started at [Thu Oct 21 14:29:35 2010], finished at [Thu Oct 21 15:30:55 2010] --> total 1:01:20 = 3680 sec.
v26.3 started at [Thu Oct 21 15:53:16 2010], finished at [Thu Oct 21 16:38:05 2010] --> total 0:44:49 = 2689 sec.

So it would seem that PFGW 3.4.0 (32-bit, as all of these were) is the only one of the triumvirate that exhibits a slowdown going to gwnum v26.

Note, however, that the individual gwnum minor versions of the programs used for these tests do not all line up; I tested PFGW 3.3.6 vs. 3.4.0 (gwnum 26.2), LLR 3.8.1 vs. 3.8.2 (gwnum 26.2), and Prime95 25.11 vs. 26.3 (gwnum 26.3). For that reason, I will follow this up shortly with a rerun of 289184*5^477336-1 using PFGW 3.4.2. Stay tuned...
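For anyone who wants to double-check the elapsed times, here is a minimal Python sketch that recomputes them from the four timestamps quoted above. The helper name `elapsed_seconds` is made up for this illustration, and it assumes the default English (C) locale for the day and month names.

```python
from datetime import datetime

# Timestamp layout as printed in the log lines, e.g. "Thu Oct 21 14:29:35 2010".
STAMP = "%a %b %d %H:%M:%S %Y"


def elapsed_seconds(start, finish):
    """Whole seconds between two log timestamps (hypothetical helper)."""
    delta = datetime.strptime(finish, STAMP) - datetime.strptime(start, STAMP)
    return int(delta.total_seconds())


old = elapsed_seconds("Thu Oct 21 14:29:35 2010", "Thu Oct 21 15:30:55 2010")
new = elapsed_seconds("Thu Oct 21 15:53:16 2010", "Thu Oct 21 16:38:05 2010")

print(old, new)                        # seconds for v25.11 and v26.3
print(f"speedup: {old / new:.2f}x")    # ratio of old time to new time
```

The two values agree with the 3680 sec and 2689 sec totals in the post, which works out to roughly a 1.37x speedup for v26.3 on this number.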
2010-10-21, 21:50   #46
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

3·2,083 Posts

Holy cow! It would seem that 3.4.2 actually chooses an entirely different FFT size for 289184*5^477336-1 than 3.4.1 (which I understand is the same as 3.4.0 on 32-bit--I deleted my copy of 3.4.0, stupid me, and could only get my hands on 3.4.1). Behold:

Code:
$ ./pfgw341.exe -F -q289184*5^477336-1
PFGW Version 3.4.1.32BIT.20100927.Win_Dev [GWNUM 26.2]

Special modular reduction using zero-padded Core2 type-3 FFT length 128K, Pass1=128, Pass2=1K on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-1 FFT length 144K, Pass1=96, Pass2=1536 on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-3 FFT length 160K, Pass1=640, Pass2=256 on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-3 FFT length 192K, Pass1=256, Pass2=768 on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-3 FFT length 224K, Pass1=896, Pass2=256 on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-3 FFT length 240K, Pass1=320, Pass2=768 on 289184*5^477336-1

$ ./pfgw.exe -F -q289184*5^477336-1
PFGW Version 3.4.2.32BIT.20101019.Win_Dev [GWNUM 26.4]

Special modular reduction using Core2 type-3 FFT length 112K, Pass1=448, Pass2=256 on 289184*5^477336-1

Not only is the size used different (112K vs. 128K), 3.4.2 omits the "zero-padded" nomenclature entirely. Whether this is just an output difference or a difference in the underlying logic I do not know.

This would seem to invalidate 3.4.2 for use in trying to nail down this mystery. However, at this point it would seem rather unnecessary, as whatever happened, it has apparently been fixed in 3.4.2. Thus, I'll just stick with 3.4.2 for all my testing as it seems to now be consistent with the speedups I get from comparable LLR and Prime95 versions. Thanks for taking the time to look into this (and for whatever you guys did to fix it)!
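If you want to compare the -F output of two builds without eyeballing it, something like the following Python sketch could pull out the CPU type, FFT length and the zero-padded flag. The regular expression is only fitted to the lines quoted in this post; it is an assumption that other PFGW versions print the same wording, and `fft_choices` is a made-up name.

```python
import re

# Fitted to the "-F" lines quoted in the post; not an official format spec.
FFT_LINE = re.compile(
    r"using (?P<padded>zero-padded )?(?P<cpu>\w+) type-(?P<type>\d+) "
    r"FFT length (?P<length>\d+K)"
)


def fft_choices(output):
    """Return (cpu, fft_length, is_zero_padded) for every FFT line found."""
    return [
        (m["cpu"], m["length"], m["padded"] is not None)
        for m in FFT_LINE.finditer(output)
    ]


# First two lines reported by 3.4.1, and the single line reported by 3.4.2.
old_output = (
    "Special modular reduction using zero-padded Core2 type-3 FFT length 128K, "
    "Pass1=128, Pass2=1K on 289184*5^477336-1\n"
    "Special modular reduction using zero-padded Pentium4 type-1 FFT length 144K, "
    "Pass1=96, Pass2=1536 on 289184*5^477336-1\n"
)
new_output = (
    "Special modular reduction using Core2 type-3 FFT length 112K, "
    "Pass1=448, Pass2=256 on 289184*5^477336-1\n"
)

print(fft_choices(old_output))
print(fft_choices(new_output))
```

Because it uses `finditer` over the whole text, it does not matter whether the output was pasted with or without line breaks.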
2010-10-22, 00:07   #47
rogue

"Mark"
Apr 2003
Between here and the

41·163 Posts

I didn't do anything, but George might have. I find it interesting that the old version specified Pentium4 and the new one specified Core 2.

2010-10-22, 01:08   #48
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

1869₁₆ Posts

Quote:
 Originally Posted by rogue I didn't do anything, but George might have. I find it interesting that the old version specified Pentium4 and the new one specified Core 2.
Well, it gave 6 different potential FFT choices (1 Core 2, 5 P4) when I ran it with -F, but when I run the actual test with -V it uses the Core2 FFT.

BTW: why exactly would it give 6 FFT choices like that? Shouldn't it boil down to exactly one choice just like it would for the real test? (Or might this, whatever the cause, be the reason for the strange slowdown?)

Last fiddled with by mdettweiler on 2010-10-22 at 01:09

2010-10-22, 01:17   #49
rogue

"Mark"
Apr 2003
Between here and the

41×163 Posts

Quote:
 Originally Posted by mdettweiler Well, it gave 6 different potential FFT choices (1 Core 2, 5 P4) when I ran it with -F, but when I run the actual test with -V it uses the Core2 FFT. BTW: why exactly would it give 6 FFT choices like that? Shouldn't it boil down to exactly one choice just like it would for the real test? (Or might this, whatever the cause, be the reason for the strange slowdown?)
That it listed 6 was a bug that I fixed in 3.4.1. Only the first one would be used under normal conditions.

2010-10-25, 21:01   #50
rogue

"Mark"
Apr 2003
Between here and the

41×163 Posts

PFGW 3.4.3 Released

You can d/l the latest release for Windows, MacIntel, and Linux from here: http://sourceforge.net/projects/openpfgw/

The updates are for 64-bit PFGW users. A bug was found and fixed in the factoring code. For linux, the binary is now statically linked.

2010-10-25, 22:31   #51
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2×23×173 Posts

Quote:
 Originally Posted by mdettweiler Holy cow! It would seem that 3.4.2 actually chooses an entirely different FFT size for 289184*5^477336-1 than 3.4.1 ... Not only is the size used different (112K vs. 128K), 3.4.2 omits the "zero-padded" nomenclature entirely. Whether this is just an output difference or a difference in the underlying logic I do not know.
For those that like gory details, gwnum 26.4 can now propagate carries to the next 6 FFT data words whereas 26.3 can only propagate to the next 4 FFT data words. Usually this makes no difference in FFT selection. But for larger k values, 26.4 may use the slightly faster irrational base discrete weighted FFT (Richard Crandall's IBDWT) vs. a zero-padded FFT of the same size. In even rarer cases, 26.4 may use an IBDWT with a smaller FFT length.
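To give a feel for what "propagating a carry to the next N words" means, here is a toy Python model. It is not gwnum code: the 10-bit word size, the sample values and the names `carry_reach` and `normalise` are all invented for the illustration, and real FFT words are balanced floating-point values rather than plain integers.

```python
def carry_reach(value, bits_per_word):
    """How many following words the carry out of one oversized word touches."""
    reach = 0
    carry = value >> bits_per_word
    while carry:
        reach += 1
        carry >>= bits_per_word
    return reach


def normalise(words, bits_per_word):
    """Reduce little-endian words to [0, 2**bits_per_word), pushing carries up."""
    base = 1 << bits_per_word
    out, carry = [], 0
    for w in words:
        carry, digit = divmod(w + carry, base)
        out.append(digit)
    while carry:  # carry left over past the last word
        carry, digit = divmod(carry, base)
        out.append(digit)
    return out


# Toy values: with 10-bit words, a 2**45 entry spills into 4 later words,
# while a 2**65 entry needs 6.
print(carry_reach(2**45, 10), carry_reach(2**65, 10))
print(normalise([2**45, 0, 0, 0, 0, 0], 10))
```

In this toy picture, a routine that can only push a carry through 4 words would have to fall back to a different representation once the pre-normalisation values get too large, which is the flavour of the 4-word vs. 6-word difference described above.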

2010-11-01, 04:25   #52
Batalov

"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

3³×367 Posts

Quote:
 Originally Posted by rogue You can d/l the latest release for Windows, MacIntel, and Linux from here: http://sourceforge.net/projects/openpfgw/ The updates are for 64-bit PFGW users. A bug was found and fixed in the factoring code. For linux, the binary is now statically linked.
I have a small bug. Run pfgw64 (linux), kill it somewhere; then replace the input file (with something else), restart and it reports:

***WARNING! file sr_10.pfgw line 2378 does not match what is expected.
Expecting: 10001001*10^11441+1
File contained: 1001001*10^25534+1
Starting over at the beginning of the file

10001001*10^25535+1 is composite: RES64: [AD505C1D89295440] (24.7044s+0.0002s)
...

Starting over at the beginning of the file is, of course, the usual and in this case desired effect. But it doesn't actually do that: it only says that it will, and instead continues from the middle of the file (i.e. the line number is not zeroed). This seems to be new (something uninitialized in the 64-bit version?); it worked fine before.
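The behaviour I would expect can be sketched in a few lines of Python. This is only a model of the check, not PFGW's actual resume code; `resume_index` and its arguments are hypothetical names, and the sample candidates are made up.

```python
def resume_index(lines, saved_line, saved_expr):
    """Return the 0-based line to continue from (hypothetical helper).

    Resume at the saved position only if the file still holds the
    expression that was being tested there; otherwise really go back
    to line 0 instead of just announcing it.
    """
    in_range = 0 <= saved_line < len(lines)
    if in_range and lines[saved_line].strip() == saved_expr:
        return saved_line
    print("Starting over at the beginning of the file")
    return 0


original = ["3*2^10+1", "5*2^10+1", "7*2^10+1"]
replaced = ["9*2^20+1", "11*2^20+1", "13*2^20+1"]

print(resume_index(original, 1, "5*2^10+1"))  # same file: keep the position
print(resume_index(replaced, 1, "5*2^10+1"))  # file replaced: position reset
```

The important part is that the warning and the reset of the line counter happen together in the same branch.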

2010-11-01, 14:56   #53
rogue

"Mark"
Apr 2003
Between here and the

41×163 Posts

Quote:
 Originally Posted by Batalov I have a small bug. Run pfgw64 (linux), kill it somewhere; then replace the input file (with something else), restart and it reports: ***WARNING! file sr_10.pfgw line 2378 does not match what is expected. Expecting: 10001001*10^11441+1 File contained: 1001001*10^25534+1 Starting over at the beginning of the file 10001001*10^25535+1 is composite: RES64: [AD505C1D89295440] (24.7044s+0.0002s) ... Starting over at the beginning of the file is, of course, the usual and in this case desired effect. But it doesn't actually do that: it only says that it will, and instead continues from the middle of the file (i.e. the line number is not zeroed). This seems to be new (something uninitialized in the 64-bit version?); it worked fine before.
This was something I broke when trying to address a crash with ABC2 files. I'll have to look into another fix for that problem.

2010-11-04, 21:39   #54
rogue

"Mark"
Apr 2003
Between here and the

41·163 Posts

PFGW 3.4.4 Released

You can d/l the latest release for Windows, MacIntel, and Linux from here: http://sourceforge.net/projects/openpfgw/

This fixes a factoring problem on Win64 and fixes the ABC resume problem. I believe that there is still an ABC2 crashing problem, but I can't recall how to produce it. I had to revert that change to correct the ABC resume problem.

2010-11-26, 08:38   #55
Batalov

"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

26B5₁₆ Posts

Konyagin-Pomerance extension

In PFGW, the N-1 Brillhart-Lehmer-Selfridge test implements the eponymous 1975 algorithm, but would it be hard to extend it with the third-magnitude stage Konyagin-Pomerance extension (as in pages 176-178 of Crandall/Pomerance PN-ACP, Theorem 4.1.6)?

Part (1) seems no different from the square test of the second-magnitude stage, and the same code would be called six times with minor variations, but part (2) needs a bit of implementation. There's a GP prototype available; it needs a polroots() for a cubic poly and contfrac() rewritten.

Was this ever requested before? Could I possibly help? (With a disclaimer that familiarizing myself with the code could take much more time than "just doing it" would for an experienced developer, i.e. Mark.)
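For readers who have not met the N-1 family of tests, here is a minimal Python sketch of the simplest case only: a Pocklington-style check where the factored part F of N-1 exceeds sqrt(N). It is not PFGW's implementation and it is not the Konyagin-Pomerance extension asked about above; the function name, the `max_base` cutoff and the three-way return value are choices made for this illustration, and the caller is assumed to pass genuine primes.

```python
from math import gcd


def n_minus_1_proof(n, known_primes, max_base=100):
    """Pocklington-style N-1 check (illustrative sketch).

    known_primes: primes assumed to divide n - 1 (the caller vouches for them).
    Returns True if n is proven prime, False if proven composite,
    None if this basic test cannot decide.
    """
    if n < 3 or n % 2 == 0:
        return n == 2
    primes = sorted({q for q in known_primes if (n - 1) % q == 0})
    # F is the fully factored part of n - 1 built from the known primes.
    factored, rest = 1, n - 1
    for q in primes:
        while rest % q == 0:
            rest //= q
            factored *= q
    if factored * factored <= n:
        return None  # factored part too small for the sqrt(N) theorem
    for q in primes:
        for a in range(2, min(max_base, n)):
            if pow(a, n - 1, n) != 1:
                return False  # fails the Fermat condition: composite
            g = gcd(pow(a, (n - 1) // q, n) - 1, n)
            if g == 1:
                break  # this base works for q
            if g < n:
                return False  # stumbled on a proper factor of n
        else:
            return None  # no suitable base below the cutoff
    return True


print(n_minus_1_proof(97, [2]))        # 96 = 2^5 * 3, F = 32 > sqrt(97)
print(n_minus_1_proof(91, [2, 3, 5]))  # 91 = 7 * 13
print(n_minus_1_proof(97, [3]))        # F = 3 is too small
```

The later stages discussed in the post relax the F > sqrt(N) requirement to smaller factored parts at the cost of extra checks, which is where the additional implementation work would go.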


All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
