mersenneforum.org  

Old 2010-10-21, 21:37   #45
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

3×2,083 Posts

At last, the results are in for the comparison of Prime95 v25.11 vs. v26.3 for 289184*5^477336-1 on my Core 2 Duo. We have:

v25.11 started at [Thu Oct 21 14:29:35 2010], finished at [Thu Oct 21 15:30:55 2010] --> total 1:01:20 = 3680 sec.

v26.3 started at [Thu Oct 21 15:53:16 2010], finished at [Thu Oct 21 16:38:05 2010] --> total 0:44:49 = 2689 sec.

So it would seem that PFGW 3.4.0 (32-bit, as all of these were) is the only one of the triumvirate that exhibits a slowdown going to gwnum v26. Note, however, that the individual gwnum minor versions of the programs used for these tests do not all line up; I tested PFGW 3.3.6 vs. 3.4.0 (gwnum 26.2), LLR 3.8.1 vs. 3.8.2 (gwnum 26.2), and Prime95 25.11 vs. 26.3 (gwnum 26.3). For that reason, I will follow this up shortly with a rerun of 289184*5^477336-1 using PFGW 3.4.2. Stay tuned...
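As a quick check on the two Prime95 timings above, here is a minimal Python sketch that recomputes the elapsed seconds from the quoted timestamps and derives the speed ratio. The helper name `elapsed_seconds` and the format string are illustrative assumptions, not anything taken from Prime95 itself.

```python
from datetime import datetime

# Timestamp layout as quoted in the post, e.g. "Thu Oct 21 14:29:35 2010".
FMT = "%a %b %d %H:%M:%S %Y"

def elapsed_seconds(start, finish):
    """Whole seconds between two timestamps written in the FMT layout."""
    delta = datetime.strptime(finish, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())

v25 = elapsed_seconds("Thu Oct 21 14:29:35 2010", "Thu Oct 21 15:30:55 2010")
v26 = elapsed_seconds("Thu Oct 21 15:53:16 2010", "Thu Oct 21 16:38:05 2010")

print(v25, v26)                      # the two run times in seconds
print(f"speed ratio: {v25 / v26:.2f}x")
```

The two totals come out to the 3680 and 2689 seconds reported above, so v26.3 needed roughly 27% less wall-clock time on this candidate.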
Old 2010-10-21, 21:50   #46
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

14151₈ Posts

Holy cow! It would seem that 3.4.2 actually chooses an entirely different FFT size for 289184*5^477336-1 than 3.4.1 (which I understand is the same as 3.4.0 on 32-bit--I deleted my copy of 3.4.0, stupid me, and could only get my hands on 3.4.1). Behold:
Code:
$ ./pfgw341.exe -F -q289184*5^477336-1
PFGW Version 3.4.1.32BIT.20100927.Win_Dev [GWNUM 26.2]
Special modular reduction using zero-padded Core2 type-3 FFT length 128K, Pass1=128, Pass2=1K on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-1 FFT length 144K, Pass1=96, Pass2=1536 on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-3 FFT length 160K, Pass1=640, Pass2=256 on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-3 FFT length 192K, Pass1=256, Pass2=768 on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-3 FFT length 224K, Pass1=896, Pass2=256 on 289184*5^477336-1
Special modular reduction using zero-padded Pentium4 type-3 FFT length 240K, Pass1=320, Pass2=768 on 289184*5^477336-1
 
$ ./pfgw.exe -F -q289184*5^477336-1
PFGW Version 3.4.2.32BIT.20101019.Win_Dev [GWNUM 26.4]
Special modular reduction using Core2 type-3 FFT length 112K, Pass1=448, Pass2=256 on 289184*5^477336-1
Not only is the size used different (112K vs. 128K), 3.4.2 omits the "zero-padded" nomenclature entirely. Whether this is just an output difference or a difference in the underlying logic I do not know.
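For anyone who wants to compare such lines mechanically, here is a small Python sketch that pulls the FFT details out of the first line each version printed. The regular expression only assumes the line format shown in the output above; other PFGW versions may print something different, and `parse_fft_line` is a hypothetical helper name.

```python
import re

# Matches the "-F" lines quoted above; the format is assumed from that output only.
FFT_LINE = re.compile(
    r"using (?P<padded>zero-padded )?(?P<cpu>\w+) type-(?P<type>\d+) "
    r"FFT length (?P<length_k>\d+)K"
)

def parse_fft_line(line):
    """Return the FFT details found in one output line, or None if it does not match."""
    m = FFT_LINE.search(line)
    if m is None:
        return None
    return {
        "zero_padded": m.group("padded") is not None,
        "cpu": m.group("cpu"),
        "fft_type": int(m.group("type")),
        "length_k": int(m.group("length_k")),
    }

old = parse_fft_line(
    "Special modular reduction using zero-padded Core2 type-3 FFT length 128K, "
    "Pass1=128, Pass2=1K on 289184*5^477336-1"
)
new = parse_fft_line(
    "Special modular reduction using Core2 type-3 FFT length 112K, "
    "Pass1=448, Pass2=256 on 289184*5^477336-1"
)
print(old)
print(new)
```

Both lines report a Core2 type-3 FFT; the differences are the length (128K vs. 112K) and the zero-padded flag.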

This would seem to invalidate 3.4.2 for use in trying to nail down this mystery. However, at this point it would seem rather unnecessary, as whatever happened, it has apparently been fixed in 3.4.2. Thus, I'll just stick with 3.4.2 for all my testing as it seems to now be consistent with the speedups I get from comparable LLR and Prime95 versions.

Thanks for taking the time to look into this (and for whatever you guys did to fix it)!
Old 2010-10-22, 00:07   #47
rogue
 
"Mark"
Apr 2003
Between here and the

6,277 Posts

I didn't do anything, but George might have. I find it interesting that the old version specified Pentium4 and the new one specified Core 2.
Old 2010-10-22, 01:08   #48
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

3·2,083 Posts

Quote:
Originally Posted by rogue
I didn't do anything, but George might have. I find it interesting that the old version specified Pentium4 and the new one specified Core 2.
Well, it gave 6 different potential FFT choices (1 Core 2, 5 P4) when I ran it with -F, but when I run the actual test with -V it uses the Core2 FFT.

BTW: why exactly would it give 6 FFT choices like that? Shouldn't it boil down to exactly one choice just like it would for the real test? (Or might this, whatever the cause, be the reason for the strange slowdown?)

Last fiddled with by mdettweiler on 2010-10-22 at 01:09
Old 2010-10-22, 01:17   #49
rogue
 
"Mark"
Apr 2003
Between here and the

6,277 Posts

Quote:
Originally Posted by mdettweiler
Well, it gave 6 different potential FFT choices (1 Core 2, 5 P4) when I ran it with -F, but when I run the actual test with -V it uses the Core2 FFT.

BTW: why exactly would it give 6 FFT choices like that? Shouldn't it boil down to exactly one choice just like it would for the real test? (Or might this, whatever the cause, be the reason for the strange slowdown?)
That it listed 6 was a bug that I fixed in 3.4.1. Only the first one would be used under normal conditions.
Old 2010-10-25, 21:01   #50
rogue
 
"Mark"
Apr 2003
Between here and the

6,277 Posts
PFGW 3.4.3 Released

You can d/l the latest release for Windows, MacIntel, and Linux from here: http://sourceforge.net/projects/openpfgw/

The updates are for 64-bit PFGW users. A bug was found and fixed in the factoring code. For Linux, the binary is now statically linked.
Old 2010-10-25, 22:31   #51
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

1110011101011₂ Posts

Quote:
Originally Posted by mdettweiler
Holy cow! It would seem that 3.4.2 actually chooses an entirely different FFT size for 289184*5^477336-1 than 3.4.1
...
Not only is the size used different (112K vs. 128K), 3.4.2 omits the "zero-padded" nomenclature entirely. Whether this is just an output difference or a difference in the underlying logic I do not know.
For those that like gory details, gwnum 26.4 can now propagate carries to the next 6 FFT data words whereas 26.3 can only propagate to the next 4 FFT data words. Usually this makes no difference in FFT selection. But for larger k values, 26.4 may use the slightly faster irrational base discrete weighted FFT (Richard Crandall's IBDWT) vs. a zero-padded FFT of the same size. In even rarer cases, 26.4 may use an IBDWT with a smaller FFT length.
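To illustrate just the weighting idea behind an IBDWT, here is a toy Python sketch for the simplest case only: squaring modulo a Mersenne number 2^q - 1 (k = 1, b = 2, c = -1). It is not gwnum code and says nothing about carry propagation or how gwnum selects FFT sizes. The cyclic convolution is computed directly rather than with an FFT, the digits are left unnormalized, and the function name and parameters are assumptions made for the example.

```python
def weighted_square_mod_mersenne(x, q, n):
    """Square x modulo 2**q - 1 using n variable-width digits and irrational weights."""
    modulus = (1 << q) - 1
    # Bit offset of digit j is ceil(q*j/n); digit widths therefore vary by at most one bit.
    off = [-(-q * j // n) for j in range(n + 1)]
    digits = [(x >> off[j]) & ((1 << (off[j + 1] - off[j])) - 1) for j in range(n)]
    # Weight 2^(ceil(q*j/n) - q*j/n) turns the variable base into a uniform one.
    w = [2.0 ** (off[j] - q * j / n) for j in range(n)]
    a = [d * wj for d, wj in zip(digits, w)]
    # Length-n cyclic convolution of the weighted digits (an FFT would be used in practice).
    c = [sum(a[i] * a[(k - i) % n] for i in range(n)) for k in range(n)]
    # Remove the weights; each result is an integer up to floating-point rounding error.
    z = [round(c[k] / w[k]) for k in range(n)]
    # Recombine; wrap-around terms are already folded in because 2^q = 1 (mod 2^q - 1).
    return sum(zk << off[k] for k, zk in enumerate(z)) % modulus

q = 61
x = 1234567890123456789
print(weighted_square_mod_mersenne(x, q, 8) == pow(x, 2, (1 << q) - 1))
```

The point of the weights is that a plain length-n cyclic convolution then yields the product already reduced modulo 2^q - 1, with no zero padding needed, which is why an IBDWT can be faster than a zero-padded FFT of the same size.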
Old 2010-11-01, 04:25   #52
Batalov
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

10010010101101₂ Posts

Quote:
Originally Posted by rogue
You can d/l the latest release for Windows, MacIntel, and Linux from here: http://sourceforge.net/projects/openpfgw/

The updates are for 64-bit PFGW users. A bug was found and fixed in the factoring code. For Linux, the binary is now statically linked.
I have a small bug. Run pfgw64 (linux), kill it somewhere; then replace the input file (with something else), restart and it reports:

***WARNING! file sr_10.pfgw line 2378 does not match what is expected.
Expecting: 10001001*10^11441+1
File contained: 1001001*10^25534+1
Starting over at the beginning of the file

10001001*10^25535+1 is composite: RES64: [AD505C1D89295440] (24.7044s+0.0002s)
...

Starting over at the beginning of the file is, of course, the usual and in this case desired behavior. But it doesn't do that: it only says that it will, and instead continues from the middle of the file (i.e. the saved line number is not zeroed). This seems to be new (something uninitialized in the 64-bit version?); it worked fine before.
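The behavior expected here can be sketched in a few lines of Python. This is purely illustrative and not PFGW's actual resume code; `resume_index` and its parameters are hypothetical names. The saved position is used only if the file still contains the expected expression there, and otherwise the position really is reset to the start.

```python
def resume_index(lines, saved_index, saved_expr):
    """Return the 0-based line index at which processing should resume."""
    in_range = 0 <= saved_index < len(lines)
    if in_range and lines[saved_index].strip() == saved_expr:
        return saved_index  # the file still matches the checkpoint, so carry on
    # Mismatch: warn, and make sure the position is actually reset.
    print("***WARNING! file does not match what is expected.")
    print("Starting over at the beginning of the file")
    return 0

# The input file was replaced after the checkpoint was written, so the check fails.
new_file = ["1001001*10^25533+1", "1001001*10^25534+1", "1001001*10^25535+1"]
print(resume_index(new_file, 1, "10001001*10^11441+1"))
```

The reported bug corresponds to printing the warning but still returning the saved index instead of 0.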
Old 2010-11-01, 14:56   #53
rogue
 
"Mark"
Apr 2003
Between here and the

6,277 Posts

Quote:
Originally Posted by Batalov
I have a small bug. Run pfgw64 (linux), kill it somewhere; then replace the input file (with something else), restart and it reports:

***WARNING! file sr_10.pfgw line 2378 does not match what is expected.
Expecting: 10001001*10^11441+1
File contained: 1001001*10^25534+1
Starting over at the beginning of the file

10001001*10^25535+1 is composite: RES64: [AD505C1D89295440] (24.7044s+0.0002s)
...

Starting over at the beginning of the file is, of course, the usual and in this case desired behavior. But it doesn't do that: it only says that it will, and instead continues from the middle of the file (i.e. the saved line number is not zeroed). This seems to be new (something uninitialized in the 64-bit version?); it worked fine before.
This was something I broke when trying to address a crash with ABC2 files. I'll have to look into another fix for that problem.
Old 2010-11-04, 21:39   #54
rogue
 
"Mark"
Apr 2003
Between here and the

6,277 Posts
PFGW 3.4.4 Released

You can d/l the latest release for Windows, MacIntel, and Linux from here: http://sourceforge.net/projects/openpfgw/

This fixes a factoring problem on Win64 and fixes the ABC resume problem. I believe that there is still an ABC2 crashing problem, but I can't recall how to reproduce it. I had to revert my earlier change for that crash in order to correct the ABC resume problem.
Old 2010-11-26, 08:38   #55
Batalov
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

41×229 Posts
Konyagin-Pomerance extension

In PFGW, the N-1 Brillhart-Lehmer-Selfridge test implements the eponymous 1975 algorithm, but would it be hard to extend it with the third-magnitude stage, the Konyagin-Pomerance extension (as on pages 176-178 of Crandall/Pomerance PN-ACP, Theorem 4.1.6)? Part (1) seems no different from the square test of the second-magnitude stage, and the same code would be called six times with minor variations, but part (2) needs a bit of implementation. There's a GP prototype available; it needs polroots() for a cubic polynomial, and contfrac() would have to be rewritten.

Was this ever requested before? Could I possibly help? (with a disclaimer that familiarizing with the code could take much more time than "just doing it" for an experienced developer, i.e. Mark )
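For reference, here is a minimal Python sketch of the second-magnitude square test mentioned above, assuming the usual statement of the Brillhart-Lehmer-Selfridge theorem (Theorem 4.1.5 in Crandall/Pomerance): with n - 1 = F*R, F fully factored, n^(1/3) <= F < n^(1/2), and condition (4.1.2) verified for some base a, write n = c2*F^2 + c1*F + 1; then n is prime if and only if c1^2 - 4*c2 is not a square. The function names are hypothetical, this is not PFGW's implementation, and the Konyagin-Pomerance stage itself is not shown.

```python
from math import gcd, isqrt

def satisfies_4_1_2(n, f_primes, a):
    """Condition (4.1.2): a^(n-1) = 1 (mod n) and gcd(a^((n-1)/q) - 1, n) = 1 for each prime q | F."""
    if pow(a, n - 1, n) != 1:
        return False
    return all(gcd(pow(a, (n - 1) // q, n) - 1, n) == 1 for q in f_primes)

def bls_square_test(n, f):
    """True if c1^2 - 4*c2 is not a square, i.e. n is prime provided (4.1.2) already holds."""
    if (n - 1) % f != 0 or not (f ** 3 >= n and f * f < n):
        raise ValueError("need F | n-1 and n^(1/3) <= F < n^(1/2)")
    c2, rem = divmod(n, f * f)   # n = c2*F^2 + (c1*F + 1)
    c1 = rem // f
    disc = c1 * c1 - 4 * c2
    return disc < 0 or isqrt(disc) ** 2 != disc

# Small example: n = 1009, n - 1 = 2^4 * 63, so F = 16 with the single prime factor 2.
n, f = 1009, 16
a = next(a for a in range(2, 100) if satisfies_4_1_2(n, [2], a))
print(a, bls_square_test(n, f))
```

If n = (x*F + 1)(y*F + 1) were composite, then c2 = x*y and c1 = x + y, so the discriminant would be the square (x - y)^2; this is the case the test is designed to detect.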

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.