![]() |
![]() |
#1 |
Nov 2010
52 Posts |
![]()
I have implemented a version of PFGW which supports multi-threading, essentially along the same lines as LLR i.e. adding a gwset_num_threads() call between gwinit() and gwsetup(). The code is working correctly i.e. giving the correct answers, but does not always give speedup. If anyone wants to play, the patch is here: https://pastebin.com/xw7ykJbM (note that the patch also works with the most recent gwnum, since it removes the use of the deprecated gwmap_to_fft_info() function). Multiple threads are enabled using the -T<number> command line argument.
For example, if I run a PRP test on a large factorial prime 208003!-1 I find that going from 1 to 2 threads I get a 1.6x speedup, which is comparable to what I get using the LLR program to test 387*2^3322763+1, which is a Proth of roughly the same digit length. Code:
../pfgw64 -q'208003!-1' -V -T2 PFGW Version 3.8.3.64BIT.20161203.Mac_Dev [GWNUM 29.2] Generic modular reduction using generic reduction FMA3 FFT length 336K, Pass1=448, Pass2=768, clm=4, 2 threads on A 3374558-bit number ./llr64.3.8.20 -q387*2^3322763+1 -d -t2 Starting Proth prime test of 387*2^3322763+1 Using all-complex FMA3 FFT length 256K, Pass1=128, Pass2=2K, 2 threads, a = 5 Code:
../pfgw64 -q'387*2^3322763+1' -V -T2 PFGW Version 3.8.3.64BIT.20161203.Mac_Dev [GWNUM 29.2] Special modular reduction using all-complex FMA3 FFT length 240K, Pass1=1280, Pass2=192, clm=2, 2 threads on 387*2^3322763+1 ../pfgw64 -q'387*2^3322763+1' -V -t -T2 PFGW Version 3.8.3.64BIT.20161203.Mac_Dev [GWNUM 29.2] Primality testing 387*2^3322763+1 [N-1, Brillhart-Lehmer-Selfridge] Running N-1 test using base 5 Special modular reduction using all-complex FMA3 FFT length 240K, Pass1=1280, Pass2=192, clm=2, 2 threads on 387*2^3322763+1 Code:
../pfgw64 -q'387*2^3322763+1' -V -T2 PFGW Version 3.8.3.64BIT.20161203.Mac_Dev [GWNUM 29.2] Generic modular reduction using all-complex FMA3 FFT length 240K, Pass1=1280, Pass2=192, clm=2, 2 threads on 387*2^3322763+1 - Iain |
![]() |
![]() |
![]() |
#2 |
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
3·7·479 Posts |
![]()
As soon as you are in the 'Generic modular reduction using generic reduction' territory, by definition you have PFGW doing things differently to LLR. Neither LLR nor P95 have an equivalent use mode.
|
![]() |
![]() |
![]() |
#3 |
"Mark"
Apr 2003
Between here and the
23·13·67 Posts |
![]()
The term "Special" or "Generic" txt is output depending upon the call used to gwnum to set up the modular reduction. gwsetup() vs gwsetup_general_mod()/gwsetup_general_mod_64(). The text following "using" is the output produced by a call to gwmodulo_as_string().
|
![]() |
![]() |
![]() |
#4 |
Nov 2010
318 Posts |
![]()
That's why I'm confused - it doesn't seem to be as simple as generic/special reduction controlling whether the multithreaded FFT performs well or not e.g.
So there is some other effect at work here, which I don't quite understand at this point. I wonder if George has any insight? |
![]() |
![]() |
![]() |
#5 |
Jun 2003
2×2,719 Posts |
![]()
Iain, can you post the output (showing the FFT) from both single thread and 2-thread runs for the Proth number? Also the iteration timings and the CPU usage as well for both of these?
What kind of CPU is it? If it is HT, are both the threads of the 2-thread run somehow running on the same physical core? Last fiddled with by axn on 2017-09-29 at 11:32 Reason: Clarified "output" |
![]() |
![]() |
![]() |
#6 |
"Mark"
Apr 2003
Between here and the
23·13·67 Posts |
![]()
I applied the change and didn't see any improvement either. Did you look at how llr implemented it?
|
![]() |
![]() |
![]() |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
PFGW GUI vs CMD | houding | Software | 1 | 2016-06-20 12:11 |
Which Work Types are Multithreaded | tului | Software | 6 | 2015-11-28 21:59 |
PFGW 3.3.6 or PFGW 3.4.2 Please update now! | Joe O | Sierpinski/Riesel Base 5 | 5 | 2010-09-30 14:07 |
Feature request: multithreaded polsel | Andi47 | Msieve | 1 | 2010-02-20 01:16 |
Multithreaded QS/NFS sieve stage | nuggetprime | Msieve | 5 | 2008-08-11 07:48 |