#782 |
"Mark"
Apr 2003
Between here and the
1C4A₁₆ Posts |
#783 |
Random Account
Aug 2009
Not U. + S.A.
101011011100₂ Posts |
#786 |
"Mark"
Apr 2003
Between here and the
16112₈ Posts |
#787 | |
Random Account
Aug 2009
Not U. + S.A.
ADC₁₆ Posts |
Several years ago, when I was running LLR's, I used the srsieve group for several months. The command-line switches are different with this srsieve2. There were two min/max parameters then. I only see one now. My RTX 2080 supports OpenCL, but using it did not seem to make much difference in throughput. I will have to do more experimentation with this, and others.
#788 | |
"Mark"
Apr 2003
Between here and the
2·3·17·71 Posts |
srsieve2cl supports OpenCL; srsieve2 does not. srsieve2cl will start using the GPU when p > 1e6. I have been using -g32, as that gives better rates than the default of -g8. With thousands of sequences you might need to use -K, or -b with -K. You can also play around with -U, -V, and -X. You will likely need to use -M at lower p due to the higher factor density. The program will tell you if -M needs to be changed for the range.

Unfortunately the program does not "auto-tune" to come up with the best values for these parameters. I recommend that you find a fixed range that takes at least one minute to sieve, then create a script that runs that range multiple times with different values for those switches. When done, look at srsieve2.log to see which combination was the best.
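A minimal sketch of such a tuning script, in Python. It only builds one command line per combination of switch values and, optionally, times each run. The executable path, the base arguments that select the fixed range and input file, and the candidate values for -K are placeholders (assumptions, not recommendations from this thread); only -g8 and -g32 are values mentioned above. Wall-clock timing here is a stand-in for reading the rates out of srsieve2.log.

```python
import itertools
import shlex
import subprocess
import time


def build_commands(base_cmd, grid):
    """Return one argument list per combination of switch values in `grid`.

    `grid` maps a switch (e.g. "-g") to the list of values to try.
    """
    switches = list(grid)
    commands = []
    for values in itertools.product(*(grid[s] for s in switches)):
        extra = [f"{s}{v}" for s, v in zip(switches, values)]
        commands.append(list(base_cmd) + extra)
    return commands


def time_commands(commands, run=subprocess.run, clock=time.perf_counter):
    """Run each command once and return (seconds, command) pairs, fastest first."""
    results = []
    for cmd in commands:
        start = clock()
        run(cmd, check=True)
        results.append((clock() - start, shlex.join(cmd)))
    return sorted(results)


if __name__ == "__main__":
    # Placeholder: append the switches that select your fixed range and input file.
    base = ["./srsieve2cl"]
    # -g8 is the default and -g32 is the value used above; the -K values are hypothetical.
    grid = {"-g": [8, 16, 32], "-K": [1, 2, 3]}
    for cmd in build_commands(base, grid):
        print(shlex.join(cmd))  # dry run: print only, nothing is executed
```

Once the printed command lines look right, passing the list to `time_commands` runs them one after another; the same grid can be extended with -U, -V, -X, or -M in the same way.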
#789 | |
Random Account
Aug 2009
Not U. + S.A.
2²×5×139 Posts |
#790 | |
"Mark"
Apr 2003
Between here and the
1110001001010₂ Posts |
#791 |
"Mark"
Apr 2003
Between here and the
1C4A₁₆ Posts |
Here are some relative speeds for the programs. I used S750 from CRUS as the base for the sequences to be tested. I pre-sieved to 1e9. These times (in seconds) are for sieving from 1e9 to 2e9 with the default values for -g and -w. The CPU code ran on an i9-11950H and the GPU code ran on an NVIDIA RTX A5000.
| Sequences | sr1sieve | sr2sieve w/Leg | sr2sieve wo/Leg | srsieve2 w/Leg | srsieve2 wo/Leg | srsieve2cl w/Leg | srsieve2cl wo/Leg |
|---|---|---|---|---|---|---|---|
| 1 | 54 | n/a | n/a | 65 | 218 | 30 | 30 |
| 10 | n/a | 247 | 282 | 801 | 994 | *** | 91 |
| 100 | n/a | 1214 | 1580 | 3645 | 4198 | *** | 319 |

*** -> uses generic sieving logic in the GPU, which does not support Legendre tables for multiple sequences

In the future I will add Legendre support in the GPU when using multiple sequences, but I'm not certain how much of a benefit it will have, especially when one has hundreds of sequences.

Last fiddled with by rogue on 2022-12-01 at 22:13
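As a rough illustration of what these timings imply, the short Python sketch below divides the srsieve2 (CPU) times by the srsieve2cl (GPU) times, both without Legendre tables. It assumes the first column of the table is the number of sequences; the variable and function names are made up for this example.

```python
# Times in seconds for sieving 1e9 to 2e9 without Legendre tables, copied
# from the table above. Keys are the assumed number of sequences per row.
srsieve2_cpu = {1: 218, 10: 994, 100: 4198}
srsieve2cl_gpu = {1: 30, 10: 91, 100: 319}


def speedup(cpu_seconds, gpu_seconds):
    """Return CPU time divided by GPU time for each sequence count."""
    return {n: cpu_seconds[n] / gpu_seconds[n] for n in cpu_seconds}


ratios = speedup(srsieve2_cpu, srsieve2cl_gpu)
for n, r in ratios.items():
    print(f"{n:>3} sequence(s): srsieve2 / srsieve2cl = {r:.1f}x")
```

On this particular hardware pair the ratio works out to roughly 7x for one sequence, 11x for ten, and 13x for one hundred; other CPU and GPU combinations will give different numbers.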
![]() |
![]() |
![]() |
#792 | |
Random Account
Aug 2009
Not U. + S.A.
2²×5×139 Posts |
Many thanks!