20060629, 04:47  #1 
Mar 2003
New Zealand
13×89 Posts 
Faster sieving with NewPGen and srsieve
Here is a way to increase sieve throughput that works with NewPGen (for bases other than 2) and srsieve. Take this base 5 sequence that I am currently sieving as an example:
52922*5^n1, 50,000 < n < 131072 It happens that all of the remaining terms in the sieve have n that are multiples of 4, and so I could instead sieve the following sequence and find exactly the same factors (since 5^4=625): 52922*625^n1, 12500 < n < 32768 The difference is that the sieve range is 1/4 the size and so is quite a bit faster, roughly twice as fast with NewPGen, a bit less with srsieve because the benefits of the base 5 mulmod are lost. In general, if it is possible to write n in k*b^n+c as n=q*m+r for fixed q,r then I think it will be possible to increase sieve throughput by a factor of about sqrt(q) and reduce bitmap memory use by a factor of (q1)/q. Version 0.2.0 of srsieve implements the memory reduction, but I am still working on the increased throughput part :). For the remaining sequences of the base 5 project it looks like q=4 is a common case, which might mean throughput can be doubled and memory use reduced to 25% of the current requirements. axn1, could you run 'srfile c' (version 0.2.0) on the current sieve file? It will print the values n=q*m+r for each sequence and give an effective q value for the file as a whole. If this pans out, it might be good to list the q values along with the Nash weights of the remaining sequences, as finding a prime for a sequence with a small q value would increase the speed of the sieve more than one with a large q. 
20060629, 06:30  #2 
Jun 2003
2×3×827 Posts 
See the attached file for the stats.
Is it possible to split the sieve files into multiple pieces to maximise efficiency? 
20060629, 23:51  #3 
Jun 2005
3×11 Posts 
Can you take this idea further e.g. if you find that for a particular k all the n values are either 1 or 2 module 12 then you can split it into the sequences (k*5)*(5^12)^n1 and (k*25)*(5^12)^n1. I think this is how prothsieve does things.

20060630, 23:03  #4  
Mar 2003
New Zealand
13×89 Posts 
Quote:
Quote:
Quote:


20060703, 03:52  #5  
Mar 2003
New Zealand
13×89 Posts 
Quote:
For the base 5 sieve all the sequences end up in the form k*b^d*(b^48)^n+c where 0 <=d < 48. It works out quite a bit faster when there are only a couple of sequences in the sieve, but the more sequences I add the slower it gets, and when there are 350 sequences it actually runs at only 1/4 the speed it did before I made any changes :( The problem seems to be that my modular inverse function is extremely slow. I will need to find a faster way to calculate c/(k*b^d) (mod p) before this idea will work out. 

20060704, 20:19  #6  
Jun 2003
2·3·827 Posts 
Quote:
Last fiddled with by axn on 20060704 at 20:20 

20060704, 22:50  #7  
Mar 2003
New Zealand
13×89 Posts 
Quote:
The modular inverse was not a problem after all, instead of calculating 1/(b^d) for each d, it is much faster to calculate 1/b once then calculate (1/b)^d for each d. 

20060706, 23:32  #8 
Mar 2003
New Zealand
13×89 Posts 
Just a note that the reason that a base less than 5^48 was faster was due to inefficiencies in my code that affected the larger bases more than the small ones. Now that I have fixed the worst of these, the speed increase seems to be converging to the expected sqrt(4.79) figure.

20060707, 01:19  #9 
Jun 2005
21_{16} Posts 
Have you tried using a value of Q larger than 48? It could possibly increase the sieve speed further.

20060707, 22:56  #10  
Mar 2003
New Zealand
485_{16} Posts 
Quote:
The method srsieve uses to split up the base 5 sequences into base 5^Q subsequences is to express k*5^n+/1 as k*5^(qm+r)+/1 for fixed q,r. If q is maximised then for the Base 5 project sieve that leads to a maximum Q=48 since every remaining term happens to have a q that divides 48. However it is possible to apply that same method to the base 5^48 sequences and get subsequences in even larger bases. I have added this capability to srsieve 0.3.3 for experiment. If you specify limitbase=X with a value of X greater than the Q selected by srsieve in the first pass, it will continue splitting up the sequences until either the base reaches b^X or the base cannot be increased further. For example with the Base 5 project sieve you could use multiples of 48 for X to get higher bases. I haven't tested this much, but none of the few cases I tried resulted in a speedup. If you find a case where using limitbase=X results in better throughput than the default, I would be interested to hear. There is one good reason why increasing Q will eventually result in reduced throughput: The work in the sieve has to be spread evenly between the baby steps and the giant steps to get good performance, but as the number of sequences increases the number of giant steps has to be reduced to keep that balance. Eventually only one giant step can be done, and it is not possible to rebalance by reducing the giant steps further. With the Base 5 project sieve we are already down to 4 giant steps with Q=48. 

20060708, 00:38  #11  
Jun 2005
3×11 Posts 
Quote:
I'm away for a while now, so won't have a chance to play with the variable Q option for a the next fortnight or so. 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Parallel sieving with newpgen  fivemack  And now for something completely different  3  20170516 17:55 
Sieving twins with srsieve  henryzz  Twin Prime Search  0  20140318 12:44 
Is TPSieve0.2.1 faster than Newpgen?  cipher  Twin Prime Search  4  20090518 18:36 
Sieving multiple NewPGen files in a row(How?)  jasong  Software  18  20051230 20:23 
Sieving with NewPGen  Cruelty  15k Search  41  20051027 10:28 