mersenneforum.org How do you efficiently sieve for prime 3/4-tuples?
 Register FAQ Search Today's Posts Mark Forums Read

2012-04-17, 11:13   #34
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2·2,909 Posts

Quote:
 Originally Posted by axn I am out of ideas Perhaps you can save/restore registers in each asm block. Try adding "push " statements at the beginning of each asm block and do a "pop " at the end. Save/restore all registers except eax & edx. I am thinking of rewriting the whole thing in C without asm. They look correct. Note that for sextuples, you can include 7 (30 becomes 210) also into the mix and thus be 7 times faster.
Saving and restoring registers will require a bit of thought as I am not very up on asm. Could there be a problem because your asm is 32-bit while I am compiling for 64-bit?

I think I have done the change from 30 to 210 correctly. This will probably limit n to >210 now.

If you rewrite can I suggest you have primes in blocks as well as ranges of k. Currently those arrays use lots of memory if sieving to 1G(in fact can't go quite that far for 6-tuples). Would it be possible to have portions of ndx in the L2 cache as well as sieve?
Attached Files
 Sex_2.zip (2.5 KB, 99 views)

Last fiddled with by henryzz on 2012-04-17 at 11:31 Reason: forgot attachment

2012-04-17, 13:45   #35
axn

Jun 2003

23·607 Posts

Quote:
 Originally Posted by henryzz Could there be a problem because your asm is 32-bit while I am compiling for 64-bit?
Could be. Can you try moving the declaration of pre_sieve and sieve before the declaration of sprime?

Quote:
 Originally Posted by henryzz I think I have done the change from 30 to 210 correctly. This will probably limit n to >210 now.
There needs to be additional modular considerations to correctly implement this. And some other minor changes in the index.

Quote:
 Originally Posted by henryzz If you rewrite can I suggest you have primes in blocks as well as ranges of k. Currently those arrays use lots of memory if sieving to 1G(in fact can't go quite that far for 6-tuples). Would it be possible to have portions of ndx in the L2 cache as well as sieve?
Because of the way the sieving works, it requires that much memory. The good news it that, for the 5- and 6-tuple versions, there really is no point in further sieving by NewPGen. You can directly feed the output to LLR. And you don't need to sieve that deep with the program itself, either.

2012-04-17, 14:50   #36
axn

Jun 2003

23×607 Posts

Quote:
 Originally Posted by axn There needs to be additional modular considerations to correctly implement this. And some other minor changes in the index.
You need to make the following changes for the 6-tuple version:

Code:
n_mod: array[0..11] of byte = (207,51,183,39,177,141,123,9,57,81,93,99);
Code:
nm := n_mod[ THE_N mod 12 ];
and in init_indexes, first line:
Code:
i:=5;

2012-04-18, 23:56   #37
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

10110101110102 Posts

Quote:
 Originally Posted by axn You need to make the following changes for the 6-tuple version: Code: n_mod: array[0..11] of byte = (207,51,183,39,177,141,123,9,57,81,93,99); Code: nm := n_mod[ THE_N mod 12 ]; and in init_indexes, first line: Code: i:=5;
Didn't realize I had to change to mod 12 from mod 4 as well as 30 to 210.

 2012-04-19, 11:47 #38 biwema     Mar 2004 17D16 Posts Hi Peter Congratulations for your new discovery of the new quadruplet. This time you were quite lucky again: At k*2^9999 (k odd) there is on average a 4-tuple every 375T. I am also searching for a 4-tuple for some time. I my case I am sieving candidates of the form k* 2^10000. Our Name is similar and we have similar ideas… NewPGen I also use NewPGen for sieving and it is true that it cannot allocate a lot of memory. Besides that the algorithms are not optimal and there was no update since a long time. I assume that Jocelyn implemented NewPGen when he was a student and now he has a job, family and no time anymore. I did not experience that NewPGen can allocate more memory beyond 1G with the combined sieve. Could be that it is the case if you have more than 100M candidates. But in that case I would not be fast anyway. Below 1G, it uses a bitmap with one bit per entry. That is the fastest mode in NewPGen. After combining the sieve it uses an array or fast array. The array needs 8 Byte for one candidate and it performs a binary lookup if a sieved candidate is in the list. That means Log2(Number of candidates) lookups every time. If there is at least twice as much memory available, it uses fast array, which is a hashtable. In the best case one lookup is enough; in case of hash collisions it needs more. One of the worst disadvantages is, that NewPGen searches for k*2^n instead of k*15*2^n. Otherwise the array would be 15 times more dense. I will talk more about algorithmic improvements in a later post. I did not yet check, which improvement axn has already implemented in his code. Primorial or Power of 2? About 10 years ago I was searching for triplets and was using k*2^n with constant n and was quite alone with that. There was the common opinion that primorial are more efficient than simple powers. The Power of 2 is a special form where it much more efficient to sieve and prp the candidates. On the other hand, primorial are a product of the smallest primes so that they don’t need to be sieved anymore. So the density of candidates is much higher and a smaller range needs to be searched. I experienced, that powers of 2 is faster for twins and 3-tuple; 4-tuple are also faster, but the difference is not big anymore. For 5 and higher tuple I would recommend primorial. Some rough calculations about the expectations: 3 Tuple: Size: 10000 digits PRP test: 2 Seconds Optimal NewPGen Sieve depth: 10T Remaining candidates after sieving a range of 1T to 1G: 52500000 Range per one tuple: 3.5 T 4 Tuple: Size: 3030 digits PRP test: 0.15 Seconds Optimal NewPGen Sieve depth: 0.6T Remaining candidates after sieving a range of 1T to 1G: 2000000 Range per one tuple: 375 T 5 Tuple: Size: 1100 digits PRP test: 0.02 Seconds Optimal NewPGen Sieve depth: 1G Remaining candidates after sieving a range of 1T to 1G: 130000 Range per one tuple: 6400 T 6 Tuple: Size: 560 digits PRP test: 0.005 Seconds Optimal NewPGen Sieve depth: 1G Remaining candidates after sieving a range of 1T to 1G: 6000 Range per one tuple: 146000 T 7 Tuple: Size: 301 digits Remaining candidates after sieving a range of 1T to 1G: 500 Range per one tuple: 730000 T 8 Tuple: Size: 218 digits Remaining candidates after sieving a range of 1T to 1G: 35 Range per one tuple: 9750000 T Even if you sieve 6-tuple only up to 1 Million, there are only 68000 candidates left, which can be prp tested within 6 minutes. It is not difficult to find a 3 tuple with 10000 digits; maybe 2 years with NewPGen and one core. But then primalty certification of the +5 is really hard. Depending on the software that could take some years.
2012-04-19, 20:12   #39
Puzzle-Peter

Jun 2009

683 Posts

Quote:
 Originally Posted by biwema It is not difficult to find a 3 tuple with 10000 digits; maybe 2 years with NewPGen and one core. But then primalty certification of the +5 is really hard. Depending on the software that could take some years.
Thanks for all the input! I looked for a quadruple also at n=10000 up to k=90T or so with no hit. Same with n=10001 to 10010 to k=10T each.

About proving the +5: The new version of PRIMO can use multiple CPUs when running under Linux. According to this table: http://www.ellipsa.eu/public/primo/primo.html a 10000 digit number should not take more than a few weeks on a fast machine.

I really hope somebody will extend the sieve to longer k-tuples as I am still trying to figure out some parts of the code. It's the main sieving procedure that I still just don't understand.

EDIT: By the way, how do you calculate your "Range per one tuple" figures? I get much higher numbers when I try to estimate the k range needed per tuple.

Last fiddled with by Puzzle-Peter on 2012-04-19 at 20:16

2012-04-20, 11:10   #40
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2·2,909 Posts

I have attached 5, 6 and 7
7 is back with mod 30 as -5 has been lost and I haven't worked out the numbers for mod 210 yet
The tuples we have programs for are:
3: -1 +1 +5
4: -1 +1 +5 +7
5: -1 +1 +5 +7 +11
6: -5 -1 +1 +5 +7 +11
7: -1 +1 +5 +7 +11 +17 +19
@axn Do you have a quick way of calculating the values for n_mod? My way so far has been really slow and isn't really suitable for 12. 4 was bad enough.
How large tuples do you want Puzzle-Peter?
Attached Files
 5and6and7.zip (7.8 KB, 97 views)

 2012-04-20, 12:39 #41 firejuggler     Apr 2010 Over the rainbow 1001111000012 Posts just a minor nitpick : when you sieve for 7-tuple the file is named "quinxxx.txt"
 2012-04-20, 14:54 #42 Puzzle-Peter     Jun 2009 683 Posts I am still trying to understand enough so I can do the modifications myself. There is some success, but I still don't get several things. 1) PreSieve = ... product of all primes <19 minus some of the smaller ones. Multiplying the missing ones seems to give the "step length" which is 30 or 210 in our examples here. But why <19 and not <23, <29 or whichever prime number? 2) Probably the largest question mark: how do you determine the length and the numbers of n_mod? axn posted the numbers for step 210. 12 of them which is 1 more than the +11 used for 6tuples. Is this the way to calculate the length of n_mod? How large should the tuples be? Well, this page http://sites.google.com/site/anthony...cts=1#largest4 lists tuples up to 23 numbers long. I don't understand the "there are no known 24tuples". Why not add the 17 to the 23-tuple? However, the longest non-trivial so far is a 19-tuple. Wouldn't it be fun to experiment in these ranges? I don't want to ask you to write this. As I said I hope I can get to the point where I am able to to this myself. EDIT: Did I get this right: When I don't modify n_mod and stick to 210 with the numbers axn posted, all I need to do is complete the init_indexes according to the pattern I'm looking for? As I understand it, modifying the step (6, 30, 210,...) together with n_mod is purely a performance thing? Last fiddled with by Puzzle-Peter on 2012-04-20 at 15:09
 2012-04-20, 16:26 #43 biwema     Mar 2004 5758 Posts 1) Presieve: (I am not completely sure, if axn uses the presieve for the same purpose, as I do). For me it is not necessary, but it is a way to improve the performance. When you sieve, you remove a candidate from the bitmap, add (for example) 7 remove the next candidate, add 7 and so on. When the prime you sieve is small, it takes a large number of steps to walk through the whole array. That’y why sieving small primes is slow (NewPGen is slow at the beginning). In this case it is easier to create an array of the size 7*11*13*17*19 * 8 (that it into bytes without remainder), sieve these first primes and take then this array as basis for future sieving. A large array (up to 23, 29 etc) is faster, when you sieve a big area, but takes longer to presieve. A combined approach is to take an array of 56 Bits (7 bytes), sieve the prime 7, then concat the resulting array 11 times (77 Bytes). Now you can sieve the prime 11, then concat the resulting array 13 times (1001 Bytes), sieve the prime 13, concat the resulting array 17 times and sieve the prime 17 etc. until you reach the final size. Maybe you want to keep the resulting array in case you want to sieve more than one range, in case one range does not fit into the memory. 2) n_mod: Indeed it is a performance thing (and you can keep the memory usage 15 or 105 times smaller) When you sieve 5-tuple, you have for example the form N + d, where d= -1, 1, 5, 7, 11. You can show that N needs the form 0 (mod 2) 0 (mod 3) 2 (mod 5) When you combine them together you get 12 (mod 30). For 4 tuple and 5 tuple you only need to sieve the candidates of the form 12 (mod 30) You can also show that primes need the form 4 or 5 (mod 7). That will result in 12 or 102 (mod 210). So you can either sieve (mod 30) or you generate 2 runs with 12 (mod 210) and 102 (mod 210). That is not faster but 7 times more candidates fit into memory When sieving 6-tuple (N + d, where d= -5, -1, 1, 5, 7, 11) then just 4 (mod 7) is remaining. In this case you can sieve mod 210. The next larger mod you get when sieveing 10-tuples with mod 2310.
 2012-04-20, 18:27 #44 firejuggler     Apr 2010 Over the rainbow 32×281 Posts where should i change Sprime to sieve a bit deeper for quad? (right now, quad10011_5_20... 32.3 M k ...at 1.22 B, advancing slooowly 1k every 0.2/0.1 sec)

 Similar Threads Thread Thread Starter Forum Replies Last Post Stargate38 And now for something completely different 2 2017-04-28 00:08 fivemack Math 27 2015-12-12 18:42 tapion64 GPU Computing 7 2014-04-10 06:15 Unregistered Information & Answers 2 2010-05-25 20:51 Sloth Prime Sierpinski Project 1 2006-05-10 02:02

All times are UTC. The time now is 00:14.

Sun Feb 28 00:14:03 UTC 2021 up 86 days, 20:25, 0 users, load averages: 2.08, 2.24, 2.20