 mersenneforum.org New Maximal Gaps
 Register FAQ Search Today's Posts Mark Forums Read  2023-05-20, 12:27 #89 Andrew Usher   Dec 2022 24·33 Posts OK, the range to reach the new gaps and a little beyond is 2^57, let's use that. At 10^8 per second that's 50 years, so too slow. As at most one division for each prime should be needed per sieve interval, using the biggest intervals you can (whatever fits in memory) speeds it up. I'm not sure how the divisions are being done now, but I'd think 64-bit assembly to be required for optimal performance - I don't think any high-level language implements full-length integer division, and I've used 32-bit assembly for my trial division codes, hence the limit to 32-bit divisors. Here 128/64 (64-bit DIV) would be really needed to do a division in one piece. I'd imagine the divisions should then take possibly a second per sieve interval. It's likely the rest of sieving would take longer, as it involves a pass over the whole of memory being used, so ultimately memory speed should be limiting. If you want 10^10 per second, you need a memory bandwidth that accommodates that length of sieve (at one odd integer per bit, 600MB/second). This should be possible.   2023-05-20, 17:22 #90 Cybertronic   Jan 2007 Germany 33×52 Posts I have an other way to store primes up to 4311711871. [sqrt (2^64+2^57)] This will take 143,7 MB RAM in total via Byte. Code: Every prime is 30k+1,7,11,13,17,19,23,29 . These are 8 points. 1 Byte have 8 Bits To encode primes 90 to 120, we have  at Array-position 3 To encode primes 120 to 150, we have  at Array-position 4 and so on 143,7MB * 30 => primes up to 4,31 Billion @mart_r: "Currently the division of the 65-bit offset number by the up-to-33-bit numbers is the bottleneck" This is not the problem, also for this I have a trick. You will sieve from 18446744074000000000 to 18446744075000000000 and you will find the starting point for p=997... - take 18446744074000000000\2=9223372037000000000 - 9223372037000000000 MOD 997 = 157 - the starting point for p=997 is 18446744074000000000+997-2*157=18446744074000000683, so 683+(997*2)x - is starting point even, so set +p p=4170000013 -> 4170000013-546082470*2=3077835073 ; 4170000013 times 18446744077077835073 To multiply by 2 ... maybe via SHR option ? Last fiddled with by Cybertronic on 2023-05-20 at 18:02   2023-05-21, 10:02   #91
mart_r

Dec 2008
you know...around...

3×5×59 Posts Quote:
 Originally Posted by Cybertronic You will sieve from 18446744074000000000 to 18446744075000000000 and you will find the starting point for p=997... - take 18446744074000000000\2=9223372037000000000 - 9223372037000000000 MOD 997 = 157 - the starting point for p=997 is 18446744074000000000+997-2*157=18446744074000000683, so 683+(997*2)x - is starting point even, so set +p p=4170000013 -> 4170000013-546082470*2=3077835073 ; 4170000013 times 18446744077077835073 To multiply by 2 ... maybe via SHR option ?
In VBA I work with short integers (a%, b% etc., ranging from -32768 to +32767), long integers (a&, b& etc., from -2^31 to +2^31-1) and floating point variables with 53 bits numerical precision. I've read about 64-bit integers somewhere (I believe they have a ^ sign after the variable), but so far haven't seen them being available. So I have to split the division into several steps anyway, that's why it's so slow. Like Slowpokes.   2023-05-21, 14:52 #92 kog67   Jan 2022 22 Posts I am trying to expand the range of my sieve to about 25*2^64. I am using a segmented sieve, with sieving primes stored in buckets. The sieve is mod 30, so 1 byte = 30n+[1,7,11,13,17,19,23,29]. It takes 1.5-2 s. to sieve an interval of 10^9, just below 2^64. My guess is that i will have almost same speed above 2^64. The main reason for this, is the use of buckets. The starting point for each prime is only calculated once, and then stored in the right bucket. When sieving, i only use the primes in the buckets assosiated with the current segment, and move the primes to the next bucket. This way i can skip using a large prime for many segments. I am using segment size 512k bytes. Any sieving prime close to 4*10^9 will skip at least 127 segments. ( (4*10^9)/(2*30*512k)). The bottleneck is storing the primes, using 8 bytes for each prime. that is at least 1.6 gb of RAM. ( and up to 8 gb, when close to 25*2^64 ) But this is just a free time project, competing with a lot of other interests. So the progress is slow. One second concern, is that first occurrences and maximal gaps are very rare above 2^64. I did not count this, but below 2^64 there are ~750 first occurrences, and i estimated there to be ~15-25 first occurrences between 2^64 and 2^65. So even with fast algoritms there will only be a small gain in known first occurrences.   2023-05-21, 15:36 #93 Cybertronic   Jan 2007 Germany 33·52 Posts @kog67 To store alle primes for 25*2^64 take 715 MB, not 8GB. Another way is, to store differences between primes-> 3,2,2,4,2,... This will take 2Byte per prime-> ~ 1.8GB in total.   2023-05-21, 18:20   #94
kog67

Jan 2022

22 Posts Quote:
 To store alle primes for 25*2^64 take 715 MB, not 8GB.
I can't store primes like that, when using buckets.

A simple segmented sieve, is 3 nested loops like this:

1) loop through all segments
2) Loop through all primes < sgrt( end of segment )
3) loop through all positions inside segments that divide the primenumber
this is slow, when starting at a large number, using a small segment. in step 2, large amounts of primes will miss the segment, and the starting position will be calculated repeated times until you reach the correct segment.
But using a larger segment is not cache-friendly, resulting in slow memory access.

Improving with buckets:
1) Initialize buckets
- loop through all primes, and calculate start position
- calculate witch segment it belongs ( Segment id )
- Calculate position within that segment.
- Save prime number, and position within segment in Bucket[ Segment id ]

in step 2) we loop through primes only in Bucket[ Segment id ]
in this loop the bucket has to be loaded, processed, and distributed to next buckets
That is the reason the prime number has to be saved, along with its position in the next segment

an example:

Start of sieve = 10^10
Segment size = 1000
only odd numbers are considered

Prime = 99707

First multiple is 10^10 + 113565
We save prime 99707, and position 565 in bucket[ 113 ]
Sieve segments 0..112 without the need to consider this prime
in Segment 113 we cross of position 565
we find next position : 565 + 2*99707 = 119979
Next bucket for this prime is 113 + 119 = 312
Move 99707 with position 979 to bucket[ 312 ]

After the initializing no division or modulo is calculated

in my actual implementation, i am using 32 bits to save the prime number divided by 30, and then 3 bits to mark the class of the prime modulo 30. ( 0=1, 1=7, 2=11...7=29 ).24 bits is used for position inside segment, and 3 bits is used for current multiple modulo 30. ( if we cross off M * prime, and m =1 (mod 30 ), then the next cross off is (M+6) * prime, and i need to store this info in the bucket ) 62 bits used, 2 dummy bits.   2023-05-22, 19:04   #95
mart_r

Dec 2008
you know...around...

88510 Posts Quote:
 Originally Posted by kog67 I am trying to expand the range of my sieve to about 25*2^64. Great to see you're still in the game!   2023-05-24, 11:56 #96 Andrew Usher   Dec 2022 24·33 Posts I see some possibly good ideas here - Cybertronic's proposal of dropping the last bit in division could avoid 64-bit DIV for all but the small number of 33-bit primes, which may be faster and even might make it possible to still do this in 32-bit code. The last, describing the use of 'buckets', would need testing. Small segments are inefficient, as he admits, and it's not clear that using buckets would overcome this. While it's true that a long segment is 'not cache friendly' I'm not sure that any alternative is, which is why I stated that the speed of main memory is ultimately going to limit sieving. The list of primes and their buckets won't fit in a cache any more than a long sieve segment would; the idea I guess is that at least the first can be scanned linearly only once, but that applies for each (shorter) segment, which would appear to destroy the advantage. And again I repeat that I'd need to know that this would be accepted as an official verification (and who is making that decision) - certainly with a large investment of my and my computer's time, I don't want to have repeated what happened with F22 (https://mersenneforum.org/showpost.p...6&postcount=38) and only be ignored and mocked.   Thread Tools Show Printable Version Email this Page Similar Threads Thread Thread Starter Forum Replies Last Post Bobby Jacobs Prime Gap Searches 7 2022-08-28 12:12 Bobby Jacobs Prime Gap Searches 52 2020-08-22 15:20 Bobby Jacobs Prime Gap Searches 5 2019-03-17 20:01 robert44444uk Prime Gap Searches 1 2018-07-10 20:50 gd_barnes Riesel Prime Search 11 2007-06-27 04:12

All times are UTC. The time now is 03:33.

Sun May 28 03:33:10 UTC 2023 up 283 days, 1:01, 0 users, load averages: 0.85, 0.98, 0.93