mersenneforum.org  

Go Back   mersenneforum.org > Extra Stuff > Miscellaneous Math

Reply
 
Thread Tools
Old 2022-07-16, 10:46   #1
User140242
 
Jul 2022

22·13 Posts
Default Sieve Wheel

To reduce the memory size of the storage of prime numbers I had the idea of ​​using modular arithmetic and then choosing a suitable base bW < sqrt(n) and storing prime numbers greater than p_s, with p_1 * p_2 *...* p_s <= bW .


Taking RW a vector, of length nR, of the possible remainder modulo bW not divisible for prime p with p <= p_s
it is possible to store in an array PRIMES of dimension nR*n/bW+1 the prime numbers greater than p_s and less than n.


If PRIMES[i,j]=1 then RW[i]+bW*j is prime.


This way you can use a wheel sieve to find prime numbers with this algorithm:


In the Boolean array SIEVE of size nR*(n//bW+2) initially all set to true multiples of primes p=RW[j]+bW*k can be set to false so:


Code:
for (k=1; k<=sqrt(n)/bW+1; k++)
     for (j=0; j<len(RW); j++)
         if(SIEVE[j,k])
         {
             for (i=0; i<len(RW); i++)
             {
                  m_min=bW*k*k+k*C1[i,j]+C2[i,j];
                  for (m=m_min; m<n/bW+2; m+=RW[j]+bW*k)
                      SIEVE[i,m]=false;
            }
         }
where the coefficients C1 and C2 are two arrays of size nR*nR obtained in this way:


for each RW[j] finding RW[x] such that (RW[j]*RW[x])%bW=RW[i] then C1[i,j]=RW[j]+RW[x]


and if (RW[j]*RW[x])%bW ==1 then C2[i,j]=(RW[j]*RW[x])//bW


otherwise C2[i,j]= 1 + (RW[j]*RW[x])//bW



Is this procedure already known?


Can it be useful for making a quick sieve?




------------
Here you can find some examples: https://gist.github.com/user140242

Last fiddled with by User140242 on 2022-07-16 at 11:09
User140242 is offline   Reply With Quote
Old 2022-07-19, 11:21   #2
User140242
 
Jul 2022

22×13 Posts
Default

In the previous post I forgot that for convenience RW[i] <= 1 and bW must be divisible by the prime numbers p with p <= p_s to find possible remains.



The sieve wheel can be adapted to Mersenne numbers using a base bW = 8*P.

In this example https://www.mersenneforum.org/showpo...4&postcount=21 the multiples of the primes less than min(bW, 524288) are eliminated, which is greater than 40000 as indicated here https://www.mersenne.org/various/math.php .

Last fiddled with by User140242 on 2022-07-19 at 11:33
User140242 is offline   Reply With Quote
Old 2022-07-27, 08:56   #3
User140242
 
Jul 2022

22·13 Posts
Default

I tried looking for something similar but for example in gpusieve.cpp of mfacto I saw something like this still used:


Code:
for (i=1; i<=sqrt(n)/2; i++)
    if (SIEVE[i]){
        p=2*i+1;
        for (m=i+p; m<n/2+1; m+=p)
           SIEVE[m]=false;
    }
where is it sieve_size=n/2+1.





Instead using this:
Code:
for (i=1; i<=sqrt(n)/6+1; i++){
    if (SIEVE5mod6[i]){
        p=6*i-1;
        for (m=6*i*i; m<n/6+1; m+=p)
           SIEVE5mod6[m]=false;
        for (m=6*i*i-2*i; m<n/6+1; m+=p)
           SIEVE1mod6[m]=false;
    }
    if (SIEVE1mod6[i]){
        p=6*i+1;
        for (m=6*i*i; m<n/6+1; m+=p)
           SIEVE5mod6[m]=false;
        for (m=6*i*i+2*i; m<n/6+1; m+=p)
           SIEVE1mod6[m]=false;
    }
}
with almost the same number of iterations and the same complexity but with less memory used (2* sieve_size=n/6+1)
should result in a better result in terms of speed. I'm wrong?

Last fiddled with by User140242 on 2022-07-27 at 08:57
User140242 is offline   Reply With Quote
Old 2022-07-27, 11:23   #4
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·983 Posts
Default

In post one you refer to using Boolean arrays. https://www.scaler.com/topics/boolean-in-c/ indicates those use a byte for storage.
Mfaktx use bit maps for sieving for maximum storage efficiency. I don't know whether that was driven by capacity limitations of GPU ram in circa 2009 GPUs or imposes any performance penalty.
Some versions maxed out at gpusievesize 128M bits ~ 16MiB.

Current maximum gpusievesize is 2047M bits ~256 MiB, limited by max magnitude of a signed 32 bit int variable used in bit position computations, 231-1.

All reasonably modern GPUs have considerably more VRAM than that, generally orders of magnitude.
On the faster GPUs (~RTX2080) with the current bit mapped storage there are indications there would be slightly more performance if it was rewritten to unsigned to allow up to 4095M bits ~512MiB.


In post 3 IIUC you paraphrase the mfakto code for sieving small primes. Those are done once per run, and are relatively few, so its performance optimization is not necessary.
(I may be missing some factors of two or more in the following, but I don't think it changes the conclusion.)

We want to check for factors of some substantial Mersenne number, in a range of trial factor candidates 2start to 2stop, say for example 275 to 276 on M(~110,000,000). Since only 1 or 7 mod 8 may be factors, that's
~275/110000000/4 ~ 86.E12 possible factor candidates, of which a more-classes wheel generator "only" produces ~ 1/2 2/3 4/5 6/7 10/11 ~0.208 or ~17.8E12.

It produces more performance in mfaktx to allow a relatively few composite candidates to slip through to be tried as potential factors, than to completely sieve all ~17.8E12 candidates. So it doesn't sieve the factor candidates all the way to sqrt(candidate) ~237.5 to 238.
Instead it goes to ~256M to ~4G dependent on program limits & on GPU&operands-specific tuning for performance.
Small-prime limits observed in performance tuning are ~100k values typically, so values up to ~220. Sieving those ~105 values once, up to a suitable small limit <~1000 is relatively quick, so optimizing its comparatively small amount of work is not worthwhile.

Last fiddled with by kriesel on 2022-07-27 at 12:20
kriesel is offline   Reply With Quote
Old 2022-07-27, 11:56   #5
User140242
 
Jul 2022

22×13 Posts
Default

Quote:
Originally Posted by kriesel View Post
In post one you refer to using Boolean arrays.


Yes, in the first post I used Boolean arrays only to understand how it works, of course it must be transformed through the use of bits to make it more efficient.


What I noticed however in the various sieves used that the final size of the SIEVE is always n or even the prime numbers are stored in a second array. But using a sieve wheel described by me this is not necessary because by choosing an appropriate base bW the space used to store the data is always less and the complexity does not vary much and the number of iterations is the same furthermore we must not store the prime numbers in a second vector but using p=RW[i]+bW*j.
User140242 is offline   Reply With Quote
Old 2022-07-27, 12:28   #6
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7×983 Posts
Default

The small-primes array is reusable on different bit levels or Mersenne numbers, since its size therefore content depend only on maktx.ini tuning parameters.
The factor candidates must be generated for each bit level & Mersenne number combination.
(edit time expired on post 4)

Last fiddled with by kriesel on 2022-07-27 at 12:38
kriesel is offline   Reply With Quote
Old 2022-07-27, 12:43   #7
User140242
 
Jul 2022

22·13 Posts
Default

Quote:
Originally Posted by kriesel View Post
depend only on maktx.ini tuning parameters.


I tried to see how maktx works but it's the code is too long and I can't understand everything.


But as I understand it is used primes[SIEVE_PRIMES_MAX] and SIEVE_PRIMES_MAX is the number of primes that are used. But with the post 3 with sieve base 6 it is possible to generate primes equal to 3*SIEVE_PRIMES_MAX using the same memory and the same complexity. This reduces the factors to be verified and therefore increases the speed.


I hadn't read this


Quote:
Originally Posted by kriesel View Post
It produces more performance in mfaktx to allow a relatively few composite candidates to slip through to be tried as potential factors, than to completely sieve all ~17.8E12 candidates.

Last fiddled with by User140242 on 2022-07-27 at 13:09
User140242 is offline   Reply With Quote
Old 2022-07-27, 13:34   #8
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·983 Posts
Default

See also Concepts in GIMPS trial factoring (TF) https://www.mersenneforum.org/showpo...23&postcount=6
kriesel is offline   Reply With Quote
Old 2022-08-10, 09:41   #9
User140242
 
Jul 2022

22×13 Posts
Default

I propose this thread again, I have implemented this version of the wheel sieve with adjustable base:


https://gist.github.com/user140242/e...8e5f9bafdd20f7


I do not have the opportunity to test it to see if it is performing like the codes reported in this thread.



Can anyone tell me if it is a good idea and if it is possible to make it more performing?

MODERATOR NOTE: This post was found starting a new thread elsewhere. I moved it here.

It is NOT a good idea to start a new thread on an existing topic, especially if it's an earlier thread you started. Don't do that again.

Last fiddled with by Dr Sardonicus on 2022-08-10 at 11:48
User140242 is offline   Reply With Quote
Old 2022-08-10, 11:56   #10
User140242
 
Jul 2022

22×13 Posts
Default

Quote:

MODERATOR NOTE: This post was found starting a new thread elsewhere. I moved it here.

It is NOT a good idea to start a new thread on an existing topic, especially if it's an earlier thread you started. Don't do that again.

I inserted the post in this thread:

https://www.mersenneforum.org/showthread.php?t=22961

because I thought it was more suitable since I needed help to make a comparison with the codes reported in the thread.
User140242 is offline   Reply With Quote
Old 2022-08-10, 13:04   #11
Dr Sardonicus
 
Dr Sardonicus's Avatar
 
Feb 2017
Nowhere

135578 Posts
Default

Quote:
Originally Posted by User140242 View Post
I inserted the post in this thread:

https://www.mersenneforum.org/showthread.php?t=22961

because I thought it was more suitable since I needed help to make a comparison with the codes reported in the thread.
My mistake. I apologize.

It just happened that your post started a new page, I read "I propose this thread again," and thought it was a new thread.

That'll teach me to do moderator things before I'm fully awake...
Dr Sardonicus is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
PunchWheel: A (new?) wheel / segmented sieve idea (w/ code) hansl Computer Science & Computational Number Theory 6 2019-10-16 20:36
Wheel Factorization a1call Factoring 11 2017-06-19 14:04
Advantage of lattice sieve over line sieve binu Factoring 3 2013-04-13 16:32
Wheel factorization: Efficient? 3.14159 Miscellaneous Math 520 2010-08-30 07:21
A Wheel storm5510 Puzzles 7 2010-06-25 10:29

All times are UTC. The time now is 13:31.


Fri Oct 7 13:31:17 UTC 2022 up 50 days, 10:59, 1 user, load averages: 1.07, 1.15, 1.21

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔