mersenneforum.org Sieving polynomials in the Quadratic Sieve

 2008-12-23, 13:38 #1 ThiloHarich     Nov 2005 101 Posts

Sieving polynomials in the Quadratic Sieve

Hi, I have a dumb question. In MPQS we determine the sieve interval M of a function q(x) = (ax + b)^2 - N such that |q(x)| <= N for -M <= x <= M. Why do we do this? In fact we consider q'(x) = (ax + b)^2 mod N, which means that the value of q'(x) is bounded by N anyway. Can't we just use one polynomial with an unlimited range M?

Another dumb question: doesn't it make sense to look at values q(x) = x^2 mod N where x is near sqrt(l*N)? Can't we just look at values q(x) = ceil(sqrt(l*N))^2 mod N? This ensures that the generated values |q(x)| are small:

|q(x)| <= (sqrt(l*N) + 1)^2 mod N = (l*N + 2*sqrt(l*N) + 1) mod N = 2*sqrt(l*N) + 1 = O(sqrt(N))

So we have a much better chance to find smooth values. Of course we cannot sieve here, but we can use the batch gcd method of Bernstein.
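[Editor's note: the size bound claimed above can be checked numerically. A minimal sketch; N is an arbitrary demonstration value, not anything from the thread.]

```python
from math import isqrt

# Numerical check of the bound above: for x = ceil(sqrt(l*N)),
# x^2 mod N is at most 2*sqrt(l*N) + 1, i.e. O(sqrt(N)) for small l.
# N is an arbitrary demonstration value (assumption).
N = 10**20 + 39

for l in range(1, 6):
    x = isqrt(l * N) + 1          # ceil(sqrt(l*N)) when l*N is not a square
    q = x * x % N                 # equals x^2 - l*N while x^2 < (l+1)*N
    assert q <= 2 * isqrt(l * N) + 3
    print(l, q.bit_length(), N.bit_length())
```

The printed bit lengths show q at roughly half the size of N, which is the whole attraction of the proposal.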
 2008-12-23, 16:34 #2 jasonp Tribal Bullet     Oct 2004 3537_10 Posts

MPQS needs a method to determine when it is better to switch to another polynomial because the sieve values of the current polynomial are getting too large, and hence you are not finding relations fast enough because they are too unlikely. You can sieve each polynomial as much as you want, but eventually you will sieve 100x as long to find 2x as many relations. You'd find more relations in the same time if you switched to another polynomial before that point.

Also, knowing in advance how far you want to sieve lets you choose A*X+B so that the sieve forms a parabola whose average value is zero (i.e. has approximately equal numbers of positive and negative values), so that the worst size of polynomial value is two or three bits smaller than you'd otherwise have to deal with.

Your second question is interesting, but I suspect you'll find that you have to perform a lot of computation to sample a large enough arithmetic progression of polynomial values to compare favorably with the number of samples you can get in a modest size sieve.

Last fiddled with by jasonp on 2008-12-23 at 16:36
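[Editor's note: jasonp's point about balancing the parabola can be illustrated numerically. In the sketch below N, M and the stand-in b are assumptions for demonstration; a real implementation takes b with b^2 = N (mod a).]

```python
from math import isqrt

# With the textbook MPQS choice a ~ sqrt(2N)/M, the parabola
# q(x) = (a*x + b)^2 - N crosses zero inside [-M, M], and its worst
# positive and worst negative values are both about N in magnitude,
# rather than one sign dwarfing the other.
N = 10**30 + 57   # demonstration value (assumption)
M = 10**5
a = isqrt(2 * N) // M
b = a // 5        # size/sign stand-in; a real b satisfies b^2 = N (mod a)

vals = [(a * x + b) ** 2 - N for x in range(-M, M + 1, 100)]
worst_pos = max(v for v in vals if v > 0)
worst_neg = min(v for v in vals if v < 0)
print(worst_pos.bit_length(), (-worst_neg).bit_length())  # both close to N.bit_length()
```

With a chosen this way the extreme values on both sides of zero are comparable, which is exactly the balancing jasonp describes.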
2008-12-23, 16:47   #3
bsquared

"Ben"
Feb 2007

2^2×23×37 Posts

Forgive me for being so slow, but what does the l refer to in your notation? For instance here:

Quote:
 sqrt (l * N)

 2008-12-23, 18:50 #4 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 13·491 Posts I think his idea is, instead of looking at values of (ax+b)^2 for various carefully chosen a and b, to look at values of ceil(sqrt(xN))^2 for lots of x. When I try this, the values seem to have much worse smoothness properties than the quadratic sieve; I suspect it's a reinvention of the continued-fraction method without the advantages of using continued-fraction approximants. My vague belief is that taking numbers that get bigger than N and reducing them mod N leaves them of size very close to N (expectation one bit shorter) while destroying any arithmetic properties that they have.
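[Editor's note: fivemack's observation can be probed empirically. The toy harness below (arbitrary small N and factor base, both assumptions) counts fully smooth values for each construction; at this tiny scale the counts need not reflect the asymptotic behaviour he describes, but it shows how such a comparison can be run.]

```python
from math import isqrt

def smooth_part(v, primes):
    """Divide out all factor-base primes; returns 1 iff v is smooth."""
    for p in primes:
        while v % p == 0:
            v //= p
    return v

# Arbitrary demonstration modulus and factor base (assumptions).
N = 2**64 + 13
primes = [p for p in range(2, 1000)
          if all(p % q for q in range(2, isqrt(p) + 1))]

# Ordinary QS values q(x) = (floor(sqrt(N)) + x)^2 - N ...
qs_hits = sum(smooth_part((isqrt(N) + x) ** 2 - N, primes) == 1
              for x in range(1, 2001))
# ... versus ceil(sqrt(x*N))^2 mod N as proposed above.
mult_hits = sum(smooth_part((isqrt(x * N) + 1) ** 2 % N, primes) == 1
                for x in range(2, 2002))
print(qs_hits, mult_hits)
```

Running this at realistic sizes (and with the large-prime variations both methods would use in practice) is what a fair comparison would require.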
 2008-12-23, 21:13 #5 ThiloHarich     Nov 2005 145_8 Posts

@bsquared: I mean the square root of an (integer) multiple of N (maybe I should have used 'i' instead of 'l'). In Fermat's method we choose x to be near the square root of N, so the result is near N, and hence small mod N. Since we always compute mod N, we can also look for values which will be near 2*N, 3*N, .... So the preimage is something like the rounded square root of i*N, i \in N. For x = round(sqrt(i*N)) we look at q(x) = x^2 mod N.

@jasonp: the (intermediate) values q(x) will never exceed N^2 since we always calculate mod N. Of course the x might get bigger. But we start off with x ~ sqrt(N), and the sieve interval is far below sqrt(N) as well. So we end up far below N as well.

@fivemack: unfortunately I have only a rough idea of CFRAC. Does anybody have a good, short introduction to it?

Last fiddled with by ThiloHarich on 2008-12-23 at 21:20
2008-12-23, 21:38   #6
jasonp
Tribal Bullet

Oct 2004

3^3×131 Posts

Quote:
 Originally Posted by ThiloHarich @jasonp: the (intermediate) values q(x) will never exceed N^2 since we always calculate mod N. Of course the x might get bigger. But we start off with x ~ sqrt(N), and the sieve interval is far below sqrt(N) as well. So we end up far below N as well.
I wasn't clear about what I meant. If the sieve interval has a million sieve values, you only have to perform a few thousand single-precision operations to determine most of the smoothness properties of all of those sieve values. Your method would require, at the very least, computing the product of all one million multiple-precision sieve values before using Bernstein's remainder tree. Just that would be more work than the sieving. Sieving does work proportional to the factor base size with a very small proportionality constant, factoring the sieve values directly does work proportional to (at best) the number of sieve values.
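[Editor's note: for readers who have not seen it, the batch test jasonp refers to can be sketched as follows. This is a simplified illustration of Bernstein's product/remainder-tree smoothness test, with names of my choosing.]

```python
from math import prod

def remainder_tree(P, vals):
    """Compute P mod v for every v in vals via a product/remainder tree,
    which is how the batch test avoids one full-size division per value."""
    tree = [list(vals)]
    while len(tree[-1]) > 1:                 # build the product tree
        layer = tree[-1]
        tree.append([layer[i] * layer[i + 1] if i + 1 < len(layer) else layer[i]
                     for i in range(0, len(layer), 2)])
    rems = [P % tree[-1][0]]                 # then reduce top-down
    for layer in reversed(tree[:-1]):
        rems = [rems[i // 2] % v for i, v in enumerate(layer)]
    return rems

def is_smooth(v, r):
    """v is smooth over the primes of P iff (P mod v)^(2^k) = 0 (mod v)
    once 2^k exceeds every prime exponent in v (all bounded by log2 v)."""
    for _ in range(v.bit_length().bit_length()):
        r = r * r % v
    return r == 0

factor_base = [2, 3, 5, 7]
P = prod(factor_base)
vals = [12, 11, 22, 30]
rems = remainder_tree(P, vals)
print([v for v, r in zip(vals, rems) if is_smooth(v, r)])  # [12, 30]
```

In the real algorithm P is the product of the whole factor base and vals holds the multiple-precision sieve reports; jasonp's point is that forming those products already costs more than sieving does.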

You can use both a sieve and a remainder tree on the survivors, and I suspect this would be faster than sieving alone when you allow three or more large primes in relations.

PS: Regarding your original question, it is a misnomer to say that MPQS has "multiple polynomials". Any relation found by MPQS would be found with ordinary QS using the single ordinary QS polynomial. What MPQS really does is sample the ordinary QS polynomial at very widely separated values of x, where the sampling is structured so that the values examined are known to be divisible by some big numbers. This is just like using a lattice sieve for NFS problems, except the lattice is 1-dimensional instead of 2-dimensional.
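[Editor's note: the "known big divisor" in this description can be made concrete with a toy example. The prime search and all numbers below are my assumptions, not code from any MPQS implementation.]

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

# Toy illustration: pick a prime a = 3 (mod 4) with N a quadratic
# residue mod a; then b = N^((a+1)/4) mod a satisfies b^2 = N (mod a),
# and every sampled value (a*x + b)^2 - N is divisible by a.
N = 2021027          # demonstration value: 1009 * 2003 (assumption)
a = 103
while not (a % 4 == 3 and is_prime(a) and pow(N, (a - 1) // 2, a) == 1):
    a += 2

b = pow(N, (a + 1) // 4, a)
assert (b * b - N) % a == 0
for x in range(-10, 11):
    assert ((a * x + b) ** 2 - N) % a == 0   # the known divisor a
print(a, b)
```

The x values a*x + b are exactly jasonp's "widely separated" samples of the ordinary QS polynomial: a 1-dimensional lattice on which divisibility by a is guaranteed in advance.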

Last fiddled with by jasonp on 2008-12-23 at 21:49

 2008-12-24, 00:35 #7 ThiloHarich     Nov 2005 101 Posts

@jasonp: on my second question you might be right. There might be a lot of computation to get the corresponding values. Maybe CFRAC is just an approximation/calculation of the sqrt(i*N) values. But I think we can reduce the effort. If we have a sieve interval M, we can calculate sqrt(N) to log(M) digits after the decimal point. Since M is much smaller than sqrt(N) this can be done easily, and it only has to be done once. Then we just have the multiplication (of this value) with sqrt(i). If we only use i = j^2, this just means we have to calculate j * sqrt(N). If we calculate sqrt(N) * 2^k for some k, we can compute the values easily with the Gray-code style calculation we used for computing the b's in SIQS.

On my first question I'm not really satisfied with your answer. I do not see that we get many fewer relations when x is big, since we always calculate q(x) mod N. So the values should be N/2 on average. We lose just a very small linear factor, which should be smaller than 2 (if we calculate the average size of the resulting values of the original parabola). Why can't we use just one polynomial?
 2008-12-24, 00:50 #8 jasonp Tribal Bullet     Oct 2004 3^3×131 Posts First google hit for CFRAC. I'll have to think more about your method, but I still think Tom is right. Bernstein has proposed using a remainder tree for CFRAC too.
2008-12-24, 03:50   #9
bsquared

"Ben"
Feb 2007

2^2·23·37 Posts

Quote:
 Originally Posted by ThiloHarich On my first question I'm not really satisfied with your answer. I do not see that we get many fewer relations when x is big, since we always calculate q(x) mod N. So the values should be N/2 on average. We lose just a very small linear factor, which should be smaller than 2 (if we calculate the average size of the resulting values of the original parabola). Why can't we use just one polynomial?
We can choose the polynomial such that the "null" of the parabola always falls in the sieve interval, to either side of our starting x. Attached is a graph of the sieve locations x where we find a relation, along with the size in bits of the q(x) from which it came. This was generated from 1 polynomial having ~ 78 million total sieve locations. The average bit size of q(x) is ~ 140 (the input N is 232 bits for this example).

If instead, we use 100 polynomials each with 1/100th sized sieve interval, we again sieve over 78 million sieve locations but now the average bit value of q(x) is ~ 132. In practice I find about twice the number of relations in less time than it takes to sieve one giant sieve interval. The lower average size of q(x) occurs because the parabola is not nearly so "spread out" in order to force its null into a huge sieve interval. This will be illustrated in the next post, because I can only attach one thing to a post...
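[Editor's note: bsquared's measurement can be mimicked with a size-only model. Everything below (the stand-in N, the size-only b, the textbook choice a ~ sqrt(2N)/M) is an assumption for illustration, not his code.]

```python
from math import isqrt

# Size-only model of the experiment above: with a ~ sqrt(2N)/M the sieve
# reports ((a*x + b)^2 - N)/a stay below ~ M*sqrt(N/2), so cutting M by
# 100 (as with 100 polynomials) should cut the average report size by
# about log2(100) ~ 6.6 bits.
N = 2**231 + 9                    # 232-bit stand-in for the input (assumption)

def avg_bits(M, samples=2000):
    a = isqrt(2 * N) // M
    b = a // 3                    # size stand-in; a real b has b^2 = N (mod a)
    step = 2 * M // samples
    # floor division by a only models the sizes; real reports divide exactly
    vals = [abs(((a * x + b) ** 2 - N) // a) for x in range(-M, M, step)]
    return sum(v.bit_length() for v in vals) / len(vals)

wide = avg_bits(39 * 10**6)       # one interval with ~78M locations
narrow = avg_bits(39 * 10**4)     # 1/100th the width, as with 100 polynomials
print(round(wide), round(narrow))
```

The model reproduces the roughly 7-bit gap between the ~140-bit and ~132-bit averages reported in these two posts.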

- ben.
Attached Files
 1_poly_78M_loc.pdf (17.8 KB, 140 views)

Last fiddled with by bsquared on 2008-12-24 at 03:53 Reason: size of N

2008-12-24, 03:52   #10
bsquared

"Ben"
Feb 2007

3404_10 Posts

And the 100 poly case...

So there is no reason why we can't sieve over 1 polynomial. In either case the size of the q(x) is much less than N (slightly bigger than sqrt(N)). The q(x) are just a bit smaller for smaller sieve intervals, and in practice it's faster.
Attached Files
 100_poly_78M_loc.pdf (30.6 KB, 148 views)

 2008-12-24, 20:51 #11 ThiloHarich     Nov 2005 101 Posts

@bsquared: I think you argue from the fact that, having chosen a search interval M, we determine the factor a (of the resulting residues) to be something like sqrt(N)/M for one polynomial. If we use more polynomials we can reduce M (simply split it over more polynomials), which means we can increase a, as long as we only search around the roots of the polynomial.

I want to choose a very big, independent of M. If we choose a to be something like N/M we might still get M different values for this polynomial. And we have a known (chosen) big factor a ~ N/M in the residues, so an unknown factor of at most M remains. Since M is subpolynomial in the running time of the algorithm, this might be easy to determine. The only issue I see is setting up the a^-1 mod p for the factor base to do the sieving. But we can use a product over the factor base like we did in SIQS. Setting up one polynomial should be easier than setting up multiple ones. So where is my mistake?

Concerning CFRAC: I did not find a concrete description of the algorithm. But from what I guess of the theory behind it, it tries to approximate sqrt(N), which results in values near some i * sqrt(N) as well. But I think it will not hit every i. This is my dumb interpretation of it.

The algorithm I suggested might be done like this: calculate s = sqrt(N) in the normal way up to a reasonable precision, then calculate q_0 = s, q_i = q_{i-1} + s. This is just a normal addition. Then we round it and find the remainder over the factor base. As an improvement, for i = 2*j we can compute q_{2j} = 2 * q_j, so this will not cost much compared with determining the factors of these values.
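[Editor's note: the addition scheme in the last paragraph can be sketched in integer fixed-point arithmetic. A toy illustration, with my own choice of guard bits k and a demonstration N.]

```python
from math import isqrt

# Fixed-point sketch of the scheme above: compute sqrt(N) once with k
# fractional bits, then reach round(j*sqrt(N)) by repeated addition.
# Candidates x_j = round(j*sqrt(N)) give x_j^2 = j^2*N + O(j*sqrt(N)),
# so x_j^2 mod N (taken as a signed remainder) stays far below N.
N = 10**24 + 7            # demonstration modulus (assumption)
k = 32                    # guard bits; rounding error stays < 1 while j << 2^k
s = isqrt(N << (2 * k))   # floor(sqrt(N) * 2^k)

q = 0
for j in range(1, 101):
    q += s                              # q = j * s, one addition per step
    x = (q + (1 << (k - 1))) >> k       # x = round(j * sqrt(N))
    v = x * x % N
    v = v if v <= N // 2 else v - N     # signed remainder
    assert abs(v) <= 3 * j * isqrt(N)   # O(j * sqrt(N)), far below N
```

This covers the i = j^2 multiples mentioned above (since sqrt(j^2 * N) = j * sqrt(N)); the remaining cost is factoring each |v| over the factor base, which is where the batch gcd method would come in.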

