mersenneforum.org Sieving freakishly big MMs (was "World record" phone number?)
 Register FAQ Search Today's Posts Mark Forums Read

2012-10-04, 19:27   #89
philmoore

"Phil"
Sep 2002
Tracktown, U.S.A.

3·373 Posts

Quote:
 Originally Posted by ET_ How many k did you sieve up to 2^31, per exponent? Luigi
I only sieved the k less than 256 as I recall. You can notice that the k values I subsequently subjected to a prp test were fairly small.

2012-10-04, 19:53   #90
philmoore

"Phil"
Sep 2002
Tracktown, U.S.A.

3·373 Posts

Quote:
 Originally Posted by ewmayer You got the gist of it right, but the work estimate is an order of magnitude too optimistic, because there is no fast DWT-method for multiply modulo such candidate factors, as there is for the multiply modulo Mp used in LL testing, and p-1/ecm done on Mp. As a result one must do a generalized modmul. In my code - which does not yet have the FFT-enabled versions of all these needed to do this efficiently on the really big M(Mp) - I use a Montgomery-mul-based modpow algorithm, which needs not 1 bignum MUL (as for LL) but 3 MUL operations, as detailed on p.3 of the draft manuscript I posted in this Math forum thread. The only part of that MUL sequence which permits optimization beyond general FFT-based MUL is the final UMULH_b(q, lo), which can take advantage of the special form of q for double-Mersennes to achieve the result with just some shift/add/scalar-mul work. So the best one could hope to achieve is a per-powering-bit time perhaps 3-5x longer than the per-iteration time for LL run on the same Mp.
Suppose we wanted to do a PRP test on 2*185*(2^43112609-1)+1. If I write this as 185*2^43112610-369, Prime95 uses (on my computer) a "Core 2 type-3 FFT length 4608K" as opposed to a 2240K FFT on M43112609, so I would expect that, using George's routines, it would take about twice as long to test this number as to do a simple LL test on M43112609. I originally tested the numbers listed on Will Edgington's page with pfgw and a SCRIPT file that:
a) tested to see if the given number was a factor of the iterated Mersenne number, which would have automatically made the factor a prime.
b) finished a primality test on this number if it was not a factor, since it still could possibly have been prime without being a factor. (The primality test is described in Hardy and Wright's Number Theory book.)
However, it probably would be sufficient to simply do a prp test, since if the number fails it, as is most likely, it automatically fails both a) and b). In the extraordinary case that it passes, further testing would be required to determine whether a) or b) is the case.

2012-10-04, 21:10   #91
ewmayer
2ω=0

Sep 2002
República de California

1169010 Posts

Quote:
 Originally Posted by philmoore Suppose we wanted to do a PRP test on 2*185*(2^43112609-1)+1. If I write this as 185*2^43112610-369, Prime95 uses (on my computer) a "Core 2 type-3 FFT length 4608K" as opposed to a 2240K FFT on M43112609, so I would expect that, using George's routines, it would take about twice as long to test this number as to do a simple LL test on M43112609.
That assumes that only one such double-width FFT-mul is being done per powering bit processed - Unless there is a really sweet DWT-based implicit-mod method for such moduli (in which case I would think that one would not need a full double-wide product to begin with), it would require several such MULs to process each bit, irrespective of which general modmul (e.g. Barrett vs Montgomery) is being used. Did you compare the actual per-iteration timings with an LL test at the same 4608K FFT length?

2012-10-05, 04:46   #92
ATH
Einyen

Dec 2003
Denmark

CB716 Posts

Quote:
 Originally Posted by ewmayer Did you compare the actual per-iteration timings with an LL test at the same 4608K FFT length?
LL test 43112609 (FFT 2240K): 8 ms/iteration
PRP=185,2,43112610,-369 (FFT 4608K): 18.5 ms/iteration
LL test 87M (FFT 4608K): 15 ms/iteration

2012-10-05, 05:59   #93
aketilander

"Åke Tilander"
Apr 2011
Sandviken, Sweden

23616 Posts

Quote:
 Originally Posted by LaurV And that factor q will be PRIME. And it will be for the first time in a VERY LONG history when the LARGEST KNOWN PRIME is not a mersenne number.
And if we are looking for the largest prime, Wouldn't it be most time saving to look for (prime) factors 2*k*[M#47+n]+1 of M[M#47+n] only for k=1 and succesive n:s (2, 4, 6, 8 ...) after sieving to omit those who obviously are composit/have small factors?

Last fiddled with by aketilander on 2012-10-05 at 06:10

2012-10-05, 07:57   #94
ET_
Banned

"Luigi"
Aug 2002
Team Italia

2·41·59 Posts

Quote:
 Originally Posted by ATH LL test 43112609 (FFT 2240K): 8 ms/iteration PRP=185,2,43112610,-369 (FFT 4608K): 18.5 ms/iteration LL test 87M (FFT 4608K): 15 ms/iteration
Only 1 to 2 weeks... Or less if you use CUDALucas.

Luigi

2012-10-05, 18:15   #95
ewmayer
2ω=0

Sep 2002
República de California

2DAA16 Posts

Quote:
 Originally Posted by ATH LL test 43112609 (FFT 2240K): 8 ms/iteration PRP=185,2,43112610,-369 (FFT 4608K): 18.5 ms/iteration LL test 87M (FFT 4608K): 15 ms/iteration
Interesting - It seems as if it is doing an implicit-DWT-mod, but one which allows only half as many input bits per word as does the Mersenne-mod DWT. Now that I see this, I seem to recall exchanging e-mail with George a few years back about generalized DWTs, where he mentioned something along these lines. I shall dig into my old PMs and see if I can find it.

In any event, the ration for the above is ~2.3x the cost of an LL-style modmul for a similar-sized modulus, which is slightly better than I hoped for.

2012-10-05, 20:07   #96
philmoore

"Phil"
Sep 2002
Tracktown, U.S.A.

3×373 Posts

Quote:
 Originally Posted by ewmayer Interesting - It seems as if it is doing an implicit-DWT-mod, but one which allows only half as many input bits per word as does the Mersenne-mod DWT. Now that I see this, I seem to recall exchanging e-mail with George a few years back about generalized DWTs, where he mentioned something along these lines. I shall dig into my old PMs and see if I can find it. In any event, the ration for the above is ~2.3x the cost of an LL-style modmul for a similar-sized modulus, which is slightly better than I hoped for.
George described his implementation in section 4 of our Sierpinski paper here:
http://www.integers-ejcnt.org/vol8.html
(Scroll down to paper A61.)

2012-10-08, 00:23   #97
ewmayer
2ω=0

Sep 2002
República de California

2·5·7·167 Posts

Quote:
 Originally Posted by philmoore George described his implementation in section 4 of our Sierpinski paper here: http://www.integers-ejcnt.org/vol8.html (Scroll down to paper A61.)
Thanks - that is interesting. But a couple of DWT-related points are not clear:

1. "Mersenne-like DWT on 2^{n+log k} ± c" - That should read log2(k), shouldn't it? That exponent is no longer an integer, so how does one compute the basic DWT params, e.g. (for FFT length = N) #bigwords = exp%N ? Does one simply round log2(k) up or down?

2. Why does the resulting DWT need 2x the runlength of a Mersenne-mod DWT on a similar-sized input? The paper mentions that one of the advantages of the approach described vs Percival's is more allowable bits-per-input, so the 2x runlength does not jibe with that. Nor does the paper mention a full double-wide product (via zero-padding the inputs) being needed.

(I PMed George, hopefully he will be kind enough to help in providing clarification).

 2012-10-08, 02:59 #98 c10ck3r     Aug 2010 Kansas 547 Posts 185*2^43112610-369 completed P-1, B1=100000, B2=2000000, We1: 5B4A8858 201*2^43112610-401 completed P-1, B1=100000, B2=2000000, We1: 5B4A8858
2012-10-08, 03:35   #99
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

52·311 Posts

Quote:
 Originally Posted by ewmayer 1. "Mersenne-like DWT on 2^{n+log k} ± c" - That should read log2(k), shouldn't it? That exponent is no longer an integer, so how does one compute the basic DWT params, e.g. (for FFT length = N) #bigwords = exp%N ? Does one simply round log2(k) up or down?
Number of big words is (n + log2 k)%N bits. Every FFT word has an integral number of bits except the most significant FFT word. This doesn't cause an increase in the FFT size, but during carry propagation FFT words must be multiplied by k before rounding. This gives us log2 k bit fewer bits of precision in the output. Also, wrapping the carry to the least significant FFT word is tricky but do-able.

The +/- c values causes us to use weights between 1 and log2 (abs (c)). This costs us precision on the FFT input words.

Quote:
 2. Why does the resulting DWT need 2x the runlength of a Mersenne-mod DWT on a similar-sized input? Nor does the paper mention a full double-wide product (via zero-padding the inputs) being needed.
Zero-padding is not required. Really small k values often end up using the same FFT size as Mersenne LL tests.

 Similar Threads Thread Thread Starter Forum Replies Last Post CRGreathouse Number Theory Discussion Group 51 2018-12-16 21:55 LaurV Hobbies 74 2018-07-11 19:33 Batalov Computer Science & Computational Number Theory 40 2013-03-16 09:19 outlnder Soap Box 20 2005-02-03 09:30 nitai1999 Software 7 2004-08-26 18:12

All times are UTC. The time now is 22:48.

Mon Jan 24 22:48:08 UTC 2022 up 185 days, 17:17, 1 user, load averages: 1.04, 1.20, 1.40