mersenneforum.org > Math Mersenne trial division implementation
 Register FAQ Search Today's Posts Mark Forums Read

 2017-04-18, 02:03 #1 mathPuzzles   Mar 2017 368 Posts Mersenne trial division implementation I was reading the great wiki on the stages of Mersenne testing. The initial filtering step is trial division. After some sieving and modular rejection, you still end up with various primes which are worthwhile to explicitly test as factors. And the wiki says "The remaining potential factors are tested using the [traditional generic] modular powering algorithm above." Is this correct? Is the best Mersenne hunting software literally using the classic power-mod evaluation, or is are there any enhancements since you know you're taking the modulus of a number with the very simple form of $$2^n$$? Put another way, Is there a especially good way to efficiently evaluate $$2^n \mod m$$ as opposed to a generic $$a^n \mod m$$? I can imagine that you could speed up the square-and-multiply loop with an especially efficient Brauer $$2^k$$-ary ladder since you can build any $$2^b$$ for $$b < \log_2 m$$ for free by simply left shifting a single bit by $$b$$. That should save about 1/3 of the modular multiplies, but you still need to do the $$\log_2 n$$ squarings. If $$n$$ is large enough it would eventually be useful to use Montgomery multiplication instead, but you'd lose the elegant "shift a bit" ladder and again it'd be no different for $$2^n$$ than a generic $$a^n$$. Are there any other speedup tricks or is it really like the wiki says and it's still really just the classic square-and-multiply modular powering algorithm?
2017-04-18, 11:23   #2
science_man_88

"Forget I exist"
Jul 2009
Dumbassville

836910 Posts

Quote:
 Originally Posted by mathPuzzles I was reading the great wiki on the stages of Mersenne testing. The initial filtering step is trial division. After some sieving and modular rejection, you still end up with various primes which are worthwhile to explicitly test as factors. And the wiki says "The remaining potential factors are tested using the [traditional generic] modular powering algorithm above." Is this correct? Is the best Mersenne hunting software literally using the classic power-mod evaluation, or is are there any enhancements since you know you're taking the modulus of a number with the very simple form of $$2^n$$? Put another way, Is there a especially good way to efficiently evaluate $$2^n \mod m$$ as opposed to a generic $$a^n \mod m$$? I can imagine that you could speed up the square-and-multiply loop with an especially efficient Brauer $$2^k$$-ary ladder since you can build any $$2^b$$ for $$b < \log_2 m$$ for free by simply left shifting a single bit by $$b$$. That should save about 1/3 of the modular multiplies, but you still need to do the $$\log_2 n$$ squarings. If $$n$$ is large enough it would eventually be useful to use Montgomery multiplication instead, but you'd lose the elegant "shift a bit" ladder and again it'd be no different for $$2^n$$ than a generic $$a^n$$. Are there any other speedup tricks or is it really like the wiki says and it's still really just the classic square-and-multiply modular powering algorithm?
they could use factoring of p-1 and then use the factors of p-1 to calculate it for 2^(p-1) and then for 2^p.

http://primes.utm.edu/glossary/xpage...entiation.html also suggest they could go left to right binary exponentiation but that's a time versus space tradeoff.

Last fiddled with by science_man_88 on 2017-04-18 at 11:27

 2017-04-18, 19:48 #3 mathPuzzles   Mar 2017 2×3×5 Posts An extremely similar question comes up when doing the precompute necessary for Montgomery multiplication of any number, since you almost always use a computational base like $$R=2^{64i}$$ for some convienient $$i$$. The one-time precompute needs to find the modular inverse of R in the base in question, usually using the extended Euclidian algorithm. So, is there any special method or optimization to computing $$(2^n)^{-1}\mod m?$$ Last fiddled with by mathPuzzles on 2017-04-18 at 19:49
2017-04-18, 19:53   #4
mathPuzzles

Mar 2017

2×3×5 Posts

Quote:
 Originally Posted by science_man_88;456967[url http://primes.utm.edu/glossary/xpage/BinaryExponentiation.html[/url] also suggest they could go left to right binary exponentiation but that's a time versus space tradeoff.
Yep, to get the full advantage of the (free) ladder precompute, you'd work left to right in chunks of $$b$$ bits, since those are the sizes where you can get the (free) evaluation by shifting. And then of course the actual multiplication of that ladder is really just shifting the second multiplicand, so it's not even a mulmod, it's just a mod. The bulk of the evaluation cost will be all the squarings, which as far as I can see are unavoidable.

2017-04-18, 21:36   #5
ewmayer
2ω=0

Sep 2002
República de California

32·1,093 Posts

Quote:
 Originally Posted by mathPuzzles I was reading the great wiki on the stages of Mersenne testing. The initial filtering step is trial division. After some sieving and modular rejection, you still end up with various primes
No - lots of composites, too.

Quote:
 Originally Posted by mathPuzzles An extremely similar question comes up when doing the precompute necessary for Montgomery multiplication of any number, since you almost always use a computational base like $$R=2^{64i}$$ for some convienient $$i$$. The one-time precompute needs to find the modular inverse of R in the base in question, usually using the extended Euclidian algorithm.
No eGCD needed.

2017-04-19, 00:47   #6
mathPuzzles

Mar 2017

3010 Posts

Quote:
 Originally Posted by ewmayer No eGCD needed.
That's very interesting and exactly what I'm trying to learn! So what is this method to compute $$(2^n)^{-1}\mod m$$ without eGCD? Perhaps some variant in the flavor of the binary GCD?

2017-04-19, 02:57   #7
ewmayer
2ω=0

Sep 2002
República de California

32×1,093 Posts

Quote:
 Originally Posted by mathPuzzles That's very interesting and exactly what I'm trying to learn! So what is this method to compute $$(2^n)^{-1}\mod m$$ without eGCD? Perhaps some variant in the flavor of the binary GCD?
The mod-inverse needed by the Montgomery mod is not the one you cite, but rather an inverse of m (mod R), where R is the arithmetic radix convenient for hardware arithmetic, e.g. 2^64 (or some suitable power thereof which is > m) on a 64-bit CPU. To get that mod-inverse one needs just the fast Newtonian-iterative mod-inverse described in Montgomery's paper. (See also here for some improved-initial-guess tweaks.) The ensuing binary modpow then computes 2^n (mod m) times some power of R which scaling can be 'unwound' in log log n time if the true mod is needed - which it isn't for e.g. Mersenne-mod TF work.

 2017-04-21, 03:47 #8 mathPuzzles   Mar 2017 2×3×5 Posts ewmayer, you're right, I swapped the R and modulus value in my question. But your answer is also part of what I'm trying to learn! Now I have to study your paper. It's looks like part of the solution to my efficiency puzzle. I appreciate it!
2017-04-21, 07:21   #9
ewmayer
2ω=0

Sep 2002
República de California

32·1,093 Posts

Quote:
 Originally Posted by mathPuzzles ewmayer, you're right, I swapped the R and modulus value in my question. But your answer is also part of what I'm trying to learn! Now I have to study your paper. It's looks like part of the solution to my efficiency puzzle. I appreciate it!
Algorithm E in the paper has an efficient modpow based on Montgomery-mul, but I encourage you to try to slog though the preceding material, as well. :) I included several worked 64-bit-modulus examples; hopefully you will find those useful.

 Similar Threads Thread Thread Starter Forum Replies Last Post yih117 Math 5 2018-02-02 02:49 SPWorley Math 8 2009-08-24 23:26 SPWorley Math 5 2009-08-22 14:36 SPWorley Factoring 7 2009-08-16 00:23 ewmayer Factoring 7 2008-12-11 22:12

All times are UTC. The time now is 23:53.

Sun Nov 29 23:53:10 UTC 2020 up 80 days, 21:04, 3 users, load averages: 1.50, 1.36, 1.35