Thread: P95 trial division strategy View Single Post
2009-08-24, 23:22   #8
SPWorley

Jul 2009

378 Posts

Quote:
 Originally Posted by Prime95 There are actually many different algorithms in factor32.asm - each written for a different target CPU. Also, it has been a long time since I last looked at the code, so I've forgotten quite a lot. To compute x mod y, you take the approximate reciprocal accurate to 53-bits and compute (top bits of x) * (approx. recip) to get 32 bits of the quotient (almost always accurate, Knuth guarantees it is accurate to within 2). Multiply q by y, subtract from x, repeat til done. Also, note that you can compute the reciprocal quickly without division. Just take the last reciprocal you computed (it will be real close) and use Newton's method to compute the new reciprocal.
Thanks much for the authoritative answer!
I did notice the (many!) multiple methods defined, especially for the low bit ranges of < 2^64. I guess in the days of 486 processors, it was a struggle even to test up that high and having custom code for different low bit ranges was useful.

I may try a couple old-school divide approaches as well... we'll see how they compare to Montgomery.