20090103, 00:24  #1 
Dec 2008
Boycotting the Soapbox
2^{4}×3^{2}×5 Posts 
Faster factoring with binary splitting?
Heuristic for the speedup:
Dividing a ndigit integer mtimes by a small integer has the arithmetic complexity of n*m. For sake of simplicity let's say m~n, so the complexity of this process is O(n^2). Using binary splitting we can reduce this to n*log(n)^2 See, e.g. http://numbers.computation.free.fr/C...splitting.html Therefore, instead of trying whether M(n) is divisible by 2kp+1 we can do the following: Compute the product P of n possible factors F until P~sqrt(M) using the following method: P(1,2)=F(1)*F(2) ... P(m1,m)=F(m1)*F(m) P(1,2,3,4)=P(1,2)*P(3,4) ... P(1,...,m) Using FFTs for the multiplications, complexity is O(n*log(n)^2). Then reverse the process to compute the remainders (using FFTs this is also O(n*log(n)^2) ): R(1,...,m)=M mod P(1,...,m) ... R(1)=R(1,2) mod F(1) R(2)=R(1,2) mod F(2) If any R(m)=0 then F(m) is a factor. Ballpark analysis for M(3.3E8): 3.3E8/2/64bit factors ~ 250000 3.3E8/64*log2(3.3E8/64)^2 ~ 2E9 which is in the order of seconds on a 3Ghz processor. Now, 250000 trial division/second seems pretty good to me. :) Last fiddled with by __HRB__ on 20090103 at 01:09 Reason: changed 'factors' to 'possible factors' 
20090103, 05:29  #2 
Dec 2008
Boycotting the Soapbox
2^{4}·3^{2}·5 Posts 
Maybe I should point out the following:
The file "factor64.asm" in the source code is absolutely littered with divassembly instructions. The div instruction with a dividend >64bit has a latency of 78 clocks on Athlon64s (116 clocks on Core 2!), and to make matters worse, it isn't pipelined. Therefore, any trial factoring doing x^2 (mod trial_factor) with this instruction will be sloooooooow. A multiprecision division with Newton's method on the other hand only costs as much as 2 multiprecision multiplies, so the above scheme might not be as silly as it appears at first. Maybe going all the way to sqrt(M) is too much of a good thing, but I think we could do a 1024bit bignum (Karatsuba) multiply in the time it takes to complete a division with a divisor>64 bit. I think trial division could easily see a speedup of 48x. Last fiddled with by __HRB__ on 20090103 at 05:43 
20090103, 15:25  #3 
Dec 2008
11 Posts 
Maybe this comment helps in the discussion:
According to http://v5www.mersenne.org/various/math.php trial dividing is much more efficient than O(n) (where n is number of digits). We have to do only O(log n) operations for small factors. So total cost of trial factoring is O(n log n) [based on your assumption: m ~ n]. 
20090103, 16:27  #4  
Dec 2008
Boycotting the Soapbox
1011010000_{2} Posts 
Quote:
Using the div instruction is O(1), whereas doing trial division for several factors at once with binary splitting is O(log(n)) per factor. If the constant implicit in the O(1) is 100, and the one implicit in O(log(n)) is 10, you break even only at n=22026. So, doing trial division by products of factors would be faster, if you use a product with less than 22026 factors. Last fiddled with by __HRB__ on 20090103 at 16:31 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Splitting Work over Dual Boot  justinstevens42  Information & Answers  3  20180207 12:51 
Splitting P1 stages  TheMawn  Information & Answers  3  20131013 00:07 
File Splitting Utility  Antonio  Software  5  20130418 14:22 
Faster Factoring Algorithm?  Citrix  Factoring  6  20071223 11:36 
faster factoring?  S80780  Lone Mersenne Hunters  10  20030408 21:51 