20090525, 18:31  #1 
May 2009
2^{2} Posts 
Interested in C/C++ libraries for factoring
Hi everyone,
I'm researching libraries for fast integer factorization and have been advised that here might be a good place to ask any questions I had. On the assumption that this is the appropriate place to ask, I'll continue! ;) I will be factoring general numbers without any special properties to them of size up to 10^{12}. I have quite a lot of them and the efficiency of the their factorization is critical. I have worked out that for my program to be feasible, I would need to be able to factor 12digit integers in a millionth of a second (or close) on average. With this in mind, does that sound possible? If it is, I am interested in C/C++ factorization libraries and which one would be ideal. The integers will be stored as long integers in either C or C++. I can afford around 512MiB of memory to be used during factorization (possibly more). Can anyone recommend a library suitable? Thank you for your time, PythonPower PS: My original question that led me here can be found here. 
20090525, 20:14  #2 
Aug 2004
New Zealand
2·3·37 Posts 
I don't think you need to use a library. Trial division by all primes up to 10^6 will be sufficient to factor your numbers. Precompute the primes and store them (e.g. using the sieve of Eratosthenes), they will easily fit in you 512MB constraint. Whether it will be fast enough probably depends on your CPU and whether or not you can do multiple numbers in parallel.

20090525, 20:22  #3 
Feb 2004
2×3×43 Posts 
A millionth of a second sounds like a rather short time interval. You'd only have enough time to use a thousand or so instructions. I'm not sure, but I think you may have trouble completing your 12digit factorisations that fast. Hopefully I'm wrong.
I don't know what the best method might be, but you could always try the gmpecm lib and see what kind of timings you get with ecm. You'll probably need to know a little about both the lib and the method to use it. Maybe the msieve lib is interesting? Someone else can no doubt give you a better answer. 
20090525, 20:40  #4 
"Nancy"
Aug 2002
Alexandria
100110100011_{2} Posts 
GMPECM wouldn't meet the time criterion. The only thing I can think of that would satisfy your speed demand is a sieving technique, if the input numbers follow some pattern that allows sieving. Even an implementation of ECM optimized for small numbers would be about an order of magnitude slower than you want. GMPECM will be at least another order of magnitude slower.
Alex 
20090525, 20:58  #5 
"Robert Gerbicz"
Oct 2005
Hungary
2·7·103 Posts 
My not public code can factor 12000 (40 bits) numbers in 1 second using Bernstein's method on one core of amd triple core phenom 2.4 GHz , see: http://cr.yp.to/factorization/smoothparts20040510.pdf. This cost in my code is independent from the numbers, by trial division and a prime test it would be somewhat faster.
Last fiddled with by R. Gerbicz on 20090525 at 21:00 
20090525, 21:46  #6 
Sep 2005
Berlin
2·3·11 Posts 
Pollard Rho could be a possibility here. To find a factor around 10^6 it will need ~ 10^3 iterations. So you could perform a tiny trial division to remove small factors and then run Pollard Rho.
10^12 fits into a 64bit integer, so you do not need any library. 
20090525, 23:06  #7 
"Robert Gerbicz"
Oct 2005
Hungary
1442_{10} Posts 
Yes, in fact Bernstein and Pollard Rho (by Brent's modification)'s speed is about the same for 40 bits, but for example for 50 bits as I remember Bernstein is faster. It isn't very surprising, because Bernstein's method is polynomial (if we have got about O(sqrt(N)/log(N)^O(1)) numbers). And note that for 50 bits in the root of the remainder tree the number has got about 50 million bits and currently gmp's division is not optimal, it can be about 4 times faster in this range.
Last fiddled with by R. Gerbicz on 20090525 at 23:10 
20090526, 01:14  #8  
Aug 2006
5972_{10} Posts 
Quote:
The only primality tests viable at that speed would be trial division or a pseudoprime test and search of small pseudoprimes. There are 101,629 2pseudoprimes up to 10^12, so that's 17 steps for a binary search. The list is only 800 kB, so it could fit in the cache. Trial division up to B, a single quick Pollard rho, a pseudoprime test, binary search if needed, another few rho tests, then trial division up to sqrt(remaining cofactor)? That's a lot to do in very little time. Of course the trial division can be optimized by combining primes into either 32 or 64bit chunks. (32bit chunks would allow all but the first division to be 32 bits rather than 64 bits.) PythonPower: What, if anything, is known about the numbers to be factored? Are they chosen uniformly at random (so lots of small divisors like 2, 3, 5)? Are they all 'hard' composites (semiprimes with roughly equal factors), or all composites? Knowing things like this might help us determine if the task can be done. 

20090526, 02:36  #9 
Jul 2003
So Cal
2^{2}·11·47 Posts 
How about using a video card? In CUDA, you are hampered by the facts that 64bit integer math is emulated using 32bit integers and that both mod and division are quite slow, but on a newer card you have 240 "cores" to work with. You can expect > 2GB/s bandwidth to and from the card and 70GB/s on the card, so that shouldn't be a bottleneck at a million factors/s.

20090526, 03:07  #10  
Aug 2006
2^{2}×1,493 Posts 
Quote:
The 32bit math wouldn't be too much of a limitation since most operations could be reduced fairly quickly to 32 bits. The slowness of the division would hurt, though. 

20090526, 07:36  #11 
"Robert Gerbicz"
Oct 2005
Hungary
2642_{8} Posts 
If you've got lots of RAM, then it is possible to precompute all factorizations by sieving up to 10^12. If you store only those where gcd(n,210)=1 and the index of the smallest prime divisor (or 0 if n is prime), then you need: eulerphi(210)/210*10^12*17./2^33=453 GB RAM. (there are 78498<2^17 prime up to 10^6, so need 17 bits for each index.)

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
suggestion: separation of libraries  ixfd64  Software  10  20110120 23:02 
interested in theory+implementation: FFT multipl  antlu  Programming  17  20080820 18:17 
Large Integer Libraries  andi314  Programming  18  20040621 22:37 
Is anybody interested in this?  ET_  ElevenSmooth  2  20040326 16:29 