mersenneforum.org Some code for factoring numbers up to 2^63
 Register FAQ Search Today's Posts Mark Forums Read

 2017-10-11, 00:48 #34 jasonp Tribal Bullet     Oct 2004 2·3·19·31 Posts The MPQS code in GGNFS was incredibly fast back when I tested it; the ECM code in CADO-NFS is also blazing fast for ranges of inputs that are above the sizes discussed here (64-128 bits). If you want another (not well optimized) SIQS, I wrote this in 2005 and modernized it a short while ago.
2017-10-11, 04:00   #35
danaj

"Dana Jacobsen"
Feb 2011
Bangkok, TH

22×227 Posts

Quote:
 Originally Posted by jasonp The MPQS code in GGNFS was incredibly fast back when I tested it; the ECM code in CADO-NFS is also blazing fast for ranges of inputs that are above the sizes discussed here (64-128 bits). If you want another (not well optimized) SIQS, I wrote this in 2005 and modernized it a short while ago.
I think 64-128 bit is a different set of codes to examine. Looking at code that doesn't require GMP is also interesting for some uses (albeit not big factoring projects).

The cof-siqs run on the semiprime set does quite poorly at small bit sizes. It misses factoring a few at each size so would need a backup. It looks like it outperforms smallmpqs on the data sets I have (which stop at 64-bit). At 63-bit it's close but Ben's new Pollard-Rho beats it as does mine when I use his new mulredc. At 64-bit it is ~30% faster than my native Pollard-Rho, about 10% faster than Ben's 64-bit version, and factor63 doesn't handle that size.

2017-10-11, 13:36   #37
bsquared

"Ben"
Feb 2007

1101001010112 Posts

Quote:
1) I like it, but even better how about creating a second program that generates the table completely? I would much rather download a 1kB file, compile and let run for a while, and then start using factor63, than download 4 or 6 GB. But I am ignorant - does full table building take much much longer than a typical download (i.e. hours)?
2) A small up front cost is fine for most of the projects that I imagine would use factor63 since it would be greatly amortized. That said, the systems I've been running it on seem to cache it just fine (real ~= user)
3) I'm leaning toward no, mostly because of the additional file size and because my pbrent routine is now within about 1% or even slightly faster than factor63, using no table. But admittedly that is for the specialized semiprime problem, not for general inputs. And also the speed of my pbrent routine I think mostly comes from an assembler implementation of mulredc/addmod, which you are welcome to use (P.M me if interested), and that would likely put factor63 solidly in the lead again.

 2017-10-11, 14:00 #38 bsquared     "Ben" Feb 2007 3,371 Posts Here are some quick comparisons. "rand" is the first test we tried with inputs from an LCG. I haven't tried "rand" with the new pbrent yet - it needs a recipe for checking isprime, finding small factors, etc., before trying that test. Code: bits pbrent f63 f63-asm 48 3.6 4.2 3.9 55 11.3 13.3 12.2 60 27.4 26.5 24.2 rand -- 1.62 1.47
2017-10-11, 14:53   #39
arbooker

"Andrew Booker"
Mar 2013

5×17 Posts

Quote:
 Originally Posted by bsquared 1) I like it, but even better how about creating a second program that generates the table completely? I would much rather download a 1kB file, compile and let run for a while, and then start using factor63, than download 4 or 6 GB. But I am ignorant - does full table building take much much longer than a typical download (i.e. hours)?
There are actually three tables:
• Factorization table for small numbers. This is the largest (about 3GB in factor63) and easiest to generate--it takes just a few minutes.
• Table of Pollard rho collisions, about 560MB. This could in principle be done by the user, but we're talking at least days to compute the full table. My memory is hazy, but it was some number of hours on 64 cores when I computed it.
• Table of strong pseudoprimes, about 180 MB. This is the result of a lot of computing effort by Feitsma et al and could not be easily reproduced. (In fact to my knowledge no one has independently verified the computation, which should be done at some point.) An alternative is to add a second base or Lucas test (a la Baillie-PSW) so that a much smaller table (or even no table) suffices, but that would make isprime() take about twice as long when the input is prime.
Quote:
 2) A small up front cost is fine for most of the projects that I imagine would use factor63 since it would be greatly amortized. That said, the systems I've been running it on seem to cache it just fine (real ~= user)
I could add it as an option, but I won't bother if there's no real need.
Quote:
 3) I'm leaning toward no, mostly because of the additional file size and because my pbrent routine is now within about 1% or even slightly faster than factor63, using no table. But admittedly that is for the specialized semiprime problem, not for general inputs.
This is understandable. The collision table handles some rare failure cases for Pollard rho, but those could be dealt with by a backup algorithm without hurting the average performance. (This could even be just a second P-R with a different polynomial, which would shrink the table a lot.) Where I think the tables are most likely to help is in factoring small numbers. In factor63 that's done entirely by table lookup for numbers of square root size (at most 2^31.5). That even helps when doing Pollard rho, since smooth composite factors are pretty common. That speedup won't be apparent from a semiprime test, so it would be worth comparing performance on random numbers also.
Quote:
 And also the speed of my pbrent routine I think mostly comes from an assembler implementation of mulredc/addmod, which you are welcome to use
Thanks. That's definitely worth a try, though I'm skeptical of making a huge improvement for 63-bit mulredc. Pre-pentium I understood what the chip was doing well enough to write better assembly language than my compiler. But that's no longer the case, and I usually find that it's best to coax the C compiler into emitting something close to what I want. That's not too hard for 63-bit mulredc, but for 64-bit you need access to the carry bit, so assembly language is the only reasonable route. [EDIT: I just saw your stats on this. Nice! How about making a pull request on github with your asm mulredc?]
Quote:
 that would likely put factor63 solidly in the lead again.
We shouldn't view this as a competition, and I don't mind if anyone wants to extract just some bits of the factor63 code and leave the rest.

2017-10-11, 16:34   #40
danaj

"Dana Jacobsen"
Feb 2011
Bangkok, TH

22·227 Posts

Quote:
 Originally Posted by arbooker Table of strong pseudoprimes, about 180 MB. This is the result of a lot of computing effort by Feitsma et al and could not be easily reproduced. (In fact to my knowledge no one has independently verified the computation, which should be done at some point.) An alternative is to add a second base or Lucas test (a la Baillie-PSW) so that a much smaller table (or even no table) suffices, but that would make isprime() take about twice as long when the input is prime.
You could use an Euler-Plumb test which is only a couple extra lines, typically faster, and generates about 40% the number of pseudoprimes. Or a strong pseudoprime test which isn't much different in cost than a Fermat test and generates about 30% the number. Both are subsets of psp-2 so you can easily create the complete list. Or as you say add a Lucas test though that makes it take longer.

Quote:
 [...] In factor63 that's done entirely by table lookup for numbers of square root size (at most 2^31.5). That even helps when doing Pollard rho, since smooth composite factors are pretty common. That speedup won't be apparent from a semiprime test, so it would be worth comparing performance on random numbers also.
Both types of tests are important. Ben's Lehman implementation helps immensely for small sizes. That's going to be an interesting thing to see. So far there's been some good progress in 32- to 64-bit single-algorithm semiprime factoring.

Quote:
 Nice! How about making a pull request on github with your asm mulredc?
No pull request, but you can see Ben's 63-bit and 64-bit asm here

Quote:
 We shouldn't view this as a competition, and I don't mind if anyone wants to extract just some bits of the factor63 code and leave the rest.
I think this has been great, since this is something I care about. Everyone has something to offer, and together we can make faster solutions. There are also a variety of use cases, making some solutions better for some purposes. There's plenty of room. Hopefully this will pull out a few more implementations as well.

2017-10-11, 16:55   #41
bsquared

"Ben"
Feb 2007

3,371 Posts

Quote:
 Originally Posted by danaj Both types of tests are important. Ben's Lehman implementation helps immensely for small sizes. That's going to be an interesting thing to see. So far there's been some good progress in 32- to 64-bit single-algorithm semiprime factoring.
Proper attribution: Warren D. Smith kindly provided that Lehman implementation to yafu some time ago. It is very well tuned.

I don't really view this as a competition, but a competitive angle does sometimes provide motivation :)

 2017-10-11, 22:19 #42 danaj   "Dana Jacobsen" Feb 2011 Bangkok, TH 16148 Posts Nvm on the test update. I got confused by the terms "psp" and "Fermat test" in the code. You're doing a Miller-Rabin aka strong pseudoprime test already. For 32-bit inputs you could use a small hash table of bases so only one test is necessary for correct answers, though you're already using your big factor table for most of that range, and it doesn't materially change your table size. After that I guess your choice of big table (memory cost), two tests and a ~1M entry table (+1x CPU, still some memory cost), or a Lucas test and no table (+1.5x CPU, no memory). There are hashed 2-M-R test solutions but they only go to ~49 bits which doesn't seem to really help you. Ben is right that Warren D. Smith wrote the Lehman implementation which has had minor changes since. My HOLF implementation seems to do well at small sizes -- more testing would be useful here for those few people that care about microsecond differences in factoring 17- to 32-bit composites. 32-bit to 42-bit where Pollard-Rho crosses HOLF/Lehman is another interesting range. From 42- to 63-bit so far it seems the optimized Pollard-Rho implementations are winning.
2017-10-12, 21:17   #43
arbooker

"Andrew Booker"
Mar 2013

5×17 Posts

Quote:
 Originally Posted by danaj After that I guess your choice of big table (memory cost), two tests and a ~1M entry table (+1x CPU, still some memory cost), or a Lucas test and no table (+1.5x CPU, no memory).
Agreed, though I strongly favor the first solution. For PCs (or even smartphones), 180MB isn't too much memory to sacrifice these days, and despite the poor cache locality, it's still quite a bit quicker than running a second test.

I'm experimenting with a 64-bit version using Ben's asm mulredc. One thing that surprised me is how much time is saved in factor63 by using the Sylvester sequence, which allowed me to replace one addmod by a simple addition (the line y = mulredc(y,y+one)). For 64 bits we need to use addmod (or submod, which is slightly quicker), and that single change adds more time than the savings from using asm mulredc. (On the bright side, this shows how close to the bone we're cutting in the code.) I noticed that Ben's routine handles 63-bit and 64-bit n separately, and I might just move the if check farther out so the penalty is only incurred for 64-bit numbers.

2017-10-12, 22:10   #44
bsquared

"Ben"
Feb 2007

64538 Posts

Quote:
 Originally Posted by arbooker One thing that surprised me is how much time is saved in factor63 by using the Sylvester sequence, which allowed me to replace one addmod by a simple addition (the line y = mulredc(y,y+one)).
This is a new one for me... do you have a reference you could point me to? I was wondering was kind of iteration was going on in your code. I had assumed there was an addmod buried in there somewhere that I wasn't smart enough to find.

 Similar Threads Thread Thread Starter Forum Replies Last Post paulunderwood Miscellaneous Math 18 2017-08-27 14:56 ShridharRasal Factoring 10 2008-03-20 17:17 Peter Nelson Software 9 2005-03-30 18:28 Axel Fox Software 14 2003-07-04 18:57 Joe O Software 2 2002-09-13 23:39

All times are UTC. The time now is 12:22.

Sun Mar 7 12:22:47 UTC 2021 up 94 days, 8:34, 0 users, load averages: 1.09, 1.32, 1.36