#12
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2²×227 Posts
I got TIFA to compile, but it's really not set up for this sort of test. I'm not sure I want to slog through trying to write a factor() routine for it when I'm dubious how competitive it will be.
I think yafu will be good to see. Probably the best competition is NT Factor, since it is a small self-contained C file with two small header files. It seems to do very well and has the advantages that (1) it supports 127-bit instead of 63-bit, or could be trimmed for 64-bit support, (2) it doesn't rely on a 128-bit compiler type, and (3) it has no huge file. Right now my code seriously needs to be turned into a more generic C library so it's easier to use standalone, but regardless it is going to be a lot more code.
#13
"Ben"
Feb 2007
110100101011₂ Posts
Quote:
Code:
real    0m4.104s
user    0m3.099s
sys     0m0.093s

Code:
time ./yafu "frange(2^63-200000,2^63-100000)" > out.txt
46.660u 2.174s 1:38.26 49.6% 0+0k 0+11208io 1pf+0w
#14
"Andrew Booker"
Mar 2013
5×17 Posts
Quote:
That said, I agree that it isn't well suited for a project with wide distribution. The file size could be reduced, since the largest part is a factorization table that could be generated at initialization time. I've put it in the file simply because that's more convenient.
#15 |
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2²×227 Posts
I figured the goal was to look at just the C function. So for Perl/ntheory I wrote a similar driver in the XS.xs (the C interface) file. But it turns out I get basically the same time just calling directly, assuming /tmp/n.txt has the list of 100k numbers produced by the LCG:
Code:
$ function mpufile { perl -Iblib/lib -Iblib/arch -MMath::Prime::Util=:all -nE "$1" $2; }
$ time mpufile 'chomp; say "$_ has ",scalar(factor_exp($_))," factors";' /tmp/n.txt | shasum
0778274b7c0b2679825496df0c9a62222dc22593  -

real    0m2.722s
user    0m2.730s
sys     0m0.027s

It looks like spfactorlist could be fairly easily modified so that instead of n = nstart+i it rolls the LCG and takes n = state>>1. I'm not sure the timing results would be much different from the linear range you ran. I would have thought Lehman would have helped, but if most of that time is in SQUFOF then that seems not dissimilar to the results from my and NT Factor's SQUFOF runs, which were 27-35 seconds on my machine. Where SQUFOF seems to excel is on the larger semiprimes that take excessively long for Pollard-Rho. That would be another good test.

Last fiddled with by danaj on 2017-10-06 at 15:26 Reason: Add mpufile alias
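[Editor's sketch] The driver change suggested above might look roughly like this. The LCG multiplier/increment here are Knuth's MMIX constants, chosen only as a placeholder -- the actual test harness may use different parameters, and `spfactorlist_lcg` is a hypothetical name:

```c
#include <stdint.h>
#include <stdio.h>

/* Instead of n = nstart + i, roll a 64-bit LCG each iteration and
 * take n = state >> 1 as the 63-bit test input. */
static uint64_t lcg_state = 1;

static uint64_t lcg_next(void) {
    /* Knuth MMIX constants -- placeholder, not necessarily the real LCG */
    lcg_state = lcg_state * 6364136223846793005ULL + 1442695040888963407ULL;
    return lcg_state >> 1;   /* always below 2^63 */
}

/* Example driver loop; a factor() call would replace the printf. */
static void spfactorlist_lcg(uint64_t count) {
    for (uint64_t i = 0; i < count; i++) {
        uint64_t n = lcg_next();
        printf("%llu\n", (unsigned long long)n);
    }
}
```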
#16
"Andrew Booker"
Mar 2013
1010101₂ Posts
Quote:
There are a lot more restrictions on the iteration count than I thought. In particular, it's implicitly assumed that the chosen iteration count actually occurs for some prime. Since that isn't true for 500000 (by which point the stopping times get very sparse), the effect of your change was to make it go through trial division by all primes when Pollard rho fails. I have added some comments to the code to make the restrictions clear. I've also repaired the earlier bug in a better way. The only numbers left standing by that point in the code are prime squares and semiprimes, so we can avoid the final primality test.
#17 |
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2²·227 Posts
I haven't seen any issues with the newest code.
I did some benchmarks of random semiprimes. There may be a more convenient way to share these, but I used what I know.

Code:
perl -Mntheory=:all -E 'csrand(0); say random_semiprime(63) for 1..100000'

Code:
bits    factor63         MPU              NTFactor
 40     1.56 /  1.45     1.84 /  1.79     2.26 /  2.15
 55    17.10 / 14.47    18.65 / 18.12    22.68 / 22.37
 60    44.68 / 34.74    46.18 / 43.76    56.92 / 55.15
 63    84.48 / 57.57    73.25 / 71.14    93.32 / 92.51
 64    ----- / -----   101.36 / 99.61   113.06 /111.19

Code:
mpu 'csrand(0); say random_semiprime(63) for 1..100000' | time ./f63 | shasum
mpu 'csrand(0); say random_semiprime(63) for 1..100000' | time ~/src/nt-factor/factor | shasum
mpu 'csrand(0); say random_semiprime(63) for 1..100000' | time perl -Iblib/lib -Iblib/arch -Mntheory=:all -nE 'chomp; say "$_: ",join(" ",factor($_));' | shasum

Last fiddled with by danaj on 2017-10-06 at 22:53
#18 |
"Andrew Booker"
Mar 2013
85₁₀ Posts
Thanks, that's a really helpful comparison. I am perplexed by the difference between user and wall time that you keep observing, though. I ran a similar test of 100k semiprimes on my system and saw almost no difference (0.6 seconds out of about 70s total). How much memory does your test system have? Maybe the OS is never loading the whole file into memory, which will surely degrade performance. Does it do better if you replace mmap by malloc+read?
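[Editor's sketch] The malloc+read alternative suggested above might look like this minimal version; the filename is a placeholder and error handling is kept deliberately simple:

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file into heap memory up front, so a cold-cache first
 * run pays the I/O cost once rather than via page faults under mmap.
 * Returns a malloc'd buffer (caller frees) or NULL on failure. */
static void *load_whole_file(const char *path, size_t *len_out) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    void *buf = malloc((size_t)len);
    if (buf && fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf && len_out) *len_out = (size_t)len;
    return buf;
}
```

Whether this helps depends on access pattern: mmap lets the OS evict cold pages of a huge table, while malloc+read forces the whole thing resident.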
#19
"Dana Jacobsen"
Feb 2011
Bangkok, TH
38C₁₆ Posts
Quote:
What I see is that if you run it again, almost all of the time difference goes away. I'm guessing the entire file stays in your cache as you run lots of experiments with it. I find that if I work on other things for a while and then run the same command again, it will again add significant extra time. Running again on the same dataset adds only 5% or so. It really will depend on the access pattern of the client.

Note for the benchmark: csrand(0) seeds the RNG so we get the same output every time. For each bit size the shasum matched for all three programs.

Last fiddled with by danaj on 2017-10-06 at 23:45
#20 |
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2×41×71 Posts
It may be worth considering the qs implementation that is part of GGNFS. It at least had a very good reputation for factoring small semiprimes quickly. It would be nice to know if we now have better code that should take its place in ggnfs.
Also interesting would be a plot of the average time for each algorithm/combination of algorithms/implementation for n bits similar to how ggnfs and cado were compared. |
#21 |
"Andrew Booker"
Mar 2013
5·17 Posts
OK, at least it's working. Do you see a significant difference when run for the first time with a long test (an hour, say)? I agree that for any problem whose running time is measured in minutes or seconds, this isn't the best approach. But that case isn't worth optimizing much anyway.
#22
"Dana Jacobsen"
Feb 2011
Bangkok, TH
1110001100₂ Posts
Quote:
I'm surprised about QS being used for 64-bit numbers. That seems really small -- most QS implementations won't work on such small numbers. yafu's smallmpqs.c does.

A comment on Pollard-Rho vs. SQUFOF. Pollard-Rho's inner loop is just two mulmods and two addmods (with a deferred gcd). Fast 64-bit mulmods are critical -- with an asm mulmod it's a lot faster, and with a proper Montgomery mulmod (either asm or using the compiler's 128-bit type) it's faster yet. Both factor63 and MPU use a fast Montgomery mulmod, though MPU has fallbacks to other methods on other platforms. SQUFOF's inner loop is more or less 4 multiplies, 2 divides, and a perfect square check, all standard integer operations, so it's far less platform dependent. x86_64 is rather ubiquitous these days, so that probably matters less than it did 10 years ago.

I ran the random-semiprime test from before, calling yafu -silent with a line "smallmpqs(...)" or similar for each number. It still does verbose logging to factor.log, and the generic yafu front end isn't optimized for such tiny inputs (where, for instance, GNU NT Factor or BSD factor are extremely low overhead). Different front-end design goals.

Code:
40-bit:
  75.90 real / 62.94 user   n
  58.90 real / 54.18 user   rho(n)
  40.19 real / 36.21 user   smallmpqs(n)
  37.02 real / 27.16 user   squfof(n)

bits    factor63         MPU              yafu QS
 40     1.56 /  1.45     1.84 /  1.79    40.19 / 36.21
 55    17.10 / 14.47    18.65 / 18.12    70.93 / 65.09
 60    44.68 / 34.74    46.18 / 43.76   105.81 / 96.92
 63    84.48 / 57.57    73.25 / 71.14   105.09 / 99.66
 64    ----- / -----   101.36 / 99.61   107.12 /102.08

Code:
bits    MPU pbrent       MPU SQUFOF       yafu SQUFOF
 40     1.69 /  1.67     2.43 /  2.40    37.02 / 27.16
 55    19.43 / 18.34    31.20 / 30.44    62.04 / 50.49
 60    45.62 / 44.11    86.31 / 84.92   103.87 / 91.31
 63    71.60 / 70.51   240.61 /236.82       -- / --
 64    85.67 / 84.43   326.82 /321.82       -- / --

This seems to be turning into an update of my old 2009 presentation, albeit limited to just 64-bit.

Last fiddled with by danaj on 2017-10-07 at 21:49
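[Editor's sketch] The deferred-gcd structure described above can be illustrated as follows. This uses plain Floyd cycle-finding and the compiler's 128-bit type for mulmod, not the Montgomery arithmetic or the Brent variant that factor63 and MPU actually use (Brent's variant is what achieves the two-mulmods-per-iteration count mentioned):

```c
#include <stdint.h>

/* mulmod via the 128-bit compiler type (GCC/Clang); the thread notes
 * Montgomery multiplication is faster still. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t n) {
    return (uint64_t)((unsigned __int128)a * b % n);
}

static uint64_t gcd64(uint64_t a, uint64_t b) {
    while (b) { uint64_t t = a % b; a = b; b = t; }
    return a;
}

/* Floyd-style Pollard rho with x -> x^2 + c, accumulating |x - y|
 * products so the gcd is taken only every 128 iterations (the
 * "deferred gcd").  Returns a factor of n, or n on failure. */
static uint64_t pollard_rho(uint64_t n, uint64_t c, uint64_t maxits) {
    uint64_t x = 2, y = 2, g = 1, prod = 1;
    for (uint64_t i = 0; i < maxits && g == 1; i++) {
        x = mulmod(x, x, n) + c; if (x >= n) x -= n;
        y = mulmod(y, y, n) + c; if (y >= n) y -= n;  /* hare moves */
        y = mulmod(y, y, n) + c; if (y >= n) y -= n;  /* twice      */
        prod = mulmod(prod, x > y ? x - y : y - x, n);
        if ((i & 127) == 127) g = gcd64(prod, n);
    }
    if (g == 1) g = gcd64(prod, n);
    return (g > 1 && g < n) ? g : n;
}
```

The batching is the usual trade-off: fewer gcds per iteration, at the risk of both prime factors colliding inside one batch, in which case the round fails and a different c must be tried.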