#1
Nov 2007
Halifax, Nova Scotia
2^3·7 Posts
Hello everyone,
I am wondering how the runtime of a single curve in ECM scales in practice with N, B1, and B2. I know from Silverman and Wagstaff's 1993 paper that, for a constant factor size, the total runtime scales with the time for a single multiplication modulo N. For large N using FFT-based multiplication, I suppose this means the runtime of a single curve should grow roughly with log_2(N) for fixed B1 and B2. For fixed N, the runtime of a single curve seems to scale linearly with B1, which seems reasonable from my understanding of the algorithm. Also, as long as a reasonable value is taken for B2, the runtime of one curve seems to be roughly independent of B2, with some optimal value of B2 as given in Zimmermann's table. Have I got these three scalings roughly correct?

Finally, one more question: what are some runtimes for single curves on the Fermat numbers? In particular, I am wondering how long curves with B1=10^6 and B1=3*10^6 take on F20 (and what I should expect for B1=11*10^6).

Answers to any of these questions would be greatly appreciated.

Thanks,
Doug
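For what it's worth, the scalings I'm describing can be written down as a toy cost model. This is purely illustrative, not a measurement: the modular-multiplication cost function and the constant c are assumptions I'm making for the sketch.

```python
# Toy model of single-curve stage-1 ECM cost under the assumed scalings:
# linear in B1, times the cost of one modular multiplication on an
# N of the given bit size. The modmul cost model and constant c are
# illustrative assumptions, not benchmarks.
import math

def modmul_cost(bits):
    """Assumed FFT-multiplication cost per modmul, ~ b * log b * log log b."""
    return bits * math.log2(bits) * math.log2(math.log2(bits))

def stage1_time(bits, b1, c=1.0):
    """Stage 1 performs on the order of c*B1 modular multiplications."""
    return c * b1 * modmul_cost(bits)

# Sanity check: doubling B1 at fixed N should double stage-1 time.
t1 = stage1_time(1 << 20, 1_000_000)
t2 = stage1_time(1 << 20, 2_000_000)
print(round(t2 / t1, 3))  # 2.0
```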
#2
Nov 2007
Halifax, Nova Scotia
56_10 Posts
I should have mentioned that the "Factoring F14" thread provides a partial answer to my question about performance on Fermat numbers:
Quote:
Has anyone in particular run curves on F20?
#3
"Phil"
Sep 2002
Tracktown, U.S.A.
2^5·5·7 Posts
[Fri Sep 29 10:45:27 2006]
2^1048576+1 completed 5 ECM curves, B1=1000000, B2=100000000 These 5 curves took about 10 days on an Athlon system, 1400 MHz. My guess is that they would have taken about 1/3 of the time on a 3000 MHz Pentium processor. Your scaling assumption sounds roughly correct, but from F14 to around F17, running time seems to more than double, sometimes tripling even, as you double the exponent. I don't know why, but I'm guessing that it may have something to do with cache utilization. Does anyone else have an opinion on this, or some hard data? Once we get to large numbers, the doubling approximation seems to hold, at least for stage 1, although stage 2 sometimes takes a hit because of main memory limitations. |
#4
Einyen
Dec 2003
Denmark
D6E_16 Posts
I did a small speedtest 2 years ago: http://www.mersenneforum.org/showthread.php?t=4223
Original data: ECM.txt
Data in Excel where I did regression analysis: ECM.xls
Regression formulas: ECMregression.txt

Step 1:
When B1 is constant: time = constant*digits^1.7
When n is constant: time = constant*B1
Step 1 conclusion: time = constant*B1*digits^1.7

Step 2:
When B2 is constant: time = constant*digits^1.2
When n is constant: time = constant*B2^0.6
Step 2 conclusion: time = constant*B2^0.6*digits^1.2

Of course this will hold less and less when you get to very high B1/B2 or n, when memory becomes a problem.
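Those empirical fits are easy to code up for extrapolation. Note the constants below are placeholders: to use this you'd calibrate each constant from one timing on your own machine; nothing here is a measurement from the thread.

```python
# Empirical ECM timing fits (step 1: time = c*B1*digits^1.7,
# step 2: time = c*B2^0.6*digits^1.2). The constants c are
# placeholders to be calibrated from a real timing.

def step1_time(b1, digits, c=1.0):
    return c * b1 * digits ** 1.7

def step2_time(b2, digits, c=1.0):
    return c * b2 ** 0.6 * digits ** 1.2

# Sanity check of the fitted scaling: 10x the digits at fixed B1
# should cost about 10^1.7 ~ 50x in step 1.
ratio = step1_time(1e6, 1000) / step1_time(1e6, 100)
print(round(ratio, 1))  # ~50.1
```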
#5
Oct 2004
Austria
2×17×73 Posts
Quote:
Last fiddled with by Andi47 on 2007-11-28 at 14:09
#6
"Bob Silverman"
Nov 2003
North of Boston
1110101010100_2 Posts
#7
Nov 2007
Halifax, Nova Scotia
2^3×7 Posts
Thanks everyone for your replies and comments. This gives me a good feeling for the subject. I will run some of my own calculations on F20, as well as (properly) (re)read Silverman and Wagstaff's paper, and share any insight I garner, if it is anything beyond what you all have said here.
#8
Einyen
Dec 2003
Denmark
6556_8 Posts
I did some new tests of ECM, P-1, and P+1 with GMP-ECM 6.1.3 on GMP 4.2.2 on 3 different computers: P4 Northwood 2.4 GHz 1 GB RAM, P4 Prescott 3.4 GHz 2 GB RAM, Athlon XP 2200+ 1.80 GHz 1 GB RAM.
ECM: athlonECM.xls athlonECMregression.txt p4nECM.xls p4nECMregression.txt p4pECM.xls p4pECMregression.txt
P-1: athlonP-1.xls athlonP-1regression.txt p4nP-1.xls p4nP-1regression.txt p4pP-1.xls p4pP-1regression.txt
P+1: athlonP+1.xls athlonP+1regression.txt p4nP+1.xls p4nP+1regression.txt p4pP+1.xls p4pP+1regression.txt

ECM:
Ath: step1: time = constant*B1*digits^1.63    step2: time = constant*B2^0.63*digits^1.3
P4N: step1: time = constant*B1*digits^1.75    step2: time = constant*B2^0.63*digits^1.3
P4P: step1: time = constant*B1*digits^1.71    step2: time = constant*B2^0.63*digits^1.4

P-1:
Ath: step1: time = constant*B1*digits^1.6     step2: time = constant*B2^0.6*digits^1.3
P4N: step1: time = constant*B1*digits^1.7     step2: time = constant*B2^0.6*digits^1.4
P4P: step1: time = constant*B1*digits^1.7     step2: time = constant*B2^0.62*digits^1.35

There is a lot bigger variation in the exponent on the digits for P-1 than for ECM, ranging from 1.54 to 1.68 for Athlon step 1 and 1.29 to 1.45 for Athlon step 2, with the same kind of variation on the 2 other computers. Probably some of it is caused by P-1 being so much faster than ECM that many of the timings are so small that random fluctuation plays a bigger role.

P+1:
Ath: step1: time = constant*B1*digits^1.59    step2: time = constant*B2^0.64*digits^1.3
P4N: step1: time = constant*B1*digits^1.70    step2: time = constant*B2^0.64*digits^1.3
P4P: step1: time = constant*B1*digits^1.67    step2: time = constant*B2^0.64*digits^1.35

Almost as low variation in the exponents for P+1 as for ECM in step 1, but higher variation, as in P-1, in step 2.

Looking at the timings, ECM is about 9-10 times slower than P-1 at the same B1/B2, and 5-6 times slower than P+1, which fits with the advice in the GMP-ECM readme to run P-1 with 10*B1 and P+1 with 5*B1, where B1 is the B1 value you use for ECM at each factor level.

Last fiddled with by ATH on 2007-12-02 at 03:42
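The bound advice from the GMP-ECM readme mentioned above (P-1 at 10*B1, P+1 at 5*B1, relative to the ECM B1 at each factor level) is simple enough to capture as a helper; the function name is my own, not anything from GMP-ECM itself.

```python
# Suggested B1 bounds per the GMP-ECM readme advice quoted above:
# for a given ECM B1 at some factor level, run P-1 with 10*B1 and
# P+1 with 5*B1. Helper name is illustrative only.
def suggested_bounds(ecm_b1):
    return {"ecm": ecm_b1, "p-1": 10 * ecm_b1, "p+1": 5 * ecm_b1}

print(suggested_bounds(1_000_000))
# {'ecm': 1000000, 'p-1': 10000000, 'p+1': 5000000}
```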
#9
"Jason Goatcher"
Mar 2005
3×7×167 Posts
It might be a good idea to take a bit of post 8 and give it its own sticky thread in the GMP-ECM forum.
Just my opinion. :)
#10
Nov 2007
Halifax, Nova Scotia
2^3·7 Posts
Quote:
Code:
echo "2^(2^20)+1" | ./ecm -power 1 -maxmem 2048 -v 1000000
Code:
GMP-ECM 6.1.3 [powered by GMP 4.2.2] [ECM]
Input number is 2^(2^20)+1 (315653 digits)
Using B1=1000000, B2=904700226, polynomial x^1, sigma=2813888680
dF=128, k=5933, d=1020, d2=7, i0=974
Expected number of curves to find a factor of n digits:
20      25      30      35      40      45      50      55      60      65
5       21      130     1014    9706    110526  1461187 2.2e+07 3.7e+08 7.2e+09
Step 1 took 273197586ms
Estimated memory usage: 720M
... Lots of timing data removed here ...
Step 2 took 432207265ms

So either (a) the verbose option is causing GMP-ECM to spend a lot of time doing timings etc., (b) GMP-ECM is not nearly as fast as mprime for numbers of this size, or (c) I have somehow made a serious mistake with the software. Does anyone have an explanation for what I have found?
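For scale, converting those millisecond timings into days (GMP-ECM reports step times in ms; the helper below is just arithmetic, not part of GMP-ECM):

```python
# Convert the GMP-ECM step timings reported above from milliseconds to days.
def ms_to_days(ms):
    return ms / 1000 / 86400

step1_days = ms_to_days(273197586)  # Step 1 took 273197586ms
step2_days = ms_to_days(432207265)  # Step 2 took 432207265ms
print(round(step1_days, 2), round(step2_days, 2))  # 3.16 5.0
```

So roughly 3.2 days for stage 1 and 5 days for stage 2 on that run, about 8 days per curve in total.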
#11
"Bob Silverman"
Nov 2003
North of Boston
2^2×1,877 Posts
Quote:
P95 will be a lot faster in stage 1 than GMP-ECM. GMP-ECM will be faster in stage 2.