2009-04-27, 15:40  #1
Jun 2003
Ottawa, Canada
10010010001_{2} Posts 
gmp-ecm 6.2.3 discussion & benchmarks
Instead of putting everything into the binaries thread, why don't we start a new thread for 6.2.3: problems, benchmarks, and other details?
The configure system has changed since 6.2.2. It used to detect my core2/penryn system as pentium3-pc-cygwin by default; it now detects it as i686-pc-cygwin. If I manually pass "--build=pentium4" then it shows up as i786-pc-cygwin. I think Alex was saying he is still working on SSE2 detection support, but it would be good if it could at least detect the system as a pentium4 or better. As you will see below, that code path is faster.
GMP 4.3.0 detects the system as: core2-pc-cygwin
MPIR 1.1.1 detects the system as: penryn-pc-cygwin
Jeff.
Last fiddled with by Jeff Gilchrist on 2009-04-27 at 15:40
2009-04-27, 15:52  #2
Jun 2003
Ottawa, Canada
491_{16} Posts 
GMP-ECM Win32 Benchmarks
On a Core2 Q9550 system at 3.4GHz using the following:
./configure --with-gmp=/home/Jeff/gmp-4.3.0/ --enable-asm-redc --enable-sse2
and
./configure --with-gmp=/home/Jeff/gmp-4.3.0/ --enable-asm-redc --enable-sse2 --build=pentium4
Here are the results using cygwin to create Win32 binaries.
Code:
MSVC 32-bit:
GMP-ECM 6.2.3 [powered by GMP 4.2.1_MPIR_1.1.1] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=20000000, B2=2158570060, polynomial Dickson(6), sigma=980060817
Step 1 took 84911ms
Step 2 took 4477ms

cygwin 32-bit (pentium4) with MPIR 1.1.1:
GMP-ECM 6.2.3 [powered by GMP 4.2.1] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=20000000, B2=2158570060, polynomial Dickson(6), sigma=980060817
Step 1 took 79015ms
Step 2 took 2745ms

cygwin 32-bit (pentium4) with GMP 4.3.0:
GMP-ECM 6.2.3 [powered by GMP 4.3.0] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=20000000, B2=2158570060, polynomial Dickson(6), sigma=980060817
Step 1 took 81245ms
Step 2 took 2792ms

cygwin 32-bit (default config i686) with MPIR 1.1.1:
GMP-ECM 6.2.3 [powered by GMP 4.2.1] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=20000000, B2=2158570060, polynomial Dickson(6), sigma=980060817
Step 1 took 94038ms
Step 2 took 2839ms

cygwin 32-bit (default config i686) with GMP 4.3.0:
GMP-ECM 6.2.3 [powered by GMP 4.3.0] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=20000000, B2=2158570060, polynomial Dickson(6), sigma=980060817
Step 1 took 94506ms
Step 2 took 2667ms
Last fiddled with by Jeff Gilchrist on 2009-04-27 at 15:55
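To put the step 1 numbers above side by side, here is a small Python sketch (not part of GMP-ECM; the dictionary labels are just shorthand for the five builds) that expresses each build's step 1 time relative to the fastest one.

```python
# Step 1 times in ms, copied from the Win32 runs above.
step1_ms = {
    "MSVC 32-bit, MPIR 1.1.1": 84911,
    "cygwin pentium4, MPIR 1.1.1": 79015,
    "cygwin pentium4, GMP 4.3.0": 81245,
    "cygwin i686, MPIR 1.1.1": 94038,
    "cygwin i686, GMP 4.3.0": 94506,
}


def relative_to_fastest(times):
    """Return each time divided by the smallest time in the dict."""
    fastest = min(times.values())
    return {name: ms / fastest for name, ms in times.items()}


ratios = relative_to_fastest(step1_ms)
# Print fastest first; a ratio of 1.000 marks the reference build.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name:30s} {ratio:.3f}x")
```

On these figures the default i686 build needs roughly 19% more time in step 1 than the pentium4 build linked against the same MPIR.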
2009-04-27, 15:59  #3
Jun 2003
Ottawa, Canada
7×167 Posts 
GMP-ECM Win64 Benchmarks
Looks like that assembler padding/realignment change you made did make a difference with the core2 speed.
These are both MSVC 64-bit compiled binaries.
Code:
GMP-ECM 6.2.2 [powered by GMP 4.2.1_MPIR_1.0.0] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=20000000, B2=2158570060, polynomial Dickson(6), sigma=980060817
Step 1 took 42479ms
Step 2 took 3026ms
real 0m46.153s

GMP-ECM 6.2.3 [powered by GMP 4.2.1_MPIR_1.1.1] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=20000000, B2=2158570060, polynomial Dickson(6), sigma=980060817
Step 1 took 41730ms
Step 2 took 3057ms
real 0m44.862s
2009-04-29, 09:56  #4
Sep 2005
Berlin
2×3×11 Posts 
I have noticed that the time for both stage 1 and stage 2 increases considerably for numbers > 2^640 when a binary with mulredc code is used:
Code:
// a binary with mulredc enabled:
> echo '2^640-305' | ./ecm 3e6
GMP-ECM 6.2.3 [powered by GMP 4.3.0] [ECM]
Input number is 2^640-305 (193 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=2697909633
Step 1 took 15128ms
Step 2 took 5973ms

> echo '2^640+115' | ./ecm 3e6
GMP-ECM 6.2.3 [powered by GMP 4.3.0] [ECM]
Input number is 2^640+115 (193 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=2510852341
Step 1 took 17077ms
Step 2 took 6324ms
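With inputs this close to 2^640 it is worth checking how many machine words each one occupies. The sketch below is plain Python, not GMP-ECM code, and the 32-bit and 64-bit limb sizes are assumptions, since the post does not say which build was used. It only shows that the two inputs fall on opposite sides of a limb boundary; whether that is what triggers the slower code path is not confirmed here.

```python
def limbs(n, limb_bits):
    """Number of limb_bits-wide words needed to store n (ceiling division)."""
    return (n.bit_length() + limb_bits - 1) // limb_bits


below = 2**640 - 305   # first input from the post
above = 2**640 + 115   # second input from the post

for name, n in (("2^640-305", below), ("2^640+115", above)):
    print(
        f"{name}: {n.bit_length()} bits, "
        f"{limbs(n, 32)} limbs at 32 bits, "
        f"{limbs(n, 64)} limbs at 64 bits, "
        f"{len(str(n))} digits"
    )
```

Both numbers have 193 decimal digits, as in the output above, but the second one needs one more limb at either word size.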
2009-04-30, 18:49  #5
(loop (#_fork))
Feb 2006
Cambridge, England
2×3,191 Posts 
Small logging complaint
I'm currently running a P+1 job on a C7252 with B1=1e8, default B2, -maxmem 1024, and -v.
It's taken 2653 minutes so far, and the log consists of a few lines at the top:
Code:
Using lmax = 65536 with two pass NTT which takes about 875MB of memory
Using B1=100000000, B2=3951153670810, polynomial x^1, x0=823969581
P = 59053995, l = 65536, s_1 = 32076, k = s_2 = 640, m_1 = 3
Step 1 took 52295356ms
Computing F from factored S_1 took 94586ms
Computing h_x and h_y took 54619ms
Computing DCT-I of h_x took 4277ms
Computing DCT-I of h_y took 4316ms
Computing g_x took 102942ms
Computing g_x*h_x took 30038ms
Computing g_y took 102943ms
Computing g_y*h_y took 9740ms
Computing gcd of coefficients and N took 28050ms
followed by a few thousand lines that just repeat:
Code:
Computing g_x took 103178ms
Computing g_x*h_x took 30050ms
Computing g_y took 103055ms
Computing g_y*h_y took 9736ms
Computing gcd of coefficients and N took 28046ms
Last fiddled with by fivemack on 2009-04-30 at 18:49
2009-04-30, 18:59  #6
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
1011010111010_{2} Posts 
Just look at the k value:
on that one it is 640, which is the largest I have ever seen.
2009-04-30, 22:22  #7
"Nancy"
Aug 2002
Alexandria
2,467 Posts 
Adding a running counter to the output is trivial: look for the loop
for (l = 0; l < params->s_2; l++)
in pm1fs2.c (it occurs in 4 functions: P-1 and P+1, NTT or non-NTT. You're running P+1 NTT) and add some output per iteration. (Edit: outputf (OUTPUT_VERBOSE, "bla"); outputs only with the -v parameter.) I'll add something to the SVN code.
A k value that high cries out for more memory... with 4GB, k would be only 40.
Alex
Last fiddled with by akruppa on 2009-04-30 at 22:45 Reason: forgot to write in which file... stupid
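As an illustration of the kind of per-iteration output described in this post, here is a sketch in Python rather than a patch to pm1fs2.c. The names `run_passes` and `do_pass` are hypothetical, and the real loop body in GMP-ECM is not reproduced; the point is only that a counter inside the `l < s_2` loop is enough to show how far stage 2 has got.

```python
import time


def run_passes(s_2, do_pass, verbose=True):
    """Run s_2 stage-2 passes, reporting progress after each one.

    do_pass(l) is a placeholder for the real work of pass l (hypothetical).
    Returns the list of progress lines that were produced.
    """
    lines = []
    start = time.monotonic()
    for l in range(s_2):
        do_pass(l)
        elapsed = time.monotonic() - start
        line = f"Pass {l + 1} of {s_2} done, {elapsed:.1f}s elapsed"
        lines.append(line)
        if verbose:  # stands in for output that only appears with -v
            print(line)
    return lines


if __name__ == "__main__":
    # Dummy workload: each "pass" just sleeps briefly.
    run_passes(3, lambda l: time.sleep(0.01))
```

In the real code the equivalent would be a single extra output call inside each of the four loops mentioned above.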
2009-04-30, 22:51  #8
(loop (#_fork))
Feb 2006
Cambridge, England
2·3,191 Posts 
I realise that k=640 is generally a sign that you've got the parameter choice wrong; I was running a P+1 job with -maxmem 6144 on another CPU and have only 8G on this machine, and hadn't realised that -maxmem {N/M} took M^2 times as long ... this is why an ETA would be nice.
(I'm at the 441st of presumably 640 GCDs, so I'll leave it another 24 hours and it'll be done) 
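The arithmetic behind this post can be sketched in a few lines of Python. This is a back-of-envelope check, not GMP-ECM code: it assumes the M^2 rule stated above holds exactly, that a factor of 4 is the relevant step between the 875MB run and a 4GB run, and that every remaining pass costs the same as the repeated block in the log from post #5. The function names are my own.

```python
def scaled_k(k, factor):
    """Number of passes after scaling up by `factor`, under the M^2 rule."""
    return k / factor**2


# Timings (ms) of one repeated block from the log in post #5:
# g_x, g_x*h_x, g_y, g_y*h_y, gcd of coefficients and N.
pass_ms = 103178 + 30050 + 103055 + 9736 + 28046


def eta_hours(done, total, ms_per_pass):
    """Hours left if each remaining pass takes ms_per_pass milliseconds."""
    return (total - done) * ms_per_pass / 3_600_000


print(scaled_k(640, 4))
print(round(eta_hours(441, 640, pass_ms), 1), "hours left")
```

Under those assumptions a factor of 4 turns k=640 into k=40, which agrees with the figure Alex gave for 4GB, and the remaining 199 passes come to roughly 15 hours, inside the 24 hours allowed for.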
2009-04-30, 22:55  #9
Einyen
Dec 2003
Denmark
2^{2}×757 Posts 
Btw, are there any plans to raise the max B1 for P+1 above 2^32-1? Or is this a huge job?
Last fiddled with by ATH on 2009-04-30 at 23:06

2009-05-01, 06:23  #10
Nov 2008
2·3^{3}·43 Posts 
I notice Alex's avatar has changed back. Remind me, did it always have a red shirt?

2009-05-01, 09:43  #11
Oct 2004
Austria
2·17·73 Posts 
Benchmarks on a P4 (3.4 GHz)
ECM 6.2 with GMP 4.2.2, built with ./configure --with-gmp=/usr/local
Code:
GMP-ECM 6.2 [powered by GMP 4.2.2] [ECM]
Input number is 18816141541139222511309815439534127651955651212035224345737809063431028813620968356115158131612051597 (101 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=180303176
Step 1 took 28469ms
Step 2 took 12938ms
Run 2 out of 3:
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=580726752
Step 1 took 29375ms
Step 2 took 13656ms
Run 3 out of 3:
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=3060674203
Step 1 took 28969ms
Step 2 took 13765ms
Code:
GMP-ECM 6.2.3 [powered by GMP 4.3.0] [ECM]
Input number is 18816141541139222511309815439534127651955651212035224345737809063431028813620968356115158131612051597 (101 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=2628599422
Step 1 took 23078ms
Step 2 took 10938ms
Run 2 out of 3:
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=2541381111
Step 1 took 23453ms
Step 2 took 10922ms
Run 3 out of 3:
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=1407409785
Step 1 took 23641ms
Step 2 took 10718ms
P-1:
Code:
GMP-ECM 6.2 [powered by GMP 4.2.2] [P-1]
Input number is 18816141541139222511309815439534127651955651212035224345737809063431028813620968356115158131612051597 (101 digits)
Using B1=10000000, B2=117875629818, polynomial x^1, x0=1103425920
Step 1 took 8094ms
Step 2 took 9156ms
Code:
GMP-ECM 6.2.3 [powered by GMP 4.3.0] [P-1]
Input number is 18816141541139222511309815439534127651955651212035224345737809063431028813620968356115158131612051597 (101 digits)
Using B1=10000000, B2=117875629818, polynomial x^1, x0=3580459987
Step 1 took 7766ms
Step 2 took 9312ms
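To compare the two versions across all three ECM runs, the timings can be pulled out of the output with a regular expression and averaged. A minimal Python sketch; the helper name `mean_step_times` is my own, and only the "Step N took ...ms" lines from the ECM runs above are used.

```python
import re
from statistics import mean

# "Step ..." lines copied from the three ECM runs of each version above.
ecm_62 = """Step 1 took 28469ms
Step 2 took 12938ms
Step 1 took 29375ms
Step 2 took 13656ms
Step 1 took 28969ms
Step 2 took 13765ms"""

ecm_623 = """Step 1 took 23078ms
Step 2 took 10938ms
Step 1 took 23453ms
Step 2 took 10922ms
Step 1 took 23641ms
Step 2 took 10718ms"""

STEP_RE = re.compile(r"Step (\d) took (\d+)ms")


def mean_step_times(log):
    """Map step number -> mean time in ms over all runs found in log."""
    times = {}
    for step, ms in STEP_RE.findall(log):
        times.setdefault(int(step), []).append(int(ms))
    return {step: mean(values) for step, values in times.items()}


old, new = mean_step_times(ecm_62), mean_step_times(ecm_623)
for step in (1, 2):
    saving = 100 * (1 - new[step] / old[step])
    print(f"Step {step}: {old[step]:.0f}ms -> {new[step]:.0f}ms "
          f"({saving:.1f}% less time)")
```

Averaged this way, 6.2.3 with GMP 4.3.0 takes about 19% less time than 6.2 with GMP 4.2.2 in both steps on this P4.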