2009-04-07, 11:27  #155
"Sander"
Oct 2002
52.345322,5.52471
4A5_{16} Posts 
I've tried the newest 64-bit core2 version from Jeff's site and tested it against the above c85.
Code:
GMP-ECM 6.2.2 [powered by GMP 4.2.1_MPIR_1.0.0] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=3473972786
Step 1 took 8673ms
Step 2 took 7332ms
Code:
GMP-ECM 6.2.2 [powered by GMP 4.2.4] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=1798745233
Step 1 took 7872ms
Step 2 took 4660ms
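For anyone who wants to compare such logs without eyeballing them, here is a minimal Python sketch that pulls the "Step N took ...ms" figures out of two outputs and prints their ratio. The function name `step_times` and the variable names are made up for illustration; the two sample strings are just the timing lines from the runs above.

```python
import re

# Matches timing lines such as "Step 1 took 8673ms" in GMP-ECM output.
STEP_RE = re.compile(r"Step (\d) took (\d+)ms")

def step_times(log):
    """Return {step number: milliseconds} for the first curve found in a log."""
    times = {}
    for step, ms in STEP_RE.findall(log):
        # setdefault keeps the first occurrence if the log holds several curves
        times.setdefault(int(step), int(ms))
    return times

# Timing lines copied from the two runs in this post.
mpir_log = "Step 1 took 8673ms Step 2 took 7332ms"
gmp_log = "Step 1 took 7872ms Step 2 took 4660ms"

mpir = step_times(mpir_log)
gmp = step_times(gmp_log)
for step in (1, 2):
    ratio = mpir[step] / gmp[step]
    print(f"Step {step}: {mpir[step]} ms vs {gmp[step]} ms (ratio {ratio:.2f})")
```

Note that the two runs used different sigma values, so the ratio is only a rough indication.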
2009-04-07, 14:59  #156
Jun 2003
Ottawa, Canada
7×167 Posts 
Quote:
The Windows MSVC code uses a different set of assembler routines than the Linux code, so it doesn't surprise me that the timing is different. If you choose the same sigma for both your Windows and Linux tests, and choose a larger B2 value so the test runs a little longer, do you still see the huge difference? Try running each test twice to make sure the numbers are similar, in case your system was busy with something else during the test and artificially slowed down the benchmark for one.
Jeff.
Last fiddled with by Jeff Gilchrist on 2009-04-07 at 14:59
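A rough Python sketch of that procedure (fixed sigma, explicit B2, each test run twice) might look like the following. It assumes the binary is called `ecm` and is on the PATH, that it reads the number from standard input, and that it accepts `-sigma` followed by B1 and B2 as positional arguments; the helper names and the larger B2 value are placeholders, not values anyone in this thread used.

```python
import subprocess

def ecm_command(b1, b2, sigma, binary="ecm"):
    """Build the argument list for one curve with a fixed sigma.

    Assumes the usual GMP-ECM calling convention: ecm -sigma S B1 B2.
    """
    return [binary, "-sigma", str(sigma), str(b1), str(b2)]

def run_twice(n, b1, b2, sigma, binary="ecm"):
    """Run the same curve twice and return both outputs for comparison."""
    outputs = []
    for _ in range(2):
        result = subprocess.run(
            ecm_command(b1, b2, sigma, binary),
            input=f"{n}\n",       # the number to factor is read from stdin
            capture_output=True,
            text=True,
            check=False,          # ecm's exit status is not a plain success flag
        )
        outputs.append(result.stdout)
    return outputs

# Sigma from the first run in this thread; the B2 here is an arbitrary larger value.
cmd = ecm_command(b1=3000000, b2=57068902900, sigma=3473972786)
print(" ".join(cmd))
```

Running the same command on both systems keeps the curve identical, so any remaining difference comes from the build rather than from the choice of sigma.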

2009-04-07, 15:26  #157
"Sander"
Oct 2002
52.345322,5.52471
29×41 Posts 
I see that you used B1=300M; I used 3M.
I wasn't comparing directly with your run. I did two runs on my laptop (Core2duo T7800 @ 2.6GHz), one on the host (64-bit Vista) and one in a VM (64-bit Ubuntu 8.10).
2009-04-07, 15:32  #158
Sep 2005
Berlin
102_{8} Posts 
@smh:
Could you please post this binary and/or compare it with my 64-bit binary? I found that binaries optimised for Athlon64 are even faster on Core2 than Core2-optimised ones.
2009-04-07, 17:01  #159
Jun 2003
Ottawa, Canada
7·167 Posts 
Ah, that would explain the difference.
Quote:
As I said before, Brian Gladman had to translate the assembler from the syntax used by GCC to the one that YASM (used in the MSVC build) understands. I think he said that some of the code in the Linux source is still newer than what he has translated. Since I'm not familiar with the code, I'm not sure why there is such a big difference.
Jeff.

2009-04-07, 18:54  #160
"Sander"
Oct 2002
52.345322,5.52471
29·41 Posts 
Quote:
I did limited testing, but with larger composites yours might also be faster in step 1. Note that I used GMP-ECM 6.2.2 and GMP 4.2.4 (with the core2 patch), so it might be apples and oranges.
With B1=3M
Code:
GMP-ECM 6.2.1 [powered by GMP 4.2.3] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=959787799
Step 1 took 8008ms
Step 2 took 4496ms
Using B1=3000000, B2=3000000-5706890290, polynomial Dickson(6), sigma=1211299266
Step 1 took 7865ms
Step 2 took 4328ms
Using B1=3000000, B2=3000000-5706890290, polynomial Dickson(6), sigma=573230298
Step 1 took 7989ms
Step 2 took 4340ms

GMP-ECM 6.2.2 [powered by GMP 4.2.4] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=937001321
Step 1 took 7808ms
Step 2 took 4500ms
Using B1=3000000, B2=3000000-5706890290, polynomial Dickson(6), sigma=1410435444
Step 1 took 7773ms
Step 2 took 4500ms
Using B1=3000000, B2=3000000-5706890290, polynomial Dickson(6), sigma=3426145601
Step 1 took 7921ms
Step 2 took 4500ms
Code:
GMP-ECM 6.2.1 [powered by GMP 4.2.3] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=1064336844
Step 1 took 29329ms
Step 2 took 14061ms
Using B1=11000000, B2=11000000-35133391030, polynomial Dickson(12), sigma=3355605506
Step 1 took 28858ms
Step 2 took 14157ms
Using B1=11000000, B2=11000000-35133391030, polynomial Dickson(12), sigma=191990272
Step 1 took 29342ms
Step 2 took 14181ms

GMP-ECM 6.2.2 [powered by GMP 4.2.4] [ECM]
Input number is 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561 (85 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=1387859769
Step 1 took 28389ms
Step 2 took 14777ms
Using B1=11000000, B2=11000000-35133391030, polynomial Dickson(12), sigma=4281716356
Step 1 took 27850ms
Step 2 took 14685ms
Using B1=11000000, B2=11000000-35133391030, polynomial Dickson(12), sigma=3779197836
Step 1 took 27638ms
Step 2 took 14681ms

2009-04-10, 13:51  #161
Jun 2003
Ottawa, Canada
7×167 Posts 
I took ECM 6.2.2 and compiled it with MPIR 1.0 in cygwin to compare the Linux code to what the Windows MSVC code is doing. I saw a similar pattern to all of you as well. This is all 32-bit code run on an Intel Core2 Q9550 @ 3.4GHz.

ECM Factoring: 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561
B1=20000000 Sigma: 980060817
MSVC 6.2.2 with new SSE2:
Step 1 took 82837ms  Step 1 took 82790ms
Step 2 took 41137ms  Step 2 took 41402ms
MSVC 6.2.2 without SSE2:
Step 1 took 82867ms  Step 1 took 83071ms
Step 2 took 42557ms  Step 2 took 43337ms
GCC cygwin (--enable-sse2 --enable-asm-redc) builds as pentium3
Step 1 took 78359ms  Step 1 took 78531ms
Step 2 took 34695ms  Step 2 took 34086ms
GCC cygwin (--enable-sse2 --enable-asm-redc --build=pentium4-pc-cygwin)
Step 1 took 78375ms  Step 1 took 78718ms
Step 2 took 24445ms  Step 2 took 24367ms

P-1 Factoring: 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561
B1=20000000 x0: 524328229
MSVC 6.2.2 with new SSE2:
Step 1 took 9469ms  Step 1 took 9563ms
Step 2 took 7098ms  Step 2 took 7051ms
MSVC 6.2.2 without SSE2:
Step 1 took 9360ms  Step 1 took 9235ms
Step 2 took 11731ms  Step 2 took 11404ms
GCC cygwin (--enable-sse2 --enable-asm-redc) builds as pentium3
Step 1 took 8751ms  Step 1 took 8487ms
Step 2 took 5788ms  Step 2 took 5740ms
GCC cygwin (--enable-sse2 --enable-asm-redc --build=pentium4-pc-cygwin)
Step 1 took 8455ms  Step 1 took 8658ms
Step 2 took 5788ms  Step 2 took 5710ms

P+1 Factoring: 1877138824359859508015524119652506869600959721781289179190693027302028679377371001561
B1=20000000 x0: 524328229
MSVC 6.2.2 with new SSE2:
Step 1 took 17082ms  Step 1 took 17145ms
Step 2 took 8596ms  Step 2 took 8408ms
MSVC 6.2.2 without SSE2:
Step 1 took 17675ms  Step 1 took 17566ms
Step 2 took 15585ms  Step 2 took 15553ms
GCC cygwin (--enable-sse2 --enable-asm-redc) builds as pentium3
Step 1 took 14570ms  Step 1 took 14617ms
Step 2 took 7566ms  Step 2 took 7816ms
GCC cygwin (--enable-sse2 --enable-asm-redc --build=pentium4-pc-cygwin)
Step 1 took 14929ms  Step 1 took 14602ms
Step 2 took 7706ms  Step 2 took 7862ms

You can see that the new MSVC build that uses SSE2 is much faster in Stage 2 than the old build, but the Linux code built with gcc (in cygwin on Windows or whatever) is faster in both Stage 1 and Stage 2. So if you want the fastest possible ECM/P-1/P+1, you could install cygwin/mingw or run Linux (natively or in a VM).
Jeff.
Last fiddled with by Jeff Gilchrist on 2009-04-10 at 14:49
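To put numbers on that comparison, here is a small Python sketch that averages the two runs of each ECM test and expresses every build relative to the MSVC SSE2 build. The timings are copied from the ECM results in this post; the dictionary layout, the shortened build labels and the function name `relative_to` are only illustrative.

```python
from statistics import mean

# ECM timings in ms as (run 1, run 2), copied from the results in this post.
ecm_ms = {
    "MSVC 6.2.2 with new SSE2": {"step1": (82837, 82790), "step2": (41137, 41402)},
    "MSVC 6.2.2 without SSE2": {"step1": (82867, 83071), "step2": (42557, 43337)},
    "GCC cygwin (pentium3)": {"step1": (78359, 78531), "step2": (34695, 34086)},
    "GCC cygwin (pentium4)": {"step1": (78375, 78718), "step2": (24445, 24367)},
}

def relative_to(baseline, timings):
    """Return mean time of each build divided by the baseline's mean, per step."""
    base = {step: mean(runs) for step, runs in timings[baseline].items()}
    return {
        build: {step: mean(runs) / base[step] for step, runs in steps.items()}
        for build, steps in timings.items()
    }

ratios = relative_to("MSVC 6.2.2 with new SSE2", ecm_ms)
for build, steps in ratios.items():
    # Values below 1.00 mean the build is faster than the baseline.
    print(f"{build}: step 1 x{steps['step1']:.2f}, step 2 x{steps['step2']:.2f}")
```

The same function works for the P-1 and P+1 figures if they are entered in the same layout.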
2009-04-10, 14:38  #162
"Nancy"
Aug 2002
Alexandria
100110100011_{2} Posts 
Quote:
Then, with build type pentium4, the mulredc asm code from pentium4/ should be used instead of the code from athlon/, so on an actual Pentium 4 at least, the stage 1 time should differ. On what CPU type did you run these tests?
Alex

2009-04-10, 14:50  #163
Jun 2003
Ottawa, Canada
2221_{8} Posts 
Quote:
Both config.h files contain
#define HAVE_SSE2 1
Both linked the mulredc files from pentium4/
Jeff.
Last fiddled with by Jeff Gilchrist on 2009-04-10 at 15:15
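If you want to repeat that check on your own builds, a minimal Python sketch for listing the macros defined in a config.h could look like this. The function name `defined_macros` and the sample text are hypothetical; a real config.h is generated by configure and contains far more entries.

```python
import re

# Captures the name and value of lines like "#define HAVE_SSE2 1".
DEFINE_RE = re.compile(r"^\s*#\s*define\s+(\w+)[ \t]+(.+?)\s*$", re.MULTILINE)

def defined_macros(config_text):
    """Return {macro name: value} for every '#define NAME VALUE' line."""
    return dict(DEFINE_RE.findall(config_text))

# Hypothetical excerpt for illustration; SOME_OTHER_FEATURE is a made-up name.
sample = """\
/* Hypothetical excerpt; a real config.h is generated by ./configure. */
#define HAVE_SSE2 1
/* #undef SOME_OTHER_FEATURE */
"""

macros = defined_macros(sample)
print("HAVE_SSE2 =", macros.get("HAVE_SSE2"))
```

Reading both config.h files this way and comparing the two dictionaries would show any other configure differences between the builds.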

2009-04-10, 15:31  #164
"Mark"
Apr 2003
Between here and the
1100001100101_{2} Posts 
Are you referring to GMP or GMP-ECM thinking it is a P3? My understanding (from the GMP folks) is that the Core 2 is built on the P3 architecture, not the P4 architecture, so the P3 optimizations work better than the P4 optimizations. That doesn't explain the difference in your ECM run, though.

2009-04-10, 16:11  #165
Jun 2003
Ottawa, Canada
7·167 Posts 
Quote:
Jeff. 
