#34
Sep 2009
11·89 Posts
Another data point, for numbers between C144 and C29x: C237 is slower on the GPU, but obviously faster on the CPU, than C29x:
Code:
    $ echo 472367364481324943429608990380363865230376899949857658144588096283146783114430372207621802600829155058766951167631153619328587819346877117165453306904995816614534365740792256712736351604580048562330248528078693598071309876495244264859329 | ./gpu_ecm -vv -n 64 -save 80009_248_3e6_1 3000000
    #Compiled for a NVIDIA GPU with compute capability 1.3.
    #Will use device 0 : GeForce GT 540M, compute capability 2.1, 2 MPs.
    #s has 4328086 bits
    Precomputation of s took 0.256s
    Input number is 472367364481324943429608990380363865230376899949857658144588096283146783114430372207621802600829155058766951167631153619328587819346877117165453306904995816614534365740792256712736351604580048562330248528078693598071309876495244264859329 (237 digits)
    Using B1=3000000, firstinvd=563947071, with 64 curves
    [snip]
    gpu_ecm took : 1637.614s (0.000+1637.610+0.004)
    Throughput : 0.039
    $ echo 472367364481324943429608990380363865230376899949857658144588096283146783114430372207621802600829155058766951167631153619328587819346877117165453306904995816614534365740792256712736351604580048562330248528078693598071309876495244264859329 | ./ecm -c 1 3000000
    bash: ./ecm: No such file or directory
    debrouxl@asus2:~/ecm/gpu/gpu_ecm$ echo 472367364481324943429608990380363865230376899949857658144588096283146783114430372207621802600829155058766951167631153619328587819346877117165453306904995816614534365740792256712736351604580048562330248528078693598071309876495244264859329 | ecm -c 1 3000000
    GMP-ECM 6.5-dev [configured with GMP 5.0.90, --enable-asm-redc, --enable-assert] [ECM]
    Input number is 472367364481324943429608990380363865230376899949857658144588096283146783114430372207621802600829155058766951167631153619328587819346877117165453306904995816614534365740792256712736351604580048562330248528078693598071309876495244264859329 (237 digits)
    Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=379651352
    Step 1 took 42974ms
    Step 2 took 12981ms
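For a rough per-curve comparison of the two runs above, the sketch below divides the GPU wall time by the number of curves and sets it against the CPU's Step 1 time. The figures are copied from the logs; the variable and function names are mine, and comparing against Step 1 only is an assumption (that the gpu_ecm run covers stage 1 alone), which the logs do not state outright.

```python
# Figures copied from the two logs above (C237, B1=3000000).
gpu_seconds = 1637.614      # "gpu_ecm took" for the whole batch
gpu_curves = 64             # -n 64
cpu_step1_seconds = 42.974  # "Step 1 took 42974ms", one curve


def seconds_per_curve(total_seconds, curves):
    """Average wall-clock cost of one curve in a batch."""
    return total_seconds / curves


gpu_per_curve = seconds_per_curve(gpu_seconds, gpu_curves)

print(f"GPU: {gpu_per_curve:.3f} s per curve")
print(f"CPU: {cpu_step1_seconds:.3f} s per curve (Step 1 only)")
print(f"CPU / GPU: {cpu_step1_seconds / gpu_per_curve:.2f}")
```

On these numbers the GT 540M only pays off when a whole batch of curves is wanted, since a single curve still costs the full batch time.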

#35
Sep 2010
Scandinavia
3×5×41 Posts

#36
Bamboozled!
"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across
3·13²·23 Posts
Since my first experiments, I've been playing with a version which uses 512-bit arithmetic (fudged with CFLAGS+=-DNB_DIGITS=16 in the relevant line of Makefile). As expected, ECM runs around 3 times faster on ~500 bit numbers with this change.
One of the things on my to-do list is to add greater flexibility to the choice of bignum sizes.
Experiments with both 1024 and 512-bit arithmetic indicate that running more than the default number of curves is a Good Thing, presumably by hiding memory latency. The downside, of course, is that the display stays rather sluggish for a proportionately long time. I'm trying to estimate how long a run will take and then kick it off overnight when display latency is likely to be unimportant.
Paul
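To see which arithmetic width a given input would need, a small sketch follows. It assumes each bignum digit is a 32-bit word, which is consistent with NB_DIGITS=16 giving 512 bits but is not stated above, and it ignores any headroom bits the kernels may reserve. The helper name `min_nb_digits` is hypothetical and not part of gpu_ecm.

```python
WORD_BITS = 32  # assumption: one bignum "digit" is a 32-bit word


def min_nb_digits(n, word_bits=WORD_BITS):
    """Smallest digit count whose total width holds n (no headroom)."""
    return -(-n.bit_length() // word_bits)  # ceiling division


# Stand-in for a 151-digit composite; any number of that size behaves alike.
candidate = 10**150
digits = min_nb_digits(candidate)
print(f"{candidate.bit_length()} bits -> NB_DIGITS={digits} "
      f"({digits * WORD_BITS}-bit arithmetic)")
```

Under that assumption a number of about 500 bits fits in 16 digits, while anything above 512 bits would need the next size up.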

#37
Jul 2003
So Cal
2³×5²×13 Posts
I added a percent complete counter in the for loop launching the kernels in cudautil.cu. I don't think adding an ETA would be difficult.
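An ETA could be derived from the same counter by linear extrapolation. The sketch below is in Python rather than the CUDA host code, and every name in it is hypothetical; it only illustrates the arithmetic, assuming each kernel launch takes about the same time.

```python
def eta_seconds(elapsed, done, total):
    """Linear estimate of the remaining time after `done` of `total` launches."""
    if done <= 0:
        return None  # nothing finished yet, so no estimate
    return elapsed * (total - done) / done


def progress_line(elapsed, done, total):
    """Format a percent-complete line with an ETA when one is available."""
    percent = 100.0 * done / total
    eta = eta_seconds(elapsed, done, total)
    tail = "ETA unknown" if eta is None else f"ETA {eta:.0f}s"
    return f"{percent:5.1f}% complete, {tail}"


# Example: 30 of 120 launches done after two minutes.
print(progress_line(elapsed=120.0, done=30, total=120))
```

If launches are queued asynchronously, the elapsed time would have to be measured at a synchronisation point for the estimate to mean anything.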

#38
"Bob Silverman"
Nov 2003
North of Boston
2²·1,877 Posts
However, allow me to point out that when I present a similar attitude toward the learning of the algorithms discussed herein and the mathematics behind them, I am lambasted for my efforts. Participants should be willing to put in the effort or they should leave.

#39
Bamboozled!
"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across
10110110001101₂ Posts
Much of the mathematics discussed here is not at the bleeding edge, IMO. It is closer in spirit to oft-times cranky but nonetheless well understood and supported applications such as mainstream gmp-ecm. IMO, your diatribes against those wishing to perform bleeding edge mathematics are fully justified. They are less appropriate, again IMO, further away from the bleeding edge. I hope I would never feel the urge to issue my earlier warnings to those who only wish to use gmp-ecm and are confused by its jargon and multitudinous options.

#40
"Bob Silverman"
Nov 2003
North of Boston
2²·1,877 Posts
Indeed. I have even heard one of the people (whom I hold in contempt) admit that he does not even know how to use a compiler.

not understand things even at that level. Nor do they seem willing to make the attempt. They don't even understand mathematics that was known 150+ years ago. Nor do they want to make the effort.

#41
Bamboozled!
"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across
3×13²×23 Posts
Out of the box (well, my box anyway) the default build appears to use parameters suitable for a CC1.3 system, despite there being a Fermi card installed. A run on a C302 with these parameters chose 112 curves arranged 32x16 x 7x1x1 and took 3845.428 seconds. Rebuilding with "make cc=2" and re-running took 5539.049 seconds for 224 curves arranged 32x32 x 7x1x1. The throughput ratio (224/112) * (3845.428 / 5539.049) is 1.388. I suggest a 39% speed-up is worth having.
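The ratio quoted above is just curves per second for one build divided by curves per second for the other. A minimal sketch, with the figures taken from the two runs and the function name chosen here for illustration:

```python
def curves_per_second(curves, seconds):
    """Throughput of one batch run."""
    return curves / seconds


cc13 = curves_per_second(112, 3845.428)  # default build (CC 1.3 parameters)
cc20 = curves_per_second(224, 5539.049)  # rebuilt with "make cc=2"

speedup = cc20 / cc13
print(f"speed-up: {speedup:.3f} ({(speedup - 1) * 100:.0f}% faster)")
```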

#42
Oct 2010
10111111₂ Posts
CC 2.0 card (GTX 470, stock clocks), 512 bit arithmetic, CUDA SDK 4.0. The c151 was taken from the Aliquot sequence 890460:i898
Code:
    ralf@quadriga:~/dev/gpu_ecm$ LD_LIBRARY_PATH=/usr/local/cuda/lib64/ ./gpu_ecm -d 0 -save c151.save 250000 < c151
    Precomputation of s took 0.004s
    Input number is 4355109846524047003246531292211765742521128216321735054909228664961069056051308281896789359834792526662067203883345116753066761522281210568477760081509 (151 digits)
    Using B1=250000, firstinvd=24351435, with 448 curves
    gpu_ecm took : 116.363s (0.000+116.355+0.008)
    Throughput : 3.850

Code:
    ralf@quadriga:~/dev/gpu_ecm$ LD_LIBRARY_PATH=/usr/local/cuda/lib64/ ./gpu_ecm -d 0 -n 896 -save c151.save 250000 < c151
    Precomputation of s took 0.004s
    Input number is 4355109846524047003246531292211765742521128216321735054909228664961069056051308281896789359834792526662067203883345116753066761522281210568477760081509 (151 digits)
    Using B1=250000, firstinvd=1471710578, with 896 curves
    gpu_ecm took : 179.747s (0.000+179.731+0.016)
    Throughput : 4.985

Code:
    ralf@quadriga:~/dev/gpu_ecm$ LD_LIBRARY_PATH=/usr/local/cuda/lib64/ ./gpu_ecm -d 0 -n 864 -save c151.save 250000 < c151
    Precomputation of s took 0.004s
    Input number is 4355109846524047003246531292211765742521128216321735054909228664961069056051308281896789359834792526662067203883345116753066761522281210568477760081509 (151 digits)
    Using B1=250000, firstinvd=1374804691, with 864 curves
    gpu_ecm took : 130.964s (0.000+130.948+0.016)
    Throughput : 6.597

Code:
    224 curves - Throughput : 2.289
    416 curves - Throughput : 4.223
    448 curves - Throughput : 4.547
    480 curves - Throughput : 3.039
    672 curves - Throughput : 4.233
    896 curves - Throughput : 4.638
    1792 curves - Throughput : 4.753

Last fiddled with by Ralf Recker on 2012-02-14 at 22:36 Reason: Caption, CC 2.1 results
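The Throughput figures in the three GTX 470 logs match curves divided by total seconds. The sketch below recomputes them from the relevant log lines so that runs with different -n values can be ranked. The regular expressions are mine and assume the line formats shown above; they are not part of gpu_ecm.

```python
import re

# The two relevant lines from each of the three GTX 470 runs above.
LOG = """\
Using B1=250000, firstinvd=24351435, with 448 curves
gpu_ecm took : 116.363s (0.000+116.355+0.008)
Using B1=250000, firstinvd=1471710578, with 896 curves
gpu_ecm took : 179.747s (0.000+179.731+0.016)
Using B1=250000, firstinvd=1374804691, with 864 curves
gpu_ecm took : 130.964s (0.000+130.948+0.016)
"""


def throughputs(log):
    """Pair each 'with N curves' line with the following 'gpu_ecm took' line."""
    curves = [int(c) for c in re.findall(r"with (\d+) curves", log)]
    times = [float(t) for t in re.findall(r"gpu_ecm took : ([\d.]+)s", log)]
    return {c: c / t for c, t in zip(curves, times)}


results = throughputs(LOG)
for curves, rate in sorted(results.items()):
    print(f"{curves:5d} curves - Throughput : {rate:.3f}")

best = max(results, key=results.get)
print(f"best of these runs: -n {best}")
```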

#43
Banned
"Luigi"
Aug 2002
Team Italia
3·1,619 Posts
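OK, I downloaded the source code and successfully compiled it with cc=1.3.
Sadly, I see differences between the Xilman and Ralf Recker outputs. The executable passes the test.

What does the (required) parameter N on the command line represent? All I can see is that it has to do with the xfin, zfin and xunif parameters, and should be odd...

I also tried ./gpu_ecm 9699691 11000 -n 1 <in where in contains the number 65798732165875434667. I got the factor 347, which is not a factor of the input number...

To show my good will:

Code:
    ./gpu_ecm 9699691 11000 -n 1 <in
    #Compiled for a NVIDIA GPU with compute capability 1.3.
    #Will use device 0 : GeForce GTX 275, compute capability 1.3, 30 MPs.
    #gpu_ecm launched with : N=9699691 B1=11000 curves=1 firstsigma=11
    #used seed 1329332970 to generate sigma
    #Begin GPU computation...
    #All kernels launched, waiting for results...
    #All kernels finished, analysing results...
    #Looking for factors for the curves with sigma=11
    xfin=3111202
    zfin=7720056
    #Factor found : 347 (with z)
    #Results : 1 factor found
    #Temps gpu : 15.080 init&copy=0.040 computation=15.040

Would you mind (now that my hands have been contaminated by bits and compilers) shedding some light on this obscure valley? Even a link explaining what N means in this context would suffice...

Many thanks...

Luigi

P.S. After some more fiddling, I noticed that 347 is a factor of 9699691, so I think I got the meaning of N after all...

With N3 and 448 curves, my GTX275 runs at the same speed as my Intel i5-750.

Last fiddled with by ET_ on 2012-02-15 at 19:51 Reason: Gee... I shouldn't mess with it when I'm back from work.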
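The observation in the P.S. is easy to confirm with plain integer arithmetic. The sketch below uses only the three numbers from the post; the variable names are mine.

```python
n = 9699691                        # first positional argument in the run above
file_number = 65798732165875434667 # contents of the file "in"
factor = 347                       # factor reported by gpu_ecm

# 347 divides the command-line N, not the number read from "in".
print("divides N:          ", n % factor == 0)
print("divides file number:", file_number % factor == 0)
print("cofactor of N:      ", n // factor)
```

This fits the reading that, in this build, the first positional argument is the number being factored.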
./gpu_ecm 9699691 11000 -n 1 <in #Compiled for a NVIDIA GPU with compute capability 1.3. #Will use device 0 : GeForce GTX 275, compute capability 1.3, 30 MPs. #gpu_ecm launched with : N=9699691 B1=11000 curves=1 firstsigma=11 #used seed 1329332970 to generate sigma #Begin GPU computation... #All kernels launched, waiting for results... #All kernels finished, analysing results... #Looking for factors for the curves with sigma=11 xfin=3111202 zfin=7720056 #Factor found : 347 (with z) #Results : 1 factor found #Temps gpu : 15.080 init©=0.040 computation=15.040 Would you mind (now that my hands have been contaminated by bits and compilers) shedding some light to this obscure valley? Even a link explaining what N means in this context would suffice... ![]() Many thanks... Luigi P.S. after some more fiddling, I noticed that 347 is a factor of 9699691, so I think I got the meaning of N after all... ![]() With N3 and 448 curves, my GTX275 has the same speed of my Intel I5-750. Last fiddled with by ET_ on 2012-02-15 at 19:51 Reason: Gee... I shouldn't mess with it when I'm back from work. |