mersenneforum.org ECM for CUDA GPUs in latest GMP-ECM ?

2012-02-11, 12:54   #23
pinhodecarlos

"Carlos Pinho"
Oct 2011
Milton Keynes, UK

3×1,621 Posts

Quote:
 Originally Posted by xilman I screwed up computing the time per curve. 1792 curves took 141 hours to run. I evaluated (1792 * 141 / 3600) to obtain the quoted figure of 70 seconds per curve. The correct expression is (141 * 3600 / 1792), which evaluates to 283 seconds per curve. Although this is four times worse than the initial figure, it is still 2.4 times faster than a single core. Sorry about that.
Paul, do you see CPU usage when running GPU-ECM?
BTW, I'm too lazy even to install Linux...lol. When I look back at the software I use on Windows, I just don't think I can use all of it on Linux.

Last fiddled with by pinhodecarlos on 2012-02-11 at 12:56
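
The corrected arithmetic in the quote can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, the inputs (1792 curves, 141 hours) are the figures from the quote, and the single-core time is merely what the stated 2.4x ratio implies, not a measurement.

```python
def hours_to_seconds_per_curve(total_hours: float, curves: int) -> float:
    """Wall-clock seconds spent per curve for a batch run."""
    return total_hours * 3600 / curves

# The mistaken expression from the quote: curves * hours / 3600.
wrong = 1792 * 141 / 3600
# The corrected expression: hours * 3600 / curves.
right = hours_to_seconds_per_curve(141, 1792)
# Implied by the "2.4 times faster than a single core" claim; not measured.
implied_single_core = 2.4 * right

print(f"mistaken: {wrong:.1f} s/curve")
print(f"correct:  {right:.1f} s/curve")
print(f"ratio:    {right / wrong:.2f}x")
print(f"implied single-core time: {implied_single_core:.0f} s/curve")
```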

2012-02-11, 16:39   #24
ATH
Einyen

Dec 2003
Denmark

2^2·757 Posts

Quote:
 Originally Posted by xilman If you have Linux you can build from the SVN sources as easily as I can. The process really is very straightforward and you'll end up with something which doesn't carry the risk of the Linux equivalent of DLL-hell.
Can you post your compile options please? Then maybe I can figure out how to compile this in msys/mingw for "windoze".

2012-02-11, 17:43   #25
xilman
Bamboozled!

"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

3·3,529 Posts

Quote:
 Originally Posted by ATH Can you post your compile options please? Then maybe I can figure out how to compile this in msys/mingw for "windoze".
I don't use compile options per se, just configure and make.

You will almost certainly find life much, much easier if you install VirtualBox or the like and then a Linux inside a virtual machine. Building GMP and GMP-ECM is then a complete doddle --- essentially a matter of saying "./configure; make ; make check ; make install" in the respective build directories. Once you've done that for each, you have everything you need --- working binaries which you can use as-is or use as a gold standard against which to check new builds, together with all the documentation, compile options, etc., which you can cut and paste into either the host environment or into other hosted machines.
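
For anyone who prefers to script those four steps, here is a minimal Python sketch. The directory names "gmp" and "gmp-ecm" are hypothetical placeholders for wherever the two source trees live, and the helper names are invented for illustration. It defaults to a dry run that only prints the steps.

```python
import subprocess
from pathlib import Path

# The four steps quoted in the post, in order.
STEPS = [["./configure"], ["make"], ["make", "check"], ["make", "install"]]

def build_plan(source_dirs):
    """Return (directory, command) pairs: every step for one tree, then the next."""
    return [(Path(d), cmd) for d in source_dirs for cmd in STEPS]

def run_plan(plan, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for directory, cmd in plan:
        print(f"[{directory}] {' '.join(cmd)}")
        if not dry_run:
            # check=True raises on the first failing step instead of carrying on.
            subprocess.run(cmd, cwd=directory, check=True)

# Hypothetical directory names; GMP goes first because GMP-ECM is built against it.
plan = build_plan(["gmp", "gmp-ecm"])
run_plan(plan)  # dry run: prints the eight steps without executing them
```

One deliberate difference from the quoted one-liner: chaining with ";" keeps going after a failed step, whereas this sketch stops at the first failure.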

2012-02-11, 19:01   #26
Brain

Dec 2009
Peine, Germany

331 Posts
Makefile for CC13

Here's the currently available trunk makefile for CC13.
Attached Files
 Makefile.zip (502 Bytes, 262 views)

2012-02-11, 19:06   #27
WraithX

Mar 2006

2^3·59 Posts

Quote:
 Originally Posted by xilman I don't use compile options per se, just configure and make. Building GMP and GMP-ECM is then a complete doddle --- essentially a matter of saying "./configure; make ; make check ; make install" in the respective build directories.
I was wondering, inside the gpu directory is a makefile and there are also two other directories (gpu_ecm and gpu_ecm_cc13) that both have makefiles. In which directory, or directories, do you run make? In which directory do you create the binary that you are referencing?

Also, inside the gpu directories, I see no configure file. So, it's not "configure and make" that you run, it's just "make", correct?

If I had an nVidia video card, I would try this myself. However, I do not, so I will leave it to others to try.

2012-02-11, 19:16   #28
xilman
Bamboozled!

"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

3×3,529 Posts

Quote:
 Originally Posted by WraithX I was wondering, inside the gpu directory is a makefile and there are also two other directories (gpu_ecm and gpu_ecm_cc13) that both have makefiles. In which directory, or directories, do you run make? In which directory do you create the binary that you are referencing? Also, inside the gpu directories, I see no configure file. So, it's not "configure and make" that you run, it's just "make", correct? If I had an nVidia video card, I would try this myself. However, I do not, so I will leave it to others to try.
My main machine is a Fermi so I didn't even bother with the cc13 version.

To answer your other question: you should read README.dev in the trunk directory. I'm not being wilfully obtuse. You really should read how to configure the development code environment.

Once everything is in place, you do indeed just run make.

2012-02-11, 19:24   #29
xilman
Bamboozled!

"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

24533_8 Posts

Quote:
 Originally Posted by xilman I'm not being wilfully obtuse. You really should read how to configure the development code environment.
In case it is not clear to bystanders, this code is not fire and forget. It is not production quality.

If you want to use it, you will need to get your hands dirty. I'm prepared to help as best I can after you've followed the instructions in the svn distro and after you've made a sincere effort to get things working by yourself. I am not prepared to bottle-feed, to wipe noses or to change {nappies,diapers}.

That may sound harsh but it's the way the world of alpha-code development works and you'll need to get used to it if you want to play with the big boys and girls. Once you pass the audition you'll find most developers are very friendly and helpful.

Neither am I addressing these remarks to any particular individuals who may, or may not, have posted in this thread.

Paul

2012-02-11, 22:34   #30
frmky

Jul 2003
So Cal

2^2·11·47 Posts

I'm getting results slower than the CPU. I'm using a c144 (from the 4788 aliquot sequence) on a Core i7 CPU and GTX 480 GPU:
Code:
~/ecmtest$ ~/bin/ecm 11e6 < c144
GMP-ECM 6.5-dev [configured with GMP 5.0.4, --enable-asm-redc] [ECM]
Input number is 216210261026078873728038575619824007502275880651339130269087415140753033343108746166779571643387473335848998664028620971224681169067812545897739 (144 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=1115846
Step 1 took 35420ms
Step 2 took 13620ms

~/bin/gpu_ecm -n 256 -save test 11000000 < c144
Precomputation of s took 0.950s
Input number is 216210261026078873728038575619824007502275880651339130269087415140753033343108746166779571643387473335848998664028620971224681169067812545897739 (144 digits)
Using B1=11000000, firstinvd=724674352, with 256 curves
gpu_ecm took : 13144.730s (0.000+13144.720+0.010)
Throughput : 0.019

~/bin/gpu_ecm -n 480 -save test 11000000 < c144
Precomputation of s took 0.950s
Input number is 216210261026078873728038575619824007502275880651339130269087415140753033343108746166779571643387473335848998664028620971224681169067812545897739 (144 digits)
Using B1=11000000, firstinvd=1789558835, with 480 curves
gpu_ecm took : 24198.970s (0.000+24198.960+0.010)
Throughput : 0.020

This GPU has 15 MPs, so gpu_ecm defaults to 480 curves, but that was only slightly faster than using 256 curves:

CPU: 35.4 s
GPU 256: 51.3 s
GPU 480: 50.4 s

Hmmm... Why do larger numbers take less time?
Code:
~/bin/gpu_ecm -n 480 11000 < c144
Precomputation of s took 0.000s
Input number is 216210261026078873728038575619824007502275880651339130269087415140753033343108746166779571643387473335848998664028620971224681169067812545897739 (144 digits)
Using B1=11000, firstinvd=1718283956, with 480 curves
gpu_ecm took : 24.260s (0.000+24.250+0.010)
Throughput : 19.786

~/bin/gpu_ecm -n 480 11000 < 10p332
Precomputation of s took 0.000s
Input number is 3082036244247618744713879350181267942494229636149227133619560368864804688115816966917438461372823837680425045410470575056718115654210704653050148781462686145415984611154261527877775921978501350266306075811598788040720480163782506686648165217270804627622798871662974986806951627082442232588805761 (295 digits)
Using B1=11000, firstinvd=412318627, with 480 curves
gpu_ecm took : 12.530s (0.000+12.520+0.010)
Throughput : 38.308

2012-02-12, 02:19   #31
jasonp
Tribal Bullet

Oct 2004

DCE_16 Posts

Maybe having more multiprocessors means that larger blocks of work have to be given in a kernel launch.
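
The per-curve figures in post #30 can be reproduced from the gpu_ecm totals. The sketch below assumes that the "Throughput" line means curves divided by elapsed seconds; that reading is inferred from the printed numbers, not taken from the gpu_ecm source. The function names are made up for illustration.

```python
def seconds_per_curve(curves, total_seconds):
    """Average wall-clock seconds per curve in a batch."""
    return total_seconds / curves

def curves_per_second(curves, total_seconds):
    """Matches the printed 'Throughput' figure under the assumption stated above."""
    return curves / total_seconds

# "Step 1 took 35420ms" for the c144 at B1=11e6 on the CPU.
CPU_STEP1_B1_11E6 = 35.420

# (label, B1, curves, total seconds printed by gpu_ecm), copied from post #30.
runs = [
    ("c144", 11_000_000, 256, 13144.730),
    ("c144", 11_000_000, 480, 24198.970),
    ("c144", 11_000, 480, 24.260),
    ("10p332", 11_000, 480, 12.530),
]

for label, b1, curves, seconds in runs:
    per_curve = seconds_per_curve(curves, seconds)
    line = (f"{label} B1={b1} n={curves}: {per_curve:.3f} s/curve, "
            f"{curves_per_second(curves, seconds):.3f} curves/s")
    if b1 == 11_000_000:
        # Only the B1=11e6 runs have a CPU Step 1 time to compare against.
        line += f", {per_curve / CPU_STEP1_B1_11E6:.2f}x the CPU Step 1 time"
    print(line)
```

At B1=11e6 both GPU runs come out slower per curve than CPU Step 1, which is the comparison frmky made.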

2012-02-12, 06:27   #32
frmky

Jul 2003
So Cal

814_16 Posts

With the larger number 10,332+, gpu_ecm is indeed about 4x faster:
Code:
~/bin/gpu_ecm -n 480 11000000 < 10p332
Precomputation of s took 0.950s
Input number is 3082036244247618744713879350181267942494229636149227133619560368864804688115816966917438461372823837680425045410470575056718115654210704653050148781462686145415984611154261527877775921978501350266306075811598788040720480163782506686648165217270804627622798871662974986806951627082442232588805761 (295 digits)
Using B1=11000000, firstinvd=197457519, with 480 curves
gpu_ecm took : 12460.500s (0.000+12460.490+0.010)
Throughput : 0.039
26 s/curve

~/bin/ecm 11e6 < 10p332
GMP-ECM 6.5-dev [configured with GMP 5.0.4, --enable-asm-redc] [ECM]
Input number is 3082036244247618744713879350181267942494229636149227133619560368864804688115816966917438461372823837680425045410470575056718115654210704653050148781462686145415984611154261527877775921978501350266306075811598788040720480163782506686648165217270804627622798871662974986806951627082442232588805761 (295 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=770548151
Step 1 took 116030ms
Step 2 took 31240ms

2012-02-12, 08:08   #33
debrouxl

Sep 2009

977 Posts

Indeed, it's much better with larger numbers, even on fairly low-end GPUs
Code:
$ echo 77777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777677777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777 | ./gpu_ecm -vv -n 64 -save 77677_149_3e6_1 3000000
#Compiled for a NVIDIA GPU with compute capability 1.3.
#Will use device 0 : GeForce GT 540M, compute capability 2.1, 2 MPs.
#s has 4328086 bits
Precomputation of s took 0.260s
Input number is 77777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777677777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777 (299 digits)
Using B1=3000000, firstinvd=1956725845, with 64 curves
8+64*d=15748722851276397705078124999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999979751642048358917236328125000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000037
8+64*d=15748816728591918945312499999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999979751521348953247070312500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000037
#Begin GPU computation...
Block: 32x16x1 Grid: 4x1x1
#Looking for factors for the curves with (d*2^32) mod N = 1956725845
xfin=30793582623383249085792654048330071529605422286616239783953502163671251485785103766950295774964496127287193014815633605233749525283700023789095348986097253179578725011454469815600504523632074726379316756899242430619968238212335497266401636095557828925843807562412336497359441279441288718847426104007
zfin=15653600481320091921866091998449420910733335116855114761017777079039414436628501359278897983474928453087970457407357858920433473350010208701033703353460574476309980980321812580804654671402344055561817931400864403015931562744329996813634620940647894525755619616828739385722340718168011611105085982277
xunif=39408042568104336805270379492712518016456213440668263700932960319496692204712994493267826611404606523547436560316003809780479305567060943184017323896559838396801417755478902976203909415762293776133099854903949460128487164294408353102629988943560190937509387321676681877121514979954245436789912376565
#Looking for factors for the curves with (d*2^32) mod N = 1956725908
xfin=14966698215750072697023404424489655322417510285848104176664523355430448144237921356160942849117869928416482104860850242503535702178551676734700459600950072236295600757108345379537820143078621679600800366849565012827321265584610218377563003322400088365819158936957519419145860156952262705564563161722
zfin=13196899771013716409148933418583531970092448862012559897727436095137691137124639002299074611035125565787268265934160640735274655897627921875382930307151913405846752684117468279045581956406343173251902392466987748520166146927437149020702523884077620927133537101563815962380052471821344666329942845796
xunif=16137963010874506957647933426009827242074704110758480638668956781797848968933741211221737736217202543887736175366874915988459776979538810976607154822667437291905995042783104430898367764816031215094913761969808431783659168083194336651034563970539910660975759146036916968022060602536777108941718805587
gpu_ecm took : 1420.292s (0.000+1420.288+0.004)
Throughput : 0.045
(~22 s/curve)

$ echo 77777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777677777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777 | ecm -c 1 3000000
GMP-ECM 6.5-dev [configured with GMP 5.0.90, --enable-asm-redc, --enable-assert] [ECM]
Input number is 77777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777677777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777 (299 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=2227022774
Step 1 took 58271ms
Step 2 took 17197ms
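
To put a number on "about 4x faster" and "much better", here is a rough sketch that divides the CPU Step 1 time by the GPU time per curve, following the comparison in post #30, which set GPU time against CPU Step 1 alone. The class and field names are invented for illustration. The two rows use different B1 values and different hardware, so they should not be compared with each other.

```python
from dataclasses import dataclass

@dataclass
class Comparison:
    label: str
    curves: int                # curves in the gpu_ecm batch
    gpu_seconds: float         # total time printed by gpu_ecm
    cpu_step1_seconds: float   # "Step 1 took ..." from the CPU run (one curve)

    def gpu_seconds_per_curve(self) -> float:
        return self.gpu_seconds / self.curves

    def speedup(self) -> float:
        """CPU Step 1 time divided by GPU time per curve."""
        return self.cpu_step1_seconds / self.gpu_seconds_per_curve()

# Figures copied from posts #32 and #33.
cases = [
    Comparison("GTX 480, 10p332 (295 digits), B1=11e6", 480, 12460.500, 116.030),
    Comparison("GT 540M, 299-digit input, B1=3e6", 64, 1420.292, 58.271),
]

for c in cases:
    print(f"{c.label}: {c.gpu_seconds_per_curve():.1f} s/curve on GPU, "
          f"{c.speedup():.2f}x vs CPU Step 1")
```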


All times are UTC.