mersenneforum.org  

2016-08-15, 21:42   #1
diep
 
Sep 2006
The Netherlands

2×3×113 Posts
Sieving k * 2^n +- c with Nvidia GPUs for fixed k

Status update.

It basically comes down to running the baby-step giant-step (BSGS) algorithm on the GPU to solve the discrete logarithm problem.
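To make the idea concrete, here is a minimal CPU-side sketch in Python of how BSGS can find an exponent n with p | k * 2^n - 1 for one prime p. It is an illustration only and not the GPU kernel from this post; the function name `bsgs_first_n` and its parameters are made up for the example, and it assumes p is an odd prime that does not divide k.

```python
from math import isqrt

def bsgs_first_n(k, p, n_limit):
    """Return the smallest n in [0, n_limit) with p | k*2^n - 1, or None.

    Hypothetical helper for illustration. Assumes p is an odd prime
    that does not divide k, and n_limit >= 1. Needs Python 3.8+ for
    modular inverses via pow().
    """
    # k*2^n = 1 (mod p)  <=>  2^n = k^(-1) (mod p)
    target = pow(k, -1, p)

    # Write n = i*m + j with 0 <= j < m and m*m >= n_limit.
    m = isqrt(n_limit - 1) + 1

    # Baby steps: remember the smallest j for each value of 2^j mod p.
    baby = {}
    value = 1
    for j in range(m):
        baby.setdefault(value, j)
        value = value * 2 % p

    # Giant steps: compare target * 2^(-m*i) against the baby table.
    step = pow(2, -m, p)
    gamma = target
    for i in range(m):
        j = baby.get(gamma)
        if j is not None:
            n = i * m + j
            return n if n < n_limit else None
        gamma = gamma * step % p
    return None

# 3 * 2^2 - 1 = 11, so p = 11 removes n = 2 for k = 3.
print(bsgs_first_n(3, 11, 100))
```

Once the smallest n is known, all further hits for that prime follow by adding multiples of the multiplicative order of 2 mod p. The table lookup is the memory-heavy part, which fits the remarks about cache usage further down.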

I have spent a bit more than the past month developing it on an Nvidia GTX 580.

I am currently testing it for k * 2^n - 1, but it should work with minor changes for similar formulas with a single k.

It keeps getting faster, though the gains come step by step. Initially it was slower than NewPGen.

Right now, for an n-range of 7 million, it is about 17x faster than NewPGen on a single CPU core here. For smaller n-ranges the advantage grows linearly: for an n-range of about 4 million it will be about 30x faster than NewPGen. This is on a GTX 580.
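As a quick sanity check of those two figures, the sketch below assumes that "linear" means the speedup over NewPGen is inversely proportional to the size of the n-range. That reading is an assumption, and `scaled_speedup` is a hypothetical helper, not something from the sieve itself.

```python
def scaled_speedup(ref_speedup, ref_range, new_range):
    """Scale a measured speedup to another n-range.

    Assumption (an interpretation, not stated in the post): the speedup
    is inversely proportional to the size of the n-range.
    """
    return ref_speedup * ref_range / new_range

# 17x at an n-range of 7 million, rescaled to an n-range of 4 million.
print(scaled_speedup(17, 7_000_000, 4_000_000))  # 29.75
```

Under that assumption the 17x figure at 7 million rescales to 29.75x at 4 million, which matches the quoted "about 30x".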

I am trying to speed it up further, mainly by cutting down on cache usage. I hope to post code within a few weeks, maybe sooner for a version that works a little.

I also tested a little on a remote GTX 980 in the States, but it will require a totally new kernel. Those pictures of the Kepler that they draw on web pages are marketing pictures. Right now it is considerably slower than on the GTX 580. A special kernel that uses 128 streamcores in a single kernel, instead of 32, should speed it up by nearly a factor of 4, though for now a factor of 2 would be nice...

Fermi, Maxwell and every GPU generation will require its own kernel.

Right now there is only the Fermi kernel. It does of course run on all those GPUs, but it does not yet benefit from the Maxwell architecture. That will come!

Fermi (4xx and 5xx series) has 32 streamcores in a single multiprocessor, the 6xx series has 192 streamcores in a single multiprocessor (a big problem), and Maxwell has 128 streamcores in a single multiprocessor. Fermi and Maxwell have a similar L1 data cache, whereas the 6xx series has a weird design of its own.

Using: primesieve and intrinsics from TheJudger.

To be continued. What's a good spot to upload working source code so that everyone can download it?

Regards,
Vincent Diepeveen
2016-08-29, 23:27   #2
diep
 
Sep 2006
The Netherlands

2×3×113 Posts

Looking for someone with a Kepler-series GPU to benchmark it.

The new kernel is regrettably not faster than the old kernel on my GTX 580, yet on a GTX 980 it is now a whopping nearly 3x faster. The old kernel was slower there.
2016-08-31, 21:03   #3
frmky
 
Jul 2003
So Cal

3^2·227 Posts

I can try it on a K20c...
2016-08-31, 22:06   #4
diep
 
Sep 2006
The Netherlands

2·3·113 Posts

Quote:
Originally Posted by frmky
I can try it on a K20c...
Cool, give me an email address; I haven't figured out yet how to attach files here. It's 'nearly' in production state now.

Or send me a mail at diep at s4all dot nl and I will send back the source.
Consider it to be under GPL3. Spread the word.
2016-09-23, 16:26   #5
Joe O
 
Aug 2002

1000001101_2 Posts

I can run it on a GTX 750 Ti if you send it to me.
2016-09-23, 19:19   #6
diep
 
Sep 2006
The Netherlands

2×3×113 Posts

Oops, my email address is: diep @ xs4all . nl

I forgot to write the x before. Apologies for that...

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.