2016-11-28, 08:26  #1
Random Account
Aug 2009
Not U. + S.A.
2·1,163 Posts 
Prime95 vs. CUDALucas
This is in regard to the following assignment:
Quote:
Edit: I am using an Nvidia GTX 750 Ti and CUDA 8.
Last fiddled with by storm5510 on 2016-11-28 at 08:34 Reason: Additional Information

2016-11-28, 08:44  #2
(loop (#_fork))
Feb 2006
Cambridge, England
1100100110110_{2} Posts 
And what processor are you comparing it against?
Modern graphics cards do not have exceptionally good double-precision performance; a GTX 1080 peaks at 256 gigaflops, the same as a quad-core 4 GHz Haswell. The GTX 750 Ti peaks at about 40 gigaflops, so it is slower than a single core of a 4 GHz Haswell.
Last fiddled with by fivemack on 2016-11-28 at 08:45
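The CPU peak figures above can be checked with back-of-envelope arithmetic. This is a minimal sketch, assuming a Haswell core can issue two 256-bit FMA instructions per cycle (4 doubles per vector, 2 flops per FMA, so 16 double-precision flops per cycle). The function name `peak_dp_gflops` is made up for illustration.

```python
def peak_dp_gflops(cores, clock_ghz, flops_per_cycle=16):
    """Theoretical peak double-precision GFLOPS for a CPU.

    flops_per_cycle=16 is an assumption for Haswell:
    2 FMA units x 4 doubles per 256-bit vector x 2 flops per FMA.
    """
    return cores * clock_ghz * flops_per_cycle


quad_core = peak_dp_gflops(cores=4, clock_ghz=4.0)    # quad-core 4 GHz Haswell
single_core = peak_dp_gflops(cores=1, clock_ghz=4.0)  # one core of the same chip

print(quad_core, single_core)  # 256.0 64.0
```

Under that assumption a single core peaks at 64 gigaflops, which is why a card with about 40 gigaflops of double precision falls behind it. Real throughput on either device will be lower than these theoretical peaks.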
2016-11-28, 10:51  #3
Einyen
Dec 2003
Denmark
2^{3}×3^{2}×47 Posts 
The last Nvidia cards with "good" double-precision performance were the GTX 580/590, then the original Titan from 2013 and the Titan Black / Titan Z from 2014 in the 700 series.
By "good" I mean 1/3 of the card's single-precision performance. All consumer cards since then have DP performance of 1/24 or 1/32 of their SP performance. http://www.mersenne.ca/cudalucas.php
Maybe you should use the SP performance for factoring with mfaktc instead. Your 750 Ti has 1306 GFLOPS SP and 40.8 GFLOPS DP: https://en.wikipedia.org/wiki/GeForce_700_series
Last fiddled with by ATH on 2016-11-28 at 10:56
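The 750 Ti numbers quoted above follow directly from the 1/32 ratio. A small sketch of that arithmetic (the helper name `dp_from_sp` is hypothetical, and the figures are theoretical peaks rather than measured throughput):

```python
def dp_from_sp(sp_gflops, ratio):
    """Estimate peak DP GFLOPS from peak SP GFLOPS and the SP:DP ratio."""
    return sp_gflops / ratio


# GTX 750 Ti: 1306 GFLOPS single precision, DP runs at 1/32 of that.
gtx750ti_dp = dp_from_sp(1306, 32)

print(round(gtx750ti_dp, 1))  # 40.8
```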
2016-11-28, 16:54  #4
Random Account
Aug 2009
Not U. + S.A.
100100010110_{2} Posts 
i5-3570 @ 3.4 GHz.
Quote:


2016-11-28, 17:16  #5
Aug 2006
5,987 Posts 
Quote:
I think that ATH was suggesting that you use mfaktc instead of Culu, rather than switching modes of one or the other. 

2016-11-28, 17:28  #6
Einyen
Dec 2003
Denmark
6470_{8} Posts 
Yes, CUDALucas requires double precision, so it is slow: it runs at only 1/32 of your card's single-precision performance.
It would probably be more beneficial for GIMPS, and for the number of GHz-days accumulating on your account (if you care about that), to do factoring on the card with mfaktc (single precision) instead of LL tests with CUDALucas (double precision).
Last fiddled with by ATH on 2016-11-28 at 17:30
2016-11-28, 17:36  #7
Random Account
Aug 2009
Not U. + S.A.
2·1,163 Posts 
This is primarily what I have been doing. I wanted to see how CUDALucas would perform on this hardware. Obviously, not as well as on other hardware. Case closed.

2022-04-17, 16:21  #8
Mar 2022
3·23 Posts 
Quote:


2022-04-17, 17:49  #9
"Curtis"
Feb 2005
Riverside, CA
2^{5}·3^{2}·19 Posts 
Quote:
It's not impossible, just less efficient. 

2022-04-17, 21:08  #10
Mar 2022
3×23 Posts 

2022-04-18, 05:11  #11
"Curtis"
Feb 2005
Riverside, CA
2^{5}×3^{2}×19 Posts 
Per iteration, slower. That's what I mean by "less efficient". Otherwise it would have been implemented by now.
