mersenneforum.org  

Old 2016-11-06, 20:01   #2542
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

10011101000000₂ Posts

Quote:
Originally Posted by henryzz View Post
Doesn't the 460 have a much better single precision/double precision ratio than the 750ti?
The 460 is CC 2.1. I considered that possibility, but perhaps did not look closely enough. I just now did some searching for specific floating-point capability, but despite lengthy charts in various places, I did not get the differences sorted out.
Old 2016-11-06, 20:27   #2543
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT/BST)

3·5·383 Posts

Quote:
Originally Posted by kladner View Post
The 460 is CC 2.1. I considered that possibility, but perhaps did not look closely enough. I just now did some searching for specific floating-point capability, but despite lengthy charts in various places, I did not get the differences sorted out.
https://en.wikipedia.org/wiki/Fermi_(microarchitecture) states that consumer Fermi cards have 1/8 double precision speed.
https://en.wikipedia.org/wiki/Kepler_(microarchitecture) states that consumer Kepler cards have 1/24 double precision speed.
https://en.wikipedia.org/wiki/Maxwel...roarchitecture) states that consumer Maxwell cards have 1/32 double precision speed.
https://en.wikipedia.org/wiki/Pascal_(microarchitecture) basically states that consumer Pascal cards have 1/32 double precision speed.
The 460 is Fermi and 1/8 and the 750Ti is Maxwell and 1/32.

There are variations with some of the Titan cards and server cards, which allow ratios like 1/2 or 1/3 depending on the generation.
The TDP of the 460 is also around 2.5x that of the 750Ti. The 750Ti is only a 60 watt card, which is low for a GPU. It is almost as efficient as most of the 900 series GPUs (Maxwell gen 2).
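
To make the ratios above concrete, here is a minimal Python sketch. The ratios are the ones quoted from the Wikipedia pages in this post; the table name DP_RATIO and the helper functions are made up for illustration, and the single precision figure in the usage note is a placeholder, not a measured value for any card.

```python
from fractions import Fraction

# Double precision rate as a fraction of single precision rate,
# as quoted in the post above (consumer cards only).
DP_RATIO = {
    "Fermi": Fraction(1, 8),
    "Kepler": Fraction(1, 24),
    "Maxwell": Fraction(1, 32),
    "Pascal": Fraction(1, 32),
}


def dp_throughput(sp_throughput, architecture):
    """Scale a single precision throughput figure by the DP ratio."""
    return sp_throughput * DP_RATIO[architecture]


def dp_advantage(arch_a, arch_b):
    """How many times more DP work arch_a does than arch_b at equal SP speed."""
    return DP_RATIO[arch_a] / DP_RATIO[arch_b]


if __name__ == "__main__":
    # At equal single precision speed, a 1/8 card does 4x the DP work of a 1/32 card.
    print(dp_advantage("Fermi", "Maxwell"))
    # Hypothetical 1000 units of SP throughput on a consumer Maxwell card.
    print(float(dp_throughput(1000, "Maxwell")))
```

So at the same single precision speed the Fermi ratio gives four times the double precision throughput of the Maxwell ratio. Real cards differ in clock speed and core count, so this is only the ratio part of the picture.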
Old 2016-11-06, 20:29   #2544
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2⁶·157 Posts

Thanks! I looked at related sources, but obviously not carefully enough. Actually, I think I read the general Wiki on CUDA.

Last fiddled with by kladner on 2016-11-06 at 20:30
Old 2016-11-07, 05:22   #2545
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

23500₈ Posts

Quote:
Originally Posted by storm5510 View Post
Disregard the request above. I found it with a bit more searching.

I changed the screen output options so I could see what was going on. I reserved one doublecheck from PrimeNet. CUDALucas reports it can complete it in a little under six days.

Now for my quandary: Prime95 indicates it can do the test in the same amount of time. Six days for an LL test is nothing to sneeze at. CUDALucas did not seem to be utilizing my GPU as much as I thought it might. It's a GTX-750Ti. I could tell by observing the core temperature. mfaktc runs it in the upper 50s on the C scale. CUDALucas only made it into the low 40s.

Just in case anyone wonders about my setup, it all runs with CUDA 8.
Quote:
That card should do better.
My apologies. A more appropriate answer would have been, "That card is better suited to TF work."
Old 2016-11-07, 15:42   #2546
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT/BST)

3×5×383 Posts

Quote:
Originally Posted by kladner View Post
My apologies. A more appropriate answer would have been, "That card is better suited to TF work."
As are most modern cards.
Old 2016-11-07, 16:42   #2547
airsquirrels
 
"David"
Jul 2015
Ohio

205₁₆ Posts

Quote:
Originally Posted by henryzz View Post
As are most modern cards.
I see this pulled out over and over again about GPUs. "More suited to TF work"

I fail to see how this is backed up by much in terms of actual productivity (exponents cleared, not GHzDays; TF GHzDays are a joke) for many modern cards. I believe the math is: "For exponents currently factored < [card-specific crossover point], it is better to use your card to TF those exponents. For the giant piles of exponents needing double or first-time checks that are already factored beyond those crossover points, the card is just as productive doing LL work."

For most modern cards we are already at or nearly at those crossover points with lots of room to spare, meaning it is equally productive for a person with such a GPU to choose LL from the already factored pile or TF work from the far-future list of exponents needing more factoring. It is true that GPUs are far better than CPUs at TF; however, we only need enough current-generation GPUs doing TF to keep ahead of the CPU/other GPU LL demand pool.

What is especially interesting on the AMD side, and somewhat on the NVIDIA side, is that the older generations of cards did 200-300 GHzDays/day of TF, but are close to the same as modern cards at LL work due to 1/2 or 1/3 DP units (40-50 GHzDays/day). These cards are better suited to DC or LL work than TF compared to a modern card, as in they have a lower crossover point. However, even for modern cards we often have large reserves of exponents already factored to their crossover points.

My counter-argument to continuing to factor far ahead with current or aging GPUs is that it isn't particularly power efficient. By the time we actually need those exponents factored we will likely be 1-2 generations of GPUs newer, or current high-end GPUs will be more accessible and migrated down to be widely in use. To henryzz's point, the direction technology is going does seem to be widening the gap between TF and LL and thus raising the cross-over point. That means newer GPUs will likely be just slightly better at LL but perhaps 2x as good at TF. Better to use the GPUs we have now balanced between filling the reserves and checking the exponents and save far-future TF for the next generation of cards that will be even more efficient at it.
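
For anyone who wants to play with the crossover idea, here is a rough Python sketch. It rests on two assumptions that are not stated in this thread: the usual GIMPS rule of thumb that the chance of a factor between 2^(b-1) and 2^b is about 1/b, and TF time doubling with each extra bit level. The function names and the hour figures are made up for illustration (the 144 hours just echoes the "six days" mentioned earlier); this is not James's actual analysis.

```python
def tf_worthwhile(bit_level, tf_hours, ll_hours, tests_saved=2):
    """True if TF to this bit level is expected to save more time than it costs."""
    p_factor = 1.0 / bit_level  # assumed heuristic chance of a factor at this level
    return p_factor * tests_saved * ll_hours > tf_hours


def crossover_bit_level(start_bit, tf_hours_at_start, ll_hours,
                        tests_saved=2, max_bit=100):
    """Return the last bit level still worth TF'ing, or start_bit - 1 if none is."""
    bit = start_bit
    tf_hours = tf_hours_at_start
    last_worthwhile = start_bit - 1
    while bit <= max_bit and tf_worthwhile(bit, tf_hours, ll_hours, tests_saved):
        last_worthwhile = bit
        bit += 1
        tf_hours *= 2  # assumed: each extra bit level doubles the TF time
    return last_worthwhile


if __name__ == "__main__":
    # Hypothetical card: 0.6 h to TF bit level 70, 144 h per LL test.
    print(crossover_bit_level(70, 0.6, 144, tests_saved=2))
    # Same LL speed but twice as fast at TF: the crossover moves up.
    print(crossover_bit_level(70, 0.3, 144, tests_saved=2))
    # Only a double check left to save: the crossover moves down.
    print(crossover_bit_level(70, 0.6, 144, tests_saved=1))
```

With these made-up numbers, a card that is twice as fast at TF but no faster at LL gets a crossover one bit level higher, which is the widening gap described above. An exponent already factored past the returned level is better spent on LL or DC on that card.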
Old 2016-11-07, 17:17   #2548
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,323 Posts

Quote:
Originally Posted by airsquirrels View Post
Better to use the GPUs we have now balanced between filling the reserves and checking the exponents and save far-future TF for the next generation of cards that will be even more efficient at it.
+1!

For a while (read: several years) the GPU TF'ing effort was playing catch-up. Oliver and Bdot (and NVidia and AMD) completely changed the game with their programs and the respective hardware. Heck, we didn't even initially know how deep the GPU TF'ing should go until James stepped in with his analysis of the TF'ing vs. LL'ing and DC'ing cross-over points.

This might sound strange coming from the GPU72 guy, but just looking at the PrimeNet Exponent Status Distribution Map it is clear that the TF'ing is *well* ahead of the LL'ing, DC'ing and even the P-1'ing.

GPU TF'ing will always be needed, but this is a resource management and optimization problem. As GIMPS' stated goal is to find Mersenne Primes (not factors), perhaps it is time for more GPUs to be directed to LL'ing or DC'ing (or Carl's P-1 GPU program).

But, as always, this is a volunteer effort. Everyone is encouraged to do whatever kind of work rocks their boat. At the end of the day all the work will be needed and useful.
Old 2016-11-07, 18:28   #2549
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2⁶×157 Posts

Thanks to everyone for the discussion and explanations. The GTX 460 may not be very efficient, but it has proven to be rock steady at DC work. I am more leery of CuLu on the 580 because of the odd glitch which affects CC 2.0 cards. I know it's not supposed to spoil the result, but I had more than one proven bad mismatch on that card. That probably means that I did not go far enough in "detuning" it for stability.
Old 2016-11-07, 19:00   #2550
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

2143₈ Posts

Quote:
Originally Posted by kladner View Post
Thanks to everyone for the discussion and explanations. The GTX 460 may not be very efficient, but it has proven to be rock steady at DC work. I am more leery of CuLu on the 580 because of the odd glitch which affects CC 2.0 cards. I know it's not supposed to spoil the result, but I had more than one proven bad mismatch on that card. That probably means that I did not go far enough in "detuning" it for stability.
I have a couple of 580s sitting around that are very stable for all work. I can send them to anyone who would like them.
Old 2016-11-07, 19:32   #2551
Batalov
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

21706₈ Posts

I could use one...
Old 2016-11-07, 19:35   #2552
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

Quote:
Originally Posted by Batalov View Post
I could use one...
Message me and let's get one to you :)
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.