
2019-09-04, 19:32   #12
Corbeau

Jul 2019

2×3 Posts

Quote:
 Originally Posted by chalsall Indeed. It would be interesting to see what the GHzD/D is for a T4. I've only been playing with this for about a day now, but I've yet to get a T4 (unless explicitly asked for).
I would imagine that it would be at least a bit faster, based on the specs versus a K80. Unfortunately, there are no benchmarks on Mersenne.ca, but according to TechPowerUp, the FP32 performance of the T4 is equal to that of both K80 processors combined. The FP64 performance of the T4 is much worse, though. I don't know enough about the math of trial factoring to know whether it uses single precision or double precision.

2019-09-04, 20:00   #13
hansl

Apr 2019

11001101₂ Posts

Quote:
 Originally Posted by Corbeau I would imagine that it would be at least a bit faster, based on the specs versus a K80. Unfortunately, there are no benchmarks on Mersenne.ca, but according to TechPowerUp, the FP32 performance of the T4 is equal to that of both K80 processors combined. The FP64 performance of the T4 is much worse, though. I don't know enough about the math of trial factoring to know whether it uses single precision or double precision.
Pretty sure it's all 32-bit, since the Mersenne exponents top out at 2^32.

Also, I'm not super familiar with these Tesla GPUs, but it's interesting that the K80 is 2-in-1. Does this mean it shows up as two devices? Maybe it's actually more efficient to run two simultaneous mfaktc instances if possible, by specifying the device #?
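
For anyone wanting to try the two-instance idea on a machine that really does expose both K80 GPUs, here is a rough Python sketch. It assumes mfaktc's `-d <device number>` option behaves as described in the program's help text. The executable path and the per-GPU working directories (`gpu0`, `gpu1`, each expected to hold its own mfaktc.ini and worktodo.txt) are hypothetical names chosen for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(exe, device_ids, base_dir):
    """Return (argv, working_dir) pairs, one per GPU device number.

    Assumption: mfaktc selects a GPU with "-d <device number>".
    The gpu<N> directory layout is made up for this example.
    """
    jobs = []
    for dev in device_ids:
        workdir = Path(base_dir) / f"gpu{dev}"
        jobs.append(([str(exe), "-d", str(dev)], workdir))
    return jobs


def launch(jobs):
    """Start every instance without waiting for it to finish."""
    return [subprocess.Popen(argv, cwd=cwd) for argv, cwd in jobs]
```

Calling `launch(build_commands("./mfaktc.exe", [0, 1], "/content/run"))` would start one process per device. Keeping a separate directory per instance avoids two processes sharing one worktodo.txt.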

If it's giving ~400 GHz-d/day, then I'm surprised how poorly it performs compared to a GTX 1660 (I get 1500-1600 on mine), despite the K80's core count being much higher and its FP32 throughput being only slightly lower per "processor": 4 TFLOPS on the K80 vs 5 TFLOPS on the 1660, so with both processors (x2) it should easily beat that.

With the K80 having decent FP64 performance (1:3 of FP32, vs 1:32 on a typical GPU), I wonder if it would be better suited for CUDAPm1 or CUDALucas work rather than TF. Those use FP64 heavily, as I understand it, which is why the Radeon VII wipes the floor with most Nvidia GPUs for those tasks, but not for TF.
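
To put numbers on that comparison, here is a small Python calculation using only the approximate figures quoted in this post. The 1550 value is simply the midpoint of the 1500-1600 range, and the GTX 1660 is assumed to have the "typical" 1:32 FP64 ratio mentioned above. None of these are measured benchmarks.

```python
# Approximate per-GPU figures as quoted in this thread (not benchmarks).
cards = {
    "K80 (one GPU)": {"fp32_tflops": 4.0, "ghzd_per_day": 400.0, "fp64_ratio": 1 / 3},
    # Assumption: the 1660 uses the "typical" 1:32 FP64 ratio.
    "GTX 1660": {"fp32_tflops": 5.0, "ghzd_per_day": 1550.0, "fp64_ratio": 1 / 32},
}


def tf_per_tflop(card):
    """Trial-factoring throughput per FP32 TFLOP."""
    return card["ghzd_per_day"] / card["fp32_tflops"]


def fp64_tflops(card):
    """Estimated FP64 throughput derived from the FP32 figure."""
    return card["fp32_tflops"] * card["fp64_ratio"]


for name, card in cards.items():
    print(f"{name}: {tf_per_tflop(card):.0f} GHz-d/day per FP32 TFLOP, "
          f"~{fp64_tflops(card):.2f} FP64 TFLOPS")
```

On these figures the 1660 delivers roughly three times the TF throughput per FP32 TFLOP, while the K80 has several times the estimated FP64 throughput. That is the reasoning behind suggesting FP64-heavy work for the K80.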

2019-09-04, 20:22   #14
chalsall
If I May

"Chris Halsall"
Sep 2002

2·5,021 Posts

Quote:
 Originally Posted by hansl If it's giving ~400 GHz-d/day, then I'm surprised how poorly it performs compared to a GTX 1660 (I get 1500-1600 on mine), despite the K80's core count being much higher and its FP32 throughput being only slightly lower per "processor": 4 TFLOPS on the K80 vs 5 TFLOPS on the 1660, so with both processors (x2) it should easily beat that.
The K80s are serious compute (for example, they include ECC memory), but don't compare to more modern gaming kit. Also, each VM gets only one of the two K80 GPUs. This is the same over on Azure, GCE, and ECS.

Quote:
 Originally Posted by hansl With the K80 having decent FP64 performance (1:3 of FP32, vs 1:32 on a typical GPU), I wonder if it would be better suited for CUDAPm1 or CUDALucas work rather than TF.
Definitely should be analyzed. I think this is early days; different Notebooks can be created and shared to focus on a particular compute space, and each individual user gets to choose which they run in their (singular) VM.

2019-09-05, 04:11   #15
chalsall
If I May

"Chris Halsall"
Sep 2002

273A₁₆ Posts

Quote:
 Originally Posted by hansl I just posted a build in the main mfaktc thread, but haven't had a chance to try it on Colab. Let me know if this helps.
So, I came down for my just-before-bed check, and the CoLab VM had expired (as expected). Relaunched, and I got a T4.

Uploaded your executable, and got a warning about not being able to find the CUDA 10.1 library. Apparently, CoLab is 10.0.

Uploaded your entire archive, unpacked, and "make clean" and then "make" in the src/ directory. Moved the resulting executable into place.

The T4 started at 2,080 GHzD/D, and then quickly throttled down to ~1,760 GHzD/D.

Wow!

2019-09-05, 05:17   #16
hansl

Apr 2019

CD₁₆ Posts

Quote:
 Originally Posted by chalsall Uploaded your executable, and got a warning about not being able to find the CUDA 10.1 library. Apparently, CoLab is 10.0.
Weird. It seems there may be a mix of CUDA versions across their machines, then: the log I posted on the previous page showed "CUDA driver version 10.10", so I figured they were all the same as that.

Quote:
 Originally Posted by chalsall Uploaded your entire archive, unpacked, and "make clean" and then "make" in the src/ directory. Moved the resulting executable into place. The T4 started at 2,080 GHzD/D, and then quickly throttled down to ~1,760 GHzD/D. Wow!
Awesome! That's great that it built without much trouble (I assume you had to edit the Makefile slightly though, since I used my local path for CUDA 10.1 in there).

And very nice throughput on that T4!

2019-09-05, 06:01   #17
xx005fs

"Eric"
Jan 2018
USA

2²×53 Posts

Any Cuda 10.0 builds with 7.5 compute capabilities in the wild? Just in case I can use their T4s in the system. I sadly don't have a Linux system on hand to build them and I want to use this feature as well.

2019-09-05, 12:46   #18
De Wandelaar

"Yves"
Jul 2017
Belgium

2×3×13 Posts

Hello,

I'm setting up a notebook with the mfaktc built for CUDA 10.1 provided by hansl (see https://www.mersenneforum.org/showthread.php?t=12827, message #3189). Still a problem when running it:
Code:
CUDA version info
binary compiled for CUDA  10.10
CUDA runtime version      0.75
CUDA driver version       10.10
ERROR: CUDA runtime version must match the CUDA toolkit version used during compile!

What should I do? Should I use another version of mfaktc?

Many thanks for your help,
Yves

2019-09-05, 13:30   #19
chalsall
If I May

"Chris Halsall"
Sep 2002

2·5,021 Posts

Quote:
 Originally Posted by De Wandelaar What should I do? Should I use another version of mfaktc?

It's just the executable produced from hansl's archive with a slightly tweaked Makefile, and built on CoLab. This also means the "LD_LIBRARY_PATH" stuff isn't needed, and it works for both K80s and T4.

Unpack it, upload it, rename it and you should be good to go.
Attached Files
 mfaktc_colab.tgz (622.1 KB, 273 views)

2019-09-05, 13:38   #20
chalsall
If I May

"Chris Halsall"
Sep 2002

273A₁₆ Posts

Quote:
 Originally Posted by hansl I assume you had to edit the Makefile slightly though, since I used my local path for cuda 10.1 in there.
Yeah. I just changed the path to be "/usr/local/cuda", and changed "CC = gcc-8" to be "gcc". Beyond that it built with no problems; it appears a full CUDA Dev environment is present on the VM!
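
The two Makefile tweaks described above could be scripted along these lines. This is only a sketch: the post confirms the `CC = gcc-8` line and the `/usr/local/cuda` path, but the variable name `CUDA_DIR` is an assumption about how the archive's Makefile spells the CUDA path, so check the actual file before relying on it.

```python
import re


def patch_makefile(text, cuda_dir="/usr/local/cuda", compiler="gcc"):
    """Point the CUDA path and the C compiler at the Colab defaults.

    Assumption: the CUDA path is set on a line starting with "CUDA_DIR =".
    Only lines starting in column 0 are rewritten; everything else is kept.
    """
    patched = []
    for line in text.splitlines():
        if re.match(r"CUDA_DIR\s*=", line):
            line = f"CUDA_DIR = {cuda_dir}"
        elif re.match(r"CC\s*=", line):
            line = f"CC = {compiler}"
        patched.append(line)
    return "\n".join(patched) + "\n"
```

In a notebook, one would read src/Makefile, pass its contents through `patch_makefile`, write the result back, and then run "make clean" and "make" as before.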

Quote:
 Originally Posted by hansl And very nice throughput on that T4!
Indeed! Now I'm going to whine when I only get a K80!

2019-09-05, 14:41   #21
De Wandelaar

"Yves"
Jul 2017
Belgium

2×3×13 Posts

@Chalsall: Thanks for your answer!

The same error occurs after copying the new version of the program. Here is my script:
Code:
import os.path
from google.colab import drive
if not os.path.exists('/content/drive/My Drive'):
  drive.mount('/content/drive')
%cd '/content/drive/My Drive/mfaktc-0.21/'
!chmod 755 '/content/drive/My Drive/mfaktc-0.21/mfaktc.exe'
!cd '.' && /content/drive/My\ Drive/mfaktc-0.21/mfaktc.exe
!cat 'results.txt'

I'll keep on trying ...

2019-09-05, 14:53   #22
chalsall
If I May

"Chris Halsall"
Sep 2002

273A₁₆ Posts

Quote:
 Originally Posted by De Wandelaar I'll keep on trying ...
Are you sure your Runtime is set for GPU hardware acceleration?
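
One way to check this from inside the notebook is to ask `nvidia-smi` which GPU (if any) the VM can see. The sketch below assumes `nvidia-smi` is on the PATH whenever a GPU runtime is attached. The function name `gpu_names` is made up for this example.

```python
import shutil
import subprocess


def gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none is visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA tooling found, so probably not a GPU runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

An empty list suggests the Runtime type is not set to GPU. Otherwise the list shows whether the session landed on a K80 or a T4.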

Also, I would suggest you try moving the executable over to /usr/local/bin/ and then do the chmod stuff. cd into a directory which has the mfaktc.ini and worktodo.txt files and launch.
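
Those steps might look roughly like this in Python. It is a sketch under assumptions: the helper names are invented, and the installed name `mfaktc.exe` is taken from the script earlier in the thread. Launching is left to the notebook itself.

```python
import os
import shutil
from pathlib import Path

# Files the post says must be in the directory mfaktc is launched from.
REQUIRED = ("mfaktc.ini", "worktodo.txt")


def install_executable(src, bin_dir="/usr/local/bin", name="mfaktc.exe"):
    """Copy the executable into bin_dir and make it executable (chmod 755)."""
    dest = Path(bin_dir) / name
    shutil.copy(src, dest)
    os.chmod(dest, 0o755)
    return dest


def missing_inputs(work_dir):
    """List the required files that are absent from work_dir."""
    return [f for f in REQUIRED if not (Path(work_dir) / f).is_file()]
```

After `install_executable(...)`, cd into the working directory, confirm that `missing_inputs(".")` returns an empty list, and then launch mfaktc from there.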

My notebook above should work fine for this.

P.S. Also, under the Tools menu -> Preferences, Miscellaneous, set the Power level to be Kitty Mode. Googlers have a great sense of humor!

Edit: Oh, also, are you sure you're trying to run the executable I provided? It appears you might be still trying to use hansl's. This is the output I get using my executable:
Code:
CUDA version info
binary compiled for CUDA  10.0
CUDA runtime version      10.0
CUDA driver version       10.10
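
To tell quickly which executable is actually running, the version block can be parsed and the "compiled for" and "runtime" fields compared. The Python sketch below only mirrors the rule stated in the error message earlier in the thread (the runtime version must match the toolkit version used at compile time). The field labels are copied from the outputs posted here, and the function names are hypothetical.

```python
import re

# Labels as they appear in the "CUDA version info" block posted in this thread.
FIELDS = {
    "compiled": r"binary compiled for CUDA\s+([\d.]+)",
    "runtime": r"CUDA runtime version\s+([\d.]+)",
    "driver": r"CUDA driver version\s+([\d.]+)",
}


def parse_cuda_versions(text):
    """Extract the compiled/runtime/driver version strings from the output."""
    found = {}
    for key, pattern in FIELDS.items():
        match = re.search(pattern, text)
        if match:
            found[key] = match.group(1)
    return found


def runtime_matches(versions):
    """True when the compiled-for and runtime versions are both present and equal."""
    compiled = versions.get("compiled")
    return compiled is not None and compiled == versions.get("runtime")
```

Fed the output above (10.0 / 10.0 / 10.10), `runtime_matches` returns True. Fed the 10.10 / 0.75 / 10.10 output from post #18, it returns False, which points to the CUDA 10.1 build still being the one in use.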

Last fiddled with by chalsall on 2019-09-05 at 15:04
