mersenneforum.org  

2019-09-04, 19:32   #12
Corbeau
Jul 2019
2·3 Posts

Quote:
Originally Posted by chalsall View Post
Indeed. It would be interesting to see what the GHzD/D is for a T4. I've only been playing with this for about a day now, but I've yet to get a T4 (unless explicitly asked for).
I would imagine that it would be at least a bit faster than a K80, based on the specs. Unfortunately, there are no benchmarks on Mersenne.ca, but according to TechPowerUp, the FP32 performance of the T4 is equal to that of both K80 processors combined. The FP64 performance of the T4 is much worse, though. I don't know enough about the math of trial factoring to know whether it uses single precision or double precision.

2019-09-04, 20:00   #13
hansl
Apr 2019
5×41 Posts

Quote:
Originally Posted by Corbeau View Post
I would imagine that it would be at least a bit faster than a K80, based on the specs. Unfortunately, there are no benchmarks on Mersenne.ca, but according to TechPowerUp, the FP32 performance of the T4 is equal to that of both K80 processors combined. The FP64 performance of the T4 is much worse, though. I don't know enough about the math of trial factoring to know whether it uses single precision or double precision.
Pretty sure it's all 32-bit, since the Mersenne exponents top out at 2^32.

Also, I'm not super familiar with these Tesla GPUs, but it's interesting that the K80 is 2-in-1. Does this mean it shows up as two devices? Maybe it's actually more efficient to run two simultaneous mfaktc instances if possible, by specifying the device number?
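A minimal sketch of the two-instance idea, assuming both K80 GPUs are actually visible to the machine and that each instance runs in its own working directory with its own worktodo.txt. The `-d <device number>` option is mfaktc's device selector; the function names and directory layout are made up for illustration.

```python
import subprocess


def mfaktc_commands(executable, device_count):
    """Build one command line per GPU, selecting the device with -d."""
    return [[executable, "-d", str(device)] for device in range(device_count)]


def launch_all(commands, work_dirs):
    """Start each instance in its own working directory.

    Each directory is assumed to hold its own mfaktc.ini and worktodo.txt,
    so the instances do not fight over the same assignments.
    """
    return [subprocess.Popen(cmd, cwd=cwd) for cmd, cwd in zip(commands, work_dirs)]


# Example (hypothetical paths):
# procs = launch_all(mfaktc_commands("./mfaktc.exe", 2), ["gpu0", "gpu1"])
print(mfaktc_commands("./mfaktc.exe", 2))
```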

If it's giving ~400 GHz-d/day, then I'm surprised how poorly it performs compared to a GTX 1660 (I get 1500-1600 on mine), despite the K80's core count being much higher and its FP32 throughput being only slightly lower per "processor": 4 TFLOPS on the K80 vs 5 TFLOPS on the 1660. With two processors it should easily beat that.

With the K80 having decent FP64 performance (1:3 of FP32, vs 1:32 on a typical GPU), I wonder if it would be better suited to CUDAPm1 or CUDALucas work rather than TF. Those use FP64 heavily, as I understand it, which is why the Radeon VII wipes the floor with most Nvidia GPUs for those tasks, but not for TF.
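The comparison is easier to see as throughput per FP32 TFLOP. A rough sketch using only the figures quoted in this post; the dictionary layout and function name are made up for illustration, and 1550 is simply the midpoint of the 1500-1600 range, not a measurement.

```python
# Figures as quoted in the thread; nothing here is a measured benchmark.
cards = {
    "K80 (one GPU)": {"ghz_days_per_day": 400.0, "fp32_tflops": 4.0},
    "GTX 1660": {"ghz_days_per_day": 1550.0, "fp32_tflops": 5.0},  # midpoint of 1500-1600
}


def ghz_days_per_tflop(card):
    """Trial-factoring throughput per FP32 TFLOP for one card entry."""
    return card["ghz_days_per_day"] / card["fp32_tflops"]


for name, card in cards.items():
    print(f"{name}: {ghz_days_per_tflop(card):.0f} GHz-d/day per TFLOP")
```

On these numbers the 1660 gets roughly three times as much TF work out of each FP32 TFLOP, which suggests that FP32 TFLOPS alone is a poor predictor of mfaktc throughput.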

2019-09-04, 20:22   #14
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados
23×433 Posts

Quote:
Originally Posted by hansl View Post
If it's giving ~400 GHz-d/day, then I'm surprised how poorly it performs compared to a GTX 1660 (I get 1500-1600 on mine), despite the K80's core count being much higher and its FP32 throughput being only slightly lower per "processor": 4 TFLOPS on the K80 vs 5 TFLOPS on the 1660. With two processors it should easily beat that.
The K80s are serious compute (for example, they include ECC memory), but don't compare to more modern gaming kit. Also, each VM gets only one of the two K80 GPUs. This is the same over on Azure, GCE, and ECS.

Quote:
Originally Posted by hansl View Post
With the K80 having decent FP64 performance (1:3 of FP32, vs 1:32 on a typical GPU), I wonder if it would be better suited to CUDAPm1 or CUDALucas work rather than TF.
Definitely should be analyzed. I think this is early days; different Notebooks can be created and shared to focus on a particular compute space, and each individual user gets to choose which they run in their (singular) VM.

2019-09-05, 04:11   #15
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados
23·433 Posts

Quote:
Originally Posted by hansl View Post
I just posted a build in the main mfaktc thread, but haven't had a chance to try it on Colab. Let me know if this helps.
So, I came down for my just-before-bed check, and the CoLab VM had expired (as expected). Relaunched, and I got a T4.

Uploaded your executable, and got a warning about not being able to find the CUDA 10.1 library. Apparently, CoLab is 10.0.

Uploaded your entire archive, unpacked, and "make clean" and then "make" in the src/ directory. Moved the resulting executable into place.

The T4 started at 2,080 GHzD/D, and then quickly throttled down to ~1,760 GHzD/D.

Wow!

2019-09-05, 05:17   #16
hansl
Apr 2019
11001101₂ Posts

Quote:
Originally Posted by chalsall View Post
Uploaded your executable, and got a warning about not being able to find the CUDA 10.1 library. Apparently, CoLab is 10.0.
Weird. It seems there may be a mix of CUDA versions across their machines, then: the log I posted on the previous page showed "CUDA driver version 10.10", so I figured they were all the same.

Quote:
Originally Posted by chalsall View Post
Uploaded your entire archive, unpacked, and "make clean" and then "make" in the src/ directory. Moved the resulting executable into place.

The T4 started at 2,080 GHzD/D, and then quickly throttled down to ~1,760 GHzD/D.

Wow!
Awesome! It's great that it built without much trouble (I assume you had to edit the Makefile slightly, though, since I used my local path for CUDA 10.1 in there).

And very nice throughput on that T4!

2019-09-05, 06:01   #17
xx005fs
"Eric"
Jan 2018
USA
22×53 Posts

Are there any CUDA 10.0 builds targeting compute capability 7.5 in the wild? I'd like one in case I get one of their T4s. Sadly, I don't have a Linux system on hand to build it myself, and I want to use this feature as well.

2019-09-05, 12:46   #18
De Wandelaar
"Yves"
Jul 2017
Belgium
73 Posts

Hello,
I'm setting up a notebook with the mfaktc build for CUDA 10.1 provided by hansl (see https://www.mersenneforum.org/showthread.php?t=12827, message #3189).
I still have a problem when running it:

Code:
CUDA version info
  binary compiled for CUDA  10.10
  CUDA runtime version      0.75
  CUDA driver version       10.10
ERROR: CUDA runtime version must match the CUDA toolkit version used during compile!.
What should I do? Should I use another version of mfaktc?
Many thanks for your help,
Yves
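The mismatch in that output can be checked mechanically. A minimal sketch that parses the "CUDA version info" block pasted above and compares the compile-time version with the runtime version; the function names are made up for illustration, and the plain string comparison is an assumption about what "must match" means, not mfaktc's actual check.

```python
import re


def parse_cuda_versions(log_text):
    """Extract the three version strings from a 'CUDA version info' block."""
    patterns = {
        "compiled": r"binary compiled for CUDA\s+([\d.]+)",
        "runtime": r"CUDA runtime version\s+([\d.]+)",
        "driver": r"CUDA driver version\s+([\d.]+)",
    }
    versions = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log_text)
        versions[key] = match.group(1) if match else None
    return versions


def runtime_matches_build(versions):
    """True when the runtime version string equals the compile-time one."""
    return versions["compiled"] is not None and versions["compiled"] == versions["runtime"]


log = """CUDA version info
  binary compiled for CUDA  10.10
  CUDA runtime version      0.75
  CUDA driver version       10.10
"""
versions = parse_cuda_versions(log)
print(versions, runtime_matches_build(versions))
```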

2019-09-05, 13:30   #19
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados
23×433 Posts

Quote:
Originally Posted by De Wandelaar View Post
What should I do? Should I use another version of mfaktc?
Yup. Please try the attached.

It's just the executable produced from hansl's archive with a slightly tweaked Makefile, and built on CoLab. This also means the "LD_LIBRARY_PATH" stuff isn't needed, and it works for both K80s and T4.

Unpack it, upload it, rename it and you should be good to go.
Attached Files
File Type: tgz mfaktc_colab.tgz (622.1 KB, 258 views)

2019-09-05, 13:38   #20
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados
23×433 Posts

Quote:
Originally Posted by hansl View Post
I assume you had to edit the Makefile slightly, though, since I used my local path for CUDA 10.1 in there.
Yeah. I just changed the path to be "/usr/local/cuda", and changed "CC = gcc-8" to be "gcc". Beyond that it built with no problems; it appears a full CUDA Dev environment is present on the VM!
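Those two Makefile tweaks can be scripted from a notebook cell. A minimal sketch, assuming the CUDA path is set through a variable named `CUDA_DIR` (check the actual Makefile; the name is an assumption) and using a made-up sample path in place of the original local one.

```python
import re


def patch_makefile(text, cuda_dir="/usr/local/cuda", compiler="gcc"):
    """Point the Makefile at the VM's CUDA install and its default gcc."""
    # CUDA_DIR is an assumed variable name, not confirmed by the thread.
    text = re.sub(r"^(CUDA_DIR[ \t]*=[ \t]*).*$",
                  lambda m: m.group(1) + cuda_dir, text, flags=re.M)
    # Only a line starting with "CC" is touched, so e.g. NVCC is left alone.
    text = re.sub(r"^(CC[ \t]*=[ \t]*).*$",
                  lambda m: m.group(1) + compiler, text, flags=re.M)
    return text


# Hypothetical Makefile fragment; the original local CUDA path is not known.
sample = "CUDA_DIR = /opt/cuda-10.1\nCC = gcc-8\nCFLAGS = -Wall\n"
print(patch_makefile(sample))
```

After writing the patched text back, "make clean" and "make" in the src/ directory would follow as described earlier in the thread.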

Quote:
Originally Posted by hansl View Post
And very nice throughput on that T4!
Indeed! Now I'm going to whine when I only get a K80!

2019-09-05, 14:41   #21
De Wandelaar
"Yves"
Jul 2017
Belgium
73₁₀ Posts

@chalsall:
Thanks for your answer!
The same error occurs after copying the new version of the program.

Here is my script:
Code:
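import os.path
from google.colab import drive

# Mount Google Drive only if it is not already mounted.
if not os.path.exists('/content/drive/My Drive'):
  drive.mount('/content/drive')

# Work from the directory that holds mfaktc.ini and worktodo.txt.
%cd '/content/drive/My Drive/mfaktc-0.21/'

# Make the binary executable, then run it from the current directory.
!chmod 755 '/content/drive/My Drive/mfaktc-0.21/mfaktc.exe'
!/content/drive/My\ Drive/mfaktc-0.21/mfaktc.exe

# Show any results written so far.
!cat 'results.txt'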
I'll keep on trying ...

2019-09-05, 14:53   #22
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados
10011011100111₂ Posts

Quote:
Originally Posted by De Wandelaar View Post
I'll keep on trying ...
Are you sure your Runtime is set for GPU hardware acceleration?
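One way to check this from a notebook cell is to ask nvidia-smi which GPUs it can see. A minimal sketch, assuming nvidia-smi is on the PATH when a GPU runtime is attached; the function name is made up for illustration.

```python
import shutil
import subprocess


def visible_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(visible_gpus() or "No GPU visible; check the runtime type")
```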

Also, I would suggest you try moving the executable over to /usr/local/bin/ and then do the chmod stuff. cd into a directory which has the mfaktc.ini and worktodo.txt files and launch.
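A minimal sketch of those steps in Python, assuming the binary and the mfaktc.ini / worktodo.txt files currently sit in the Drive directory used in the script above. The function names and the installed name "mfaktc" are made up for illustration.

```python
import os
import shutil
import subprocess


def install_executable(source, bin_dir="/usr/local/bin", name="mfaktc"):
    """Copy the binary onto the VM's local disk and mark it executable."""
    target = os.path.join(bin_dir, name)
    shutil.copy(source, target)
    os.chmod(target, 0o755)
    return target


def launch(target, work_dir):
    """Run the binary from the directory holding mfaktc.ini and worktodo.txt."""
    return subprocess.run([target], cwd=work_dir)


# Example, using the paths from the script earlier in the thread:
# target = install_executable('/content/drive/My Drive/mfaktc-0.21/mfaktc.exe')
# launch(target, '/content/drive/My Drive/mfaktc-0.21')
```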

My notebook above should work fine for this.

P.S. Also, under the Tools menu -> Preferences, Miscellaneous, set the Power level to be Kitty Mode. Googlers have a great sense of humor!

Edit: Oh, also, are you sure you're trying to run the executable I provided? It appears you might be still trying to use hansl's. This is the output I get using my executable:
Code:
CUDA version info
  binary compiled for CUDA  10.0
  CUDA runtime version      10.0
  CUDA driver version       10.10

Last fiddled with by chalsall on 2019-09-05 at 15:04
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.