mersenneforum.org  

Old 2016-11-18, 06:01   #2685
storm5510
Random Account
 
Aug 2009
U.S.A.

1689₁₀ Posts

A simple question: Does mfaktc read the entire worktodo file into a queue, or does it take it one line at a time?

Old 2016-11-18, 06:49   #2686
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

2×33×59 Posts

Pretty sure it reads line by line, skipping any invalid lines, until it finds a valid assignment.
Once it has finished the assignment it rewrites the entire worktodo, minus the assignment-line it just completed.

Side note: disk I/O can be killer (e.g. 1-10MB/s sustained) for things like TF>1G where an assignment is completed every second or so and a large input buffer is maintained -- a RAM drive is essential.
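
For anyone curious what that read-then-rewrite cycle looks like, here is a minimal Python sketch of the behaviour described above. It is not mfaktc's actual code (mfaktc is written in C/CUDA). The `Factor=[key,]exponent,bit_min,bit_max` line shape, the function names `first_valid_assignment` and `remove_line`, and the write-to-a-temp-file-then-rename step are all assumptions made for illustration.

```python
import os
import re
import tempfile

# Assumed line shape (illustration only): Factor=[<key>,]<exponent>,<bit_min>,<bit_max>
ASSIGNMENT = re.compile(r"^Factor=(?:[^,]*,)?(\d+),(\d+),(\d+)\s*$")


def first_valid_assignment(path):
    """Scan the file line by line, skipping invalid lines.

    Returns (line_index, (exponent, bit_min, bit_max)) for the first valid
    assignment, or None if the file contains none.
    """
    with open(path) as fh:
        for index, line in enumerate(fh):
            match = ASSIGNMENT.match(line.strip())
            if match:
                return index, tuple(int(group) for group in match.groups())
    return None


def remove_line(path, index):
    """Rewrite the entire file, minus the line at `index`."""
    with open(path) as fh:
        lines = fh.readlines()
    del lines[index]
    # Hypothetical design choice: write a temp file next to the original,
    # then rename it over the original.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as out:
        out.writelines(lines)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        worktodo = os.path.join(folder, "worktodo.txt")
        with open(worktodo, "w") as fh:
            fh.write("not an assignment\nFactor=66362159,75,76\n")
        found = first_valid_assignment(worktodo)
        while found is not None:
            index, (exponent, bit_min, bit_max) = found
            print(f"would factor M{exponent} from 2^{bit_min} to 2^{bit_max}")
            remove_line(worktodo, index)
            found = first_valid_assignment(worktodo)
```

Because every completed assignment triggers a rewrite of the whole file, the bytes written per completion grow with the length of the queue, which is where the disk I/O load in the side note comes from.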
Old 2016-11-18, 16:23   #2687
storm5510
Random Account
 
Aug 2009
U.S.A.

3·563 Posts

Quote:
Originally Posted by James Heinrich View Post
Pretty sure it reads line by line, skipping any invalid lines, until it finds a valid assignment.
Once it has finished the assignment it rewrites the entire worktodo, minus the assignment-line it just completed.

Side note: disk I/O can be killer (e.g. 1-10MB/s sustained) for things like TF>1G where an assignment is completed every second or so and a large input buffer is maintained -- a RAM drive is essential.

It reads it into a buffer, like I suspected. I've noticed that when it has a larger bit range, for example 2^72 to 2^74, it will write the intermediate stage when it completes. This explains the add file feature. I have been stopping it to add assignments to the worktodo file.

TF>1G: I suspected there were people out there doing this but I had no idea of the magnitude of it. A very interesting page; I bookmarked it.
Old 2016-11-26, 19:25   #2688
cseizert
 
"Curtis"
Sep 2016
Fort Collins, CO

2·5 Posts

I think there would be a speedup for Pascal cards if the Linux version were compiled with CUDA 8.0. Actually, I cannot run the current binaries unless I change the makefile and compile them for compute capability 6.1. But even if you can get this to run on a Pascal card in its current form, my experience suggests that there is a performance penalty for running binaries compiled for compute capability <6.0 on the 1080 or 1070.
Old 2017-01-14, 23:29   #2689
Xyzzy
 
"Mike"
Aug 2002

17202₈ Posts

We've had a (FE) GTX 1060 card for several months but never got around to running mfaktc.



We tried it today and it just worked, out of the box, without anything extra needed! In the past we had to install the CUDA toolkit but we didn't today.

The card is doing roughly 530 GHz-d/day and the display has no lag whatsoever. The card is at 80 °C and it is nearly silent. We didn't modify the fan curve or anything.

Old 2017-01-15, 04:54   #2690
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

13·773 Posts

Do you have fan headroom to bring that down a bit from 80? I get nervous in the upper 70s.
Old 2017-03-10, 03:02   #2691
planetclown
 
Feb 2012

5 Posts

Are there updated linux64 binaries available for cuda 8? I don't see them in the download section or in this thread.

If not, how difficult would it be to compile them? I recently upgraded from a 970 to 1070 and am getting the 'cudaGetLastError() returned 8: invalid device function' error.

Thank you!
Old 2017-03-10, 16:29   #2692
planetclown
 
Feb 2012

58 Posts

I took a stab at compiling the linux64 binaries myself using the cuda8 toolkit and it's running successfully. The GHz-d/day is hovering around 780 in the terminal for my GTX 1070, and nvidia-smi shows GPU utilization in the high 90's.

When compiling I added an nvcc flag for compute 6.1 capabilities. I also had to remove the existing line for compute 1.1 (Tesla?) since it wouldn't compile with that flag. Otherwise I left all settings the same as in the source file for mfaktc with cuda 6.5.
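
The makefile change described above amounts to editing the list of nvcc code-generation targets. As a rough illustration only, the Python sketch below builds `--generate-code arch=compute_XY,code=sm_XY` flags from a list of compute capabilities and skips any below a minimum. The function name `gencode_flags`, the example capability list, and the default minimum of 2.0 are assumptions and are not taken from the mfaktc makefile; check your own toolkit's documentation for what it accepts.

```python
def gencode_flags(capabilities, minimum=(2, 0)):
    """Build nvcc --generate-code flags for (major, minor) compute capabilities.

    Capabilities below `minimum` are skipped. The default minimum is an
    assumption for illustration, standing in for "oldest target the
    installed toolkit still compiles for".
    """
    flags = []
    for major, minor in sorted(set(capabilities)):
        if (major, minor) < minimum:
            continue  # e.g. compute 1.1, which had to be removed in the post above
        arch = f"{major}{minor}"
        flags.append(f"--generate-code arch=compute_{arch},code=sm_{arch}")
    return flags


if __name__ == "__main__":
    # Hypothetical target list: an old 1.1 entry plus the added 6.1 entry.
    for flag in gencode_flags([(1, 1), (3, 5), (5, 2), (6, 1)]):
        print("NVCCFLAGS +=", flag)
```

A binary that contains no code for the card's compute capability is the usual cause of the 'invalid device function' error mentioned earlier, which is why adding the 6.1 target was enough to get a 1070 running.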

I copied the compiled mfaktc.exe and the libraries for cuda 8.0.61 on top of the existing folder structure for mfaktc with cuda 6.5. Attached is the result if anyone else is looking for or wants to test it.

Be aware I'm not an expert, so use at your own risk.
Attached Files
File Type: gz mfaktc-0.21.linux64.cuda80.tar.gz (1.38 MB, 62 views)
Old 2017-03-10, 18:30   #2693
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1123₁₀ Posts

Thank you
Old 2017-03-12, 15:27   #2694
bayanne
 
"Tony Gott"
Aug 2002
Yell, Shetland, UK

313 Posts

Quote:
Originally Posted by planetclown View Post
I took a stab at compiling the linux64 binaries myself using the cuda8 toolkit and it's running successfully. The GHz-d/day is hovering around 780 in the terminal for my GTX 1070, and nvidia-smi shows GPU utilization in the high 90's.

When compiling I added an nvcc flag for compute 6.1 capabilities. I also had to remove the existing line for compute 1.1 (Tesla?) since it wouldn't compile with that flag. Otherwise I left all settings the same as in the source file for mfaktc with cuda 6.5.

I copied the compiled mfaktc.exe and the libraries for cuda 8.0.61 on top of the existing folder structure for mfaktc with cuda 6.5. Attached is the result if anyone else is looking for or wants to test it.

Be aware I'm not an expert, so use at your own risk.

Hmm, I wonder whether that would run on a Mac, on which I have another GPU project running on CUDA 8.0.53 ...
Old 2017-03-23, 22:49   #2695
TheJudger
 
"Oliver"
Mar 2005
Germany

2×3×5×37 Posts
stock 1080 Ti "Founders Edition"

Code:
# ./mfaktc.exe -tf 66362159 75 76
mfaktc v0.21 (64bit built)
[...]
CUDA device info
  name                      Graphics Device
  compute capability        6.1
  max threads per block     1024
  max shared memory per MP  98304 byte
  number of multiprocessors 28
  clock rate (CUDA cores)   1582MHz
  memory clock rate:        5505MHz
  memory bus width:         352 bit
[...]
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Mar 23 23:43 |    0   0.1% |  7.003   1h51m |   1481.90    82485    n.a.%
Mar 23 23:44 |    4   0.2% |  6.980   1h51m |   1486.78    82485    n.a.%
Mar 23 23:44 |    9   0.3% |  7.003   1h51m |   1481.90    82485    n.a.%
Mar 23 23:44 |   12   0.4% |  7.110   1h53m |   1459.59    82485    n.a.%
Mar 23 23:44 |   16   0.5% |  7.494   1h59m |   1384.80    82485    n.a.%
Mar 23 23:44 |   24   0.6% |  7.928   2h06m |   1309.00    82485    n.a.%
Mar 23 23:44 |   25   0.7% |  7.955   2h06m |   1304.55    82485    n.a.%
First 20-25 seconds: limited by power target (250W)
After 20-25 seconds: limited by thermal target, hovering at ~190W. Reason: the chassis needs more fresh air.
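
To put a number on the slowdown in the output above, here is a small Python sketch that parses the status lines and compares the first and last rows. The parsing assumes the `|`-separated column layout exactly as pasted above; the names `LOG` and `parse_status_line` are made up for this example.

```python
LOG = """\
Mar 23 23:43 |    0   0.1% |  7.003   1h51m |   1481.90    82485    n.a.%
Mar 23 23:44 |    4   0.2% |  6.980   1h51m |   1486.78    82485    n.a.%
Mar 23 23:44 |    9   0.3% |  7.003   1h51m |   1481.90    82485    n.a.%
Mar 23 23:44 |   12   0.4% |  7.110   1h53m |   1459.59    82485    n.a.%
Mar 23 23:44 |   16   0.5% |  7.494   1h59m |   1384.80    82485    n.a.%
Mar 23 23:44 |   24   0.6% |  7.928   2h06m |   1309.00    82485    n.a.%
Mar 23 23:44 |   25   0.7% |  7.955   2h06m |   1304.55    82485    n.a.%
"""


def parse_status_line(line):
    """Return (seconds_per_class, ghz_days_per_day) from one status line."""
    fields = line.split("|")
    seconds = float(fields[2].split()[0])  # "time" column
    rate = float(fields[3].split()[0])     # "GHz-d/day" column
    return seconds, rate


rows = [parse_status_line(line) for line in LOG.splitlines()]

# time * rate should be roughly constant if every class is the same amount of work
work = [seconds * rate for seconds, rate in rows]

# relative throughput loss between the first and last status line
drop = 1 - rows[-1][1] / rows[0][1]

print(f"time x rate ranges from {min(work):.1f} to {max(work):.1f}")
print(f"throughput drop from first to last line: {drop:.1%}")
```

The product of time and GHz-d/day stays essentially constant across the rows, which is consistent with each class being the same amount of work, so the longer class times directly reflect the clock reduction once the thermal target takes over.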

Oliver

Last fiddled with by TheJudger on 2017-03-23 at 22:53

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.