mersenneforum.org  

2021-12-22, 18:05   #3521
KEP
Quasi Admin Thing
May 2005
2·491 Posts
KEP needs help for running mfaktc on Linux Mint 20.1

Dear everyone

I'm still very new to Linux, and that has now become a problem: I have not one but two offline computers on which I am trying to run mfaktc.

I need a guide from A to Z on how to make it run.

Current OS is Linux Mint 20.1 (Cinnamon 64-bit)

I downloaded mfaktc-0.21.linux64.cuda11.2 and copied mfaktc from the compressed folder to a mersenne folder, where I had a mfaktc.ini and a worktodo.txt file

First trying to run mfaktc from a terminal, I typed this:

./mfaktc (enter)

This gave me a no permission message

I got around this, after some googling on my phone, by right-clicking and setting the permissions to read and write in every available selection for the mfaktc, mfaktc.ini and worktodo.txt files.
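
A "no permission" message for ./mfaktc usually points at a missing execute bit rather than missing read/write access. A minimal Python sketch, assuming the binary sits in the current folder under the name mfaktc (the helper name make_executable is made up for illustration), adds the execute bit much like `chmod a+x` would in a terminal:

```python
import os
import stat


def make_executable(path):
    """Add the execute bit for user, group and others (like `chmod a+x`)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    # Report whether the current user may now execute the file.
    return os.access(path, os.X_OK)


if __name__ == "__main__":
    # "mfaktc" is the file name used in this post; adjust the path as needed.
    if os.path.exists("mfaktc"):
        print("executable:", make_executable("mfaktc"))
    else:
        print("no file named mfaktc in this folder")
```

The .ini and worktodo.txt files only need to be readable and writable; only the program itself needs the execute bit.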

Now back in the terminal I again typed this:

./mfaktc (enter)

This gave me a "no such directory" error saying that no cudart 11.0 was found (or something like that).

After more googling, it appears that CUDA 11.0 is not installed, and that may in fact be correct, because when I ran this in the terminal:

nvidia-smi

the terminal window told me that I have CUDA 11.2 installed and driver 460.32 (if I recall correctly).

This raises the question: does mfaktc 0.21 not work with CUDA 11.2?
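
The error suggests that the binary asks the dynamic loader for one specific runtime library file. A small Python sketch, assuming the wanted file is called libcudart.so.11.0 (reconstructed from the half-remembered error message, so treat the name as a guess), checks whether the loader can find it. The helper name can_load is hypothetical:

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed name; the exact file the binary wants is shown in its error message.
    name = "libcudart.so.11.0"
    print(name, "->", "found" if can_load(name) else "not found")
```

If it reports "not found", one common way to make the loader see a copy of the file is to put its folder into the LD_LIBRARY_PATH environment variable before starting the program.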

When trying to install the drivers, which is typically what is missing on Windows, I tried running this file:

cuda_11.2.0_460.27.04_linux.run (Downloaded from Nvidia website) - using this command:

sudo sh cuda_11.2.0_460.27.04_linux.run (enter)

It asked for and received my password.

Then came a warning that the CUDA 11.2 files are already installed and that removing the package before installing is recommended. At that point I aborted to avoid breaking anything.

Now KEP asks for a thorough guide on how to run mfaktc on my two Linux Mint 20.1 64-bit machines. Can I download a libcudart 11.0 file somewhere, or find a version of mfaktc that actually reads and finds the CUDA drivers on my Linux machines?

Any help is greatly appreciated, since I had hoped to get the two GT 1030s up and running with TF for Mersenne numbers before the severe night cold (-15 degrees Celsius) makes its entrance on the night between Christmas Eve and the day after.

PMs are welcome, but maybe someone can come up with a guide for newbies that they can find in the future. If such a guide exists, maybe someone can refer me to it.

Last question: is there no way, like on Windows, to download the libcudart 11.0 driver to a memory stick, copy that driver file to the folder that contains mfaktc, and then get mfaktc running? (Just like on Windows.)

Best regards

KEP

2022-03-19, 08:11   #3522
rebirther
Sep 2011
Germany
3,413 Posts

Is someone able to compile the mfaktc app on Ubuntu 18? We need it for CUDA11.1/11.2, most of the users have no glibc2.29

2022-03-19, 16:21   #3523
storm5510
Random Account
Aug 2009
Not U. + S.A.
9D8₁₆ Posts

Quote:
Originally Posted by rebirther View Post
Is someone able to compile the mfaktc app on Ubuntu 18? We need it for CUDA11.1/11.2, most of the users have no glibc2.29
It's not that simple. I have a binary for Ubuntu 20.04.4. It is GTX-1080 specific. It has to be compiled to match the GPU it will be used on. Unless I am mistaken, it cannot be transplanted from one place to another.

Some members here guided me through the process, and I am grateful they took the time. I took a lot of notes, but I don't know if what I have would work with Ubuntu 18. This is for someone else to say.

2022-03-19, 16:54   #3524
paulunderwood
Sep 2002
Database er0rr
2·3³·83 Posts

Quote:
Originally Posted by rebirther View Post
Is someone able to compile the mfaktc app on Ubuntu 18? We need it for CUDA11.1/11.2, most of the users have no glibc2.29
Having installed an nVidia graphics driver, you can compile the source.

sudo apt-get install build-essential nvidia-cuda-toolkit

Download the source (second in the list) from https://www.mersenneforum.org/mfaktc/mfaktc-0.21/ and untar the source with:

tar -zxvf mfaktc-0.21.tar.gz

Edit the Makefile in the src directory to suit your GPU:

https://en.wikipedia.org/wiki/CUDA#GPUs_supported

Comment out the lines:

Code:
# generate code for various compute capabilities
NVCCFLAGS += --generate-code arch=compute_11,code=sm_11 # CC 1.1, 1.2 and 1.3 GPUs will use this code (1.0 is not possible for mfaktc)
NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code, one code fits all!
NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
And add your own line for your particular GPU. For example, for a GTX 1080 it is:

Code:
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61
Run make in the src directory. The executable will be placed one directory up.
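
The replacement line follows a simple pattern: the compute capability with the dot removed appears in both the compute_ and the sm_ part. A short Python sketch (the function name nvcc_generate_code_flag is made up for illustration) builds the line from a compute capability looked up in the Wikipedia table linked above:

```python
def nvcc_generate_code_flag(compute_capability):
    """Build the Makefile line for one compute capability such as "6.1"."""
    major, minor = compute_capability.split(".")
    cc = f"{int(major)}{int(minor)}"
    return f"NVCCFLAGS += --generate-code arch=compute_{cc},code=sm_{cc}"


if __name__ == "__main__":
    # 6.1 is the GTX 1080 value used in this post.
    print(nvcc_generate_code_flag("6.1"))
```

Whether the installed CUDA toolkit still accepts a given compute capability depends on the toolkit version, so this only formats the line and does not validate it.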

Last fiddled with by paulunderwood on 2022-03-19 at 17:01

2022-03-19, 17:38   #3525
storm5510
Random Account
Aug 2009
Not U. + S.A.
4730₈ Posts

Quote:
Originally Posted by paulunderwood View Post
...Run make in the src directory. The executable will be placed in one directory up
Once complete, a folder titled mfaktc-0.21 will exist in the Downloads folder. The entire folder can then be moved to the Home folder. It makes more sense, to me at least, to move it there. Only one folder change from Home to get there and run it once a terminal is opened.

@paulunderwood: Any idea why the executable has a .exe file extension? It seems sort of strange in this context.

2022-03-19, 18:16   #3526
paulunderwood
Sep 2002
Database er0rr
2·3³·83 Posts

Quote:
Originally Posted by storm5510 View Post

@paulunderwood: Any idea why the executable has a .exe file extension? It seems sort of strange in this context.
Maybe just having one file as compilation makes it easier. ".exe" is standard for Windows.

2022-03-26, 18:33   #3527
Ethan (EO)
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996
2×7² Posts

Quote:
Originally Posted by TheJudger View Post
... I have had many issues when creating binaries for Windows. From one day to another, without changing the code, I wasn't able to build a binary which works (I guess a Windows update caused the change)... others had similar issues while others can compile... I have no clue what the root cause is. Maybe it is the code, maybe it is the way the Makefile calls the compiler... I'm annoyed as hell about this (while I know it is possible that I caused the issue by myself...)!

Oliver
Oliver, I recently built mfaktc on Windows for the first time in quite a while, and I was running into maddening issues that sound similar to the above (Invalid Device Function errors on kernel launch despite careful driver/runtime/binary matching, etc.).

For me, removing the whole-program optimization flags from the compiler and linker seems to have eliminated the problem! So no /GL or /LTCG. This makes some sense as a root cause in terms of the CUDA build pipeline, and it makes no discernible difference in factoring speed for me on a 1080 Ti or a 3090.

I don't have an updated makefile to link to right now, but anyone building mfaktc on Windows should probably remove /LTCG from LFLAGS and /GL from CFLAGS (and -Xcompiler).
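
As a rough illustration of that edit, the Python sketch below drops the compiler's /GL option and the linker's link-time code generation option (spelled /LTCG in Microsoft's toolchain) from a flag string. The flag strings in the example are invented and are not copied from the real mfaktc Windows makefile; the function name is hypothetical too:

```python
WHOLE_PROGRAM_FLAGS = {"/gl", "/ltcg"}


def strip_whole_program_flags(flags):
    """Remove /GL and /LTCG from a space-separated flag string."""
    cleaned = []
    for token in flags.split():
        # nvcc's -Xcompiler takes a comma-separated list, so filter inside tokens too.
        kept = [opt for opt in token.split(",") if opt.lower() not in WHOLE_PROGRAM_FLAGS]
        if kept:
            cleaned.append(",".join(kept))
    return " ".join(cleaned)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(strip_whole_program_flags("/Ox /Oy /GL /W2"))
    print(strip_whole_program_flags("-Xcompiler /EHsc,/W3,/GL"))
    print(strip_whole_program_flags("/nologo /LTCG"))
```

The sketch does not handle a -Xcompiler whose only option is removed; in that case the -Xcompiler itself would have to go as well.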

Ethan

2022-03-28, 16:13   #3528
nullcure
Feb 2022
7 Posts
Hello Everyone

Hello guys, I've been following all of your work since last November ('21).

I am fascinated by this GIMPS project.

I've been able to dedicate:

* RTX2080 ULTRA XC2
* Ryzen 9 5950X Zen 3 Vermeer
* Ryzen 7 1700 Original Zen
* An occasional T4 from Colab.
* "Intel Core2 Duo E8400 @ 3.00GHz Linux64,v30.8,build 11" (ECM small only and Cert work)


Here is my current contribution to the cause.

226 nullcure 44763.429 | *** 279 76 7 | 72.5 1.3 17.8 6.5 0.6 1.0 |

--------------------------------------------------------------

I've spent a great deal of time researching all of your forum posts and your work on the different programs. I've got the RTX 2080 automated with MISFIT and mfaktc:

PS D:\GIMPS\mfaktfc> .\mfaktc-win-64.exe


mfaktc v0.21 (64bit built)

Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 100000
SievePrimesAdjust 1
SievePrimesMin 2000
SievePrimesMax 200000
NumStreams 10
CPUStreams 5
GridSize 3
GPU Sieving enabled
GPUSievePrimes 82486
GPUSieveSize 128Mi bits
GPUSieveProcessSize 8Ki bits
Checkpoints enabled
CheckpointDelay 30s
WorkFileAddDelay disabled
Stages enabled
StopAfterFactor bitlevel
PrintMode full
V5UserID (none)
ComputerID (none)
AllowSleep no
TimeStampInResults yes

CUDA version info
binary compiled for CUDA 10.0
CUDA runtime version 10.0
CUDA driver version 11.60

CUDA device info
name NVIDIA GeForce RTX 2080
compute capability 7.5
max threads per block 1024
max shared memory per MP 65536 byte
number of multiprocessors 46
clock rate (CUDA cores) 1815MHz
memory clock rate: 7000MHz
memory bus width: 256 bit

Automatic parameters
threads per grid 753664
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144

running a simple selftest...
Selftest statistics
number of tests 107
successfull tests 107

selftest PASSED!

got assignment: exp=14200031 bit_min=72 bit_max=73 (67.36 GHz-days)
Starting trial factoring M14200031 from 2^72 to 2^73 (67.36 GHz-days)
k_min = 166280146951380
k_max = 332560293908488
Using GPU kernel "barrett76_mul32_gs"

found a valid checkpoint file!
last finished class was: 1681
found 1 factor(s) already

[date time] exponent [TF bits]: percent class #, seq | GHZ | time | ETA | #FCs | rate | SieveP. |

[Mar 28 11:51] M14200031 [72-73]: 36.8% 1693/4620,353/960 | 3379.25 | 1.794s | 18m09s | 35.99G | 20062.1M/s | 82485 |

PS D:\GIMPS\mfaktfc>
----------------------------------------
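
The numbers in the progress line above fit together arithmetically. The Python sketch below assumes that "353/960" means 353 of 960 classes are finished and that every class takes as long as the most recent one (1.794 s); both are readings of the output, not statements from the mfaktc documentation. The function names are made up:

```python
def eta_seconds(classes_done, classes_total, seconds_per_class):
    """Remaining time if every remaining class takes seconds_per_class."""
    return (classes_total - classes_done) * seconds_per_class


def ghz_days_per_day(assignment_ghz_days, classes_total, seconds_per_class):
    """Throughput implied by the assignment's credit and the per-class time."""
    total_seconds = classes_total * seconds_per_class
    return assignment_ghz_days * 86400 / total_seconds


if __name__ == "__main__":
    eta = eta_seconds(353, 960, 1.794)
    minutes, seconds = divmod(round(eta), 60)
    print(f"ETA {minutes}m{seconds:02d}s")  # the log shows 18m09s
    rate = ghz_days_per_day(67.36, 960, 1.794)
    print(f"{rate:.1f} GHz-days/day")  # the log shows 3379.25
```

At that rate the whole 72-to-73-bit assignment takes 960 × 1.794 s, a little under half an hour.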

Those are my mfaktc stats. It's running backlogged assignments from GPU72 (primarily because my Ryzen 9 does PRP 2 days faster than the RTX 2080).

Now I've compiled GpuOwl from GitHub.

I feel like I'm missing something in terms of GpuOwl and PRP testing on the RTX 2080. Is there really no way to make it PRP test faster than the 5950X?

---------------------------------------------

And please feel free to ask me anything, I'll help out and contribute where I can.

I came onto the computer scene back in 1995, with Windows 95. Those phreaking things. ;-)

I have 3 systems up and running. The first runs Windows Server with 32 GB of DDR4-3200 and is used for small lab work. I do have RDP and free user accounts over VPN; it's literally used for "lab work", a test system for people to throw rocks at.

I have the older Intel Core2 running headless Linux…

Accounts and SSH are available, and it is part of the "Test Lab". Please don't break the DDR2 memory; ordering from China takes forever. :-)

And last, my main system: the 5950X on Win11 (Linux WSL2 with X GUI support forced my hand to upgrade from Win10).

There will be no public access to this system, lol, this is my baby.

---------------------------------

So again.

Whatever I can do to help contribute, I have 2 systems for private use of friends in a mini lab config for throwing rocks at.
(PM me for remote ssh or RDP accounts)

and secondly.

Is my Ryzen 5950X supposed to PRP faster than the RTX2080 on GpuOwl that I've git compiled for win x64?

(Attached latest GpuOwl compiled for windows x64 non-dirty)

I am also having issues compiling mfaktc source.

Am I missing anything? I'm only 4 months into this project.




Thanks all.
Attached Thumbnails
gimps_worx.png (329.9 KB)
Attached Files
File Type: 7z gpuowl-v7.2-91-g9c22195.7z (616.3 KB, 42 views)

2022-03-28, 17:19   #3529
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3,677 Posts

Welcome to the forum and to GIMPS.
NVIDIA RTX 20xx, GTX 16xx, and even more so RTX 30xx GPUs are far faster (more productive) at TF than at PRP, P-1 or LL, because they have exceptionally low DP/SP performance ratios (1/32 or lower). Teslas and AMD GPUs tend to have DP and SP performance much closer together.
See for example the theoretical performance figures at
https://www.techpowerup.com/gpu-spec...rtx-2080.c3224
https://www.techpowerup.com/gpu-specs/radeon-vii.c3358
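
To see why the ratio matters: PRP, P-1 and LL lean on double-precision (DP) floating-point FFTs, while TF uses integer arithmetic, so a card's DP rate is what limits its PRP speed. The Python sketch below uses a placeholder single-precision (SP) figure of 10 TFLOPS and, for contrast, an assumed ratio of 1/4; only the 1/32 ratio comes from the post. Real figures are on the pages linked above.

```python
from fractions import Fraction


def dp_throughput(sp_tflops, dp_sp_ratio):
    """Theoretical DP rate implied by an SP rate and a DP/SP ratio."""
    return sp_tflops * float(dp_sp_ratio)


if __name__ == "__main__":
    sp = 10.0  # placeholder SP figure in TFLOPS, not a measured or quoted value
    cases = (
        ("DP/SP = 1/32 (from the post)", Fraction(1, 32)),
        ("DP/SP = 1/4 (assumed, for contrast)", Fraction(1, 4)),
    )
    for label, ratio in cases:
        print(f"{label}: {dp_throughput(sp, ratio):.4f} DP TFLOPS")
```

With equal SP rates, the card with the 1/4 ratio would have eight times the theoretical DP throughput of the card with the 1/32 ratio.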

For a large and growing compilation of reference info, see this thread and note the beginning reading recommendations there.

Last fiddled with by kriesel on 2022-03-28 at 17:21

2022-03-29, 18:52   #3530
nullcure
Feb 2022
7 Posts

Thank you, Kriesel. Funny that you responded: I had posted an issue on GitHub that I ran into with your mingw64 MSYS2 64-bit Windows compilation guide for gpuowl.

There was nothing wrong with your guide; it was an issue I encountered during the compile process.

While gpuowl was compiling, a file called gpuowl-expanded (I forget the extension) would end up with a file size of 0.

To get around this, I kept the file open in Notepad during the compile so that the contents of gpuowl-expanded remained.

Problem solved, though I'm not sure what was causing it.


So you're saying if I was to have an AMD Gpu a premium one that it would outperform the 5950x in PRP crunch time?

2022-03-29, 20:29   #3531
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3,677 Posts

Quote:
Originally Posted by nullcure View Post
So you're saying if I was to have an AMD Gpu a premium one that it would outperform the 5950x in PRP crunch time?
I have no personal experience with an AMD 5950X. But I trust the crack programmers. George, Mihai, and Ernst all own Radeon VII GPUs and run wavefront PRP on them with Gpuowl v6.11-3xx or v7.whatever, which include proof file generation and GEC.
Try running a DC using PRP, or LLDC, on your 5950x. Can you beat 9 hours for ~65M exponent? 8.5 hours?

On gpuowl v6.11-380, AMD Radeon VII, Windows 10, GPU memory clock at 1200MHz, GPU power reduced by 20%, stock voltage curve (NTP time sync system):
Code:
2022-03-27 16:16:44 test/radeonvii 65005679 FFT: 3.50M 1K:7:256 (17.71 bpw)
2022-03-27 16:16:44 test/radeonvii Expected maximum carry32: 2CD70000
2022-03-27 16:16:45 test/radeonvii OpenCL args "-DEXP=65005679u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=7u -DPM1=0 -DAMDGPU=1 -DWEIGHT_STEP_MINUS_1=0xe.1b17042cf73dp-6 -DIWEIGHT_STEP_MINUS_1=-0xb.8eee6898b4078p-6 -DNO_ASM=1  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2022-03-27 16:16:54 test/radeonvii OpenCL compilation in 9.08 s
2022-03-27 16:16:54 test/radeonvii 65005679 LL        0 loaded: 0000000000000004
...
2022-03-28 01:14:38 test/radeonvii 65005679 LL 65005677 100.00%;  517 us/it; ETA 0d 00:00; bd325b7241d0a681
2022-03-28 01:14:38 test/radeonvii waiting for the Jacobi check to finish..
2022-03-28 01:15:20 test/radeonvii 65005679 OK 65000000 (jacobi == -1)
2022-03-28 01:15:20 test/radeonvii {"status":"C", "exponent":"65005679", "worktype":"LL", "res64":"bd325b7241d0a681", "fft-length":"3670016", "shift-count":"0", "program":{"name":"gpuowl", "version":"v6.11-380-g79ea0cc"}, "user":"kriesel", "computer":"test/radeonvii", "aid":"(redacted)", "timestamp":"2022-03-28 06:15:20 UTC"}
2022-03-27 16:16:44 to 2022-03-28 01:15:20 is 9 hours minus 1:24, or ~8.977 hours.
If I boost it back up to nominal power it will go somewhat faster, maybe ~5%. A quick test on a few hundred K iterations of a higher exponent indicates ~5.4%, so ~8.518 hours estimated for 65M exponent at nominal power.
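
The elapsed time and the nominal-power estimate can be rechecked from the two log timestamps. The Python sketch below assumes that "~5.4% faster" means the run time is divided by 1.054; the function names are made up for illustration:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_hours(start, end):
    """Hours between two timestamps written in the log's format."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 3600


def hours_at_higher_speed(hours, speedup_fraction):
    """Run time if throughput rises by speedup_fraction (0.054 for 5.4%)."""
    return hours / (1 + speedup_fraction)


if __name__ == "__main__":
    measured = elapsed_hours("2022-03-27 16:16:44", "2022-03-28 01:15:20")
    print(f"measured: {measured:.3f} h")
    print(f"estimated at nominal power: {hours_at_higher_speed(measured, 0.054):.3f} h")
```

Both results land within a few thousandths of an hour of the ~8.977 and ~8.518 hours quoted above.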

https://www.mersenne.org/report_expo...exp_hi=&full=1

Last fiddled with by kriesel on 2022-03-29 at 20:39
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
