mersenneforum.org  

Old 2022-09-28, 03:39   #1
Rubiksmath
 
Sep 2022

47 Posts
GMP-ECM GPU for WSL problems

So yeah, I've spent a couple of days (~20 h total) trying and failing to install a GPU-enabled version of GMP-ECM under WSL, and most of the problems seem to circle back to the CUDA installation for WSL being weird.

On the host machine, nvcc --version gives 11.2 but nvidia-smi gives 11.7. This may make sense, as I installed 11.7 earlier but had to uninstall it and install 11.2. If this is a problem, please let me know, as it would save a lot of pain if the issue isn't with WSL at all.
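
For reference, nvidia-smi reports the newest CUDA version the installed driver supports, while nvcc --version reports the toolkit that is actually installed, so an 11.2 / 11.7 split does not by itself indicate a broken setup. A minimal sketch for printing both side by side from a bash prompt (the helper names toolkit_version and driver_cuda_version are made up for illustration):

```shell
#!/usr/bin/env bash
# Print the CUDA version reported by each tool, or a note if the tool is absent.

# Extract X.Y from the "release X.Y" text of `nvcc --version` (read on stdin).
toolkit_version() {
    sed -n 's/.*release \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Extract X.Y from the "CUDA Version: X.Y" banner of nvidia-smi (read on stdin).
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

if command -v nvcc >/dev/null 2>&1; then
    echo "toolkit (nvcc): $(nvcc --version | toolkit_version)"
else
    echo "toolkit (nvcc): not found on PATH"
fi

if command -v nvidia-smi >/dev/null 2>&1; then
    echo "driver supports up to: $(nvidia-smi | driver_cuda_version)"
else
    echo "driver (nvidia-smi): not found on PATH"
fi
```

What has to agree for a build is the toolkit's own headers and runtime library (cuda.h and libcudart), not these two readings.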

On the WSL side: this is a very, very cut-down version of what I am facing, as a full synopsis would take all day to read. Anyway, here goes:
Fresh Ubuntu on WSL under Windows 10 (I think I am up to fresh install attempt 5 or 6). Installed packages make, automake, libtool, libgmp3-dev, and probably others that I forgot.

Cloned Zimmermann's ecm repository from GitLab.

Now for the CUDA part: the WSL instructions say not to install the drivers inside WSL, because the host's CUDA driver is passed through in some way. Okay, got it. I've tried installing both 11.7 and 11.2, but before I continue I must explain this: I was getting errors like "unable to locate package cuda" when trying to install it the recommended way. I've been told this is because of the GPG key, but I don't know whether I fixed it or not. Anyway, it's complicated and I had best move on.

After installing CUDA 11.2 via a local runfile (without the drivers, because it warns me not to install them, though I have also tried installing them and it makes no difference) and running libtoolize, autoconf, and then ./configure --enable-gpu, I get the error that cuda.h was not found. I have looked online and tried a lot of fixes (including updating PATH), but the only one that works is to run

$ sudo apt-get install nvidia-cuda-toolkit

However, now I get an error telling me that cuda.h and cudart are different versions, and I get the same results for nvcc --version and nvidia-smi as I do on the host. This may be because I have tried installing CUDA in so many different ways and versions while troubleshooting that I now have duplicate or clashing installs. Edit: I can confirm nvidia-smi just reads its value from the host, so I guess I don't know what the cudart version is. Does anyone know how to find it?
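
On finding the cudart version: the runtime library normally carries its version in the file name (for example libcudart.so.11.2.152), so listing every copy on disk shows both the version and any duplicate installs. A rough sketch, assuming the usual install prefixes; the function name list_cudart and the directory list are illustrative, not anything GMP-ECM provides:

```shell
#!/usr/bin/env bash
# List every CUDA runtime library found under the given directories.
# The version is encoded in the file name, e.g. libcudart.so.11.2.152.
list_cudart() {
    for dir in "$@"; do
        # Skip prefixes that do not exist on this machine.
        [ -d "$dir" ] || continue
        find "$dir" -name 'libcudart.so*' 2>/dev/null
    done | sort -u
}

# Candidate prefixes are assumptions; adjust them to your install.
list_cudart /usr/local /usr/lib /opt
```

More than one version in the output would be consistent with the "cuda.h and cudart are different versions" error, since configure may be picking the header from one install and the library from another.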

I actually did get the app to compile with no errors one time, but make check failed on test.gpuecm and running a GPU curve resulted in a kernel image error. I don't remember the details, but they are likely irrelevant, as I have wiped the install 5 times since then. Is there anything anyone knows I can do to fix this, or am I missing something obvious or doing something wrong? Maybe my troubleshooting itself is causing the failures, because I keep trying to install things in so many different ways, or maybe it is simply not compatible with WSL. Any help would be appreciated, as for the last 2 days I really have been pulling my teeth out over this.

Last fiddled with by Rubiksmath on 2022-09-28 at 03:50 Reason: new info
Old 2022-09-28, 05:16   #2
Rubiksmath
 
Sep 2022

47 Posts

Update: I have gotten a bit farther now by pointing directly to the cuda.h file in the configure step. However, now it reaches this point:

checking for cuInit in -lcuda... no
configure: could not find cuda lib

So I pointed configure to the CUDA lib directory with
--with-cuda-lib=/usr/local/cuda-11.7/targets/x86_64-linux/lib

Same error. Pointing to lib64 gives the same error again. So now I am genuinely stumped.
Old 2022-09-29, 02:15   #3
Rubiksmath
 
Sep 2022

47 Posts

Okay, so I am getting closer. I had to point to /mnt/c/windows/system32/lxss/lib to get the lib to be found, but now it throws the different-versions error again. I feel like the host is the problem, with this 11.2/11.7 duality. The trouble is that I wiped 11.7 clean, so I don't know where it is getting this from.
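
Some background that may explain why the toolkit directories failed: libcuda.so is the driver library (the one the cuInit check looks for), while libcudart is the toolkit's runtime, and under WSL the driver library comes from the Windows driver rather than from the toolkit install. The sketch below looks for the first directory that actually contains libcuda.so. The first candidate is the path reported above; /usr/lib/wsl/lib and /usr/lib/x86_64-linux-gnu are assumptions about where it may also appear, and first_dir_with is a made-up helper name:

```shell
#!/usr/bin/env bash
# Print the first directory (from the remaining arguments) that contains
# a file whose name starts with the given library name.
first_dir_with() {
    lib=$1
    shift
    for dir in "$@"; do
        for f in "$dir/$lib"*; do
            # -e is false for an unmatched glob and for a dangling symlink.
            if [ -e "$f" ]; then
                echo "$dir"
                return 0
            fi
        done
    done
    return 1
}

# First path is the one reported in this thread; the others are assumptions.
cuda_dir=$(first_dir_with libcuda.so \
        /mnt/c/windows/system32/lxss/lib \
        /usr/lib/wsl/lib \
        /usr/lib/x86_64-linux-gnu) \
    && echo "libcuda.so found in: $cuda_dir" \
    || echo "libcuda.so not found in the candidate directories"
```

The same helper can be called with libcudart.so to see whether the runtime lives in a different directory from the driver library.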
Old 2022-09-29, 04:07   #4
Rubiksmath
 
Sep 2022

47 Posts

Update: bruh, I wiped it all clean, removed all NVIDIA files, and did a full reinstall of everything NVIDIA, and nope, same error. So I officially don't have a clue.
Old 2022-09-29, 04:38   #5
Rubiksmath
 
Sep 2022

2F₁₆ Posts

Ok bruh, this is getting mysterious. I checked config.log for more info and it says it can't find -lcudart, so I add a symlink where it is looking for it. I do that, and now it says it can't find -lcuda, so I change the path I am pointing to to one which has that in it and make a symlink to libcudart in there. The error now reverts to "cannot find -lcudart". When I cd to the directory I put the symlink in, dir shows that libcudart.so is there, but apparently it can't be found: pressing tab after typing libcuda only yields libcuda.so, and find -name libcudart returns no results. I think I will have to give up at this point, seriously.
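
Two things may be going on here. find -name matches the whole file name, so find -name libcudart will not match libcudart.so; a wildcard such as 'libcudart*' is needed. And a symlink whose target does not exist still shows up in a directory listing, but tab completion and the linker both treat it as missing. A small sketch for telling these cases apart (link_status is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Report whether a path is missing, a plain file, a working symlink,
# or a dangling symlink (one whose target does not exist).
link_status() {
    if [ -L "$1" ] && [ ! -e "$1" ]; then
        echo "dangling symlink -> $(readlink "$1")"
    elif [ -L "$1" ]; then
        echo "symlink -> $(readlink -f "$1")"
    elif [ -e "$1" ]; then
        echo "exists (not a symlink)"
    else
        echo "missing"
    fi
}

# Inspect every libcudart entry in the current directory, if any.
for f in ./libcudart.so*; do
    [ -e "$f" ] || [ -L "$f" ] || continue
    echo "$f: $(link_status "$f")"
done
```

Run from the directory holding the symlink, a "dangling symlink" line would mean the link was created with a wrong or relative target.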
Old 2022-09-29, 04:57   #6
bsquared
 
"Ben"
Feb 2007

3×17×73 Posts

I haven't tried getting CUDA to work on my WSL install, and now I'm thinking I don't want to try.

The stuff you are trying makes sense... wish I could help.
Old 2022-09-29, 05:07   #7
Rubiksmath
 
Sep 2022

47 Posts

Quote:
Originally Posted by bsquared
I haven't tried getting Cuda to work on my WSL install, and now I'm thinking I don't want to try

The stuff you are trying makes sense... wish I could help.
Yeah, I may just have to bite the bullet and install Ubuntu on my computer as a dual boot (or more likely just install it and use the boot menu), because there are very well-written guides on how to get this done from a clean install, and at this point WSL isn't looking too viable (for me at least). As I've said, I've never seen anyone do it on WSL, and maybe this is why...
Old 2022-09-29, 10:34   #8
Rubiksmath
 
Sep 2022

47 Posts

Update: okay, so I bit the bullet and installed Ubuntu 22.04 LTS on my machine. Great. But even that is not enough! After installing the NVIDIA drivers with toolkit 11.7 (an ordeal in itself, with nouveau being disabled preventing boot), running autoreconf -i gives a ton of warnings about macros being obsolete and tells me to run autoupdate. I ran autoupdate; no change in the warnings. I was not getting these warnings on WSL. Any attempt to run ./configure predictably fails the instant it reaches one of these obsolete macros. What more can I possibly do? This setup is well-trodden, since lots of people have installed GMP-ECM with GPU support on Ubuntu, yet I still somehow get errors nobody else has ever gotten. I need a break; I'll be back in the morning for day #5 of trying to install this seemingly *insert rage here* software.
Old 2022-09-30, 00:56   #9
Rubiksmath
 
Sep 2022

57₈ Posts

Okay, so if anyone is wondering, here is the exact output from ./configure starting from the first error:

./configure line 4032: Some: command not found
./configure line 4616: syntax error near unexpected token `('
./configure line 4616: `case "(($ac_try" in'

What can I do to fix this?

Last fiddled with by Rubiksmath on 2022-09-30 at 00:59 Reason: making it 100% replica
Old 2022-09-30, 01:21   #10
Rubiksmath
 
Sep 2022

47 Posts

Alright, I finally did it. It works. I had to get version 7.0.5, not version 7.0.6-dev, which seems unstable. make check now passes for gpuecm. Woohoo. That was one heck of a journey.
Old 2022-09-30, 03:44   #11
Rubiksmath
 
Sep 2022

47 Posts

Okay, well, almost working. After a bit the program just says "Killed". From what I've read, that is caused by too much memory usage, and the suggested workaround is -param 0, but -param 0 is not compatible with GPU use. Can anyone tell me how to fix this? Thanks.
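
One option that may be worth trying instead of -param 0 is GMP-ECM's -maxmem option, which caps stage 2 memory use at a given number of MB. Whether it resolves the "Killed" message in this case is untested. The sketch below only derives a cap (half of the currently available memory, an arbitrary choice) and prints a command line; half_available_mb is a made-up helper, and N and B1 are placeholders for your own input:

```shell
#!/usr/bin/env bash
# Print half of MemAvailable, in MB, from a /proc/meminfo-style file.
# MemAvailable is given in kB, so dividing by 2048 halves it and converts to MB.
half_available_mb() {
    awk '/^MemAvailable:/ { print int($2 / 2048) }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    cap=$(half_available_mb)
    # Printed, not executed: N and B1 are placeholders.
    echo "echo N | ecm -gpu -maxmem $cap B1"
fi
```

If the kill really was the kernel's out-of-memory handler, the kernel log should contain a matching entry around the time the run died.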
All times are UTC.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
