mersenneforum.org


SELROC 2018-03-01 19:00

gpuowl: runtime error
 
This on debian buster,
Here's an extract of the program output:


Note: using short, fused carry and fused tail kernels
OpenCL compilation in 616 ms, with " -DEXP=84674341u -I. -cl-fast-relaxed-math -cl-kernel-arg-info "
PRP-3: FFT 5000K (625 * 4096 * 2) of 84674341 (16.54 bits/word) [2018-03-01 19:57:46 CET]
Starting at iteration 0
error -55 (fft4K)
gpuowl: clwrap.h:267: void run(cl_queue, cl_kernel, size_t, size_t, const string&): Assertion `check(clEnqueueNDRangeKernel(queue, kernel, 1, __null, &workSize, &groupSize, 0, __null, __null), name.c_str())' failed.
Aborted

preda 2018-03-01 20:17

-55 is "invalid work item size". Apparently the OpenCL system you're using does not support a 512 workgroup size.
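For reference, the numeric codes that appear in this thread correspond to constants in the OpenCL header `CL/cl.h`. The lookup below is a minimal sketch, not part of gpuowl: the helper names `describe_cl_error` and `fits_device` are made up for illustration, and the table lists only a handful of codes. The limit passed to `fits_device` is assumed to be whatever the driver reports (for example via clinfo).

```python
# A small subset of OpenCL status codes as defined in CL/cl.h.
CL_ERRORS = {
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -30: "CL_INVALID_VALUE",
    -48: "CL_INVALID_KERNEL",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
    -55: "CL_INVALID_WORK_ITEM_SIZE",
}


def describe_cl_error(code: int) -> str:
    """Return the symbolic name for a status code, if it is in the table."""
    return CL_ERRORS.get(code, f"unknown OpenCL error {code}")


def fits_device(group_size: int, device_limit: int) -> bool:
    """Hypothetical helper: does the requested size fit the reported limit?"""
    return 0 < group_size <= device_limit


print(describe_cl_error(-55))
print(fits_device(512, 256))
```

With a requested size of 512 and a reported limit of 256, `fits_device` returns False, which matches the failure described here.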

[QUOTE=SELROC;481250]This on debian buster,
Here's an extract of the program output:


Note: using short, fused carry and fused tail kernels
OpenCL compilation in 616 ms, with " -DEXP=84674341u -I. -cl-fast-relaxed-math -cl-kernel-arg-info "
PRP-3: FFT 5000K (625 * 4096 * 2) of 84674341 (16.54 bits/word) [2018-03-01 19:57:46 CET]
Starting at iteration 0
error -55 (fft4K)
gpuowl: clwrap.h:267: void run(cl_queue, cl_kernel, size_t, size_t, const string&): Assertion `check(clEnqueueNDRangeKernel(queue, kernel, 1, __null, &workSize, &groupSize, 0, __null, __null), name.c_str())' failed.
Aborted[/QUOTE]

SELROC 2018-03-01 20:20

[QUOTE=preda;481254]-55 is "invalid work item size". Apparently the OpenCL system you're using does not support a 512 workgroup size.[/QUOTE]

Thanks, what should I do to make the program work ?

perhaps install a different opencl package ?

SELROC 2018-03-01 20:47

[QUOTE=SELROC;481255]Thanks, what should I do to make the program work ?

perhaps install a different opencl package ?[/QUOTE]
I can modify the program if this is necessary, but I would need guidance from the author

SELROC 2018-03-02 08:25

[QUOTE=preda;481254]-55 is "invalid work item size". Apparently the OpenCL system you're using does not support a 512 workgroup size.[/QUOTE]

Indeed, the max work group size is 256.

How should I modify the program to make it work with this hardware?

Thank you

preda 2018-03-02 08:32

[QUOTE=SELROC;481314]Indeed, the max work group size is 256.

How should I modify the program to make it work with this hardware?

Thank you[/QUOTE]

There's no easy way (to use workgroup 256 in this situation) I can think of.

If you install amdgpu-pro or ROCm, you should be able to use WG up to 1024.

In the next update I'll try to move back to 256.

SELROC 2018-03-02 08:50

[QUOTE=preda;481315]There's no easy way (to use workgroup 256 in this situation) I can think of.

If you install amdgpu-pro or ROCm, you should be able to use WG up to 1024.

In the next update I'll try to move back to 256.[/QUOTE]


Thanks very much, I look forward to the mods.

selroc

SELROC 2018-03-02 17:04

[QUOTE=preda;481315]There's no easy way (to use workgroup 256 in this situation) I can think of.

If you install amdgpu-pro or ROCm, you should be able to use WG up to 1024.

In the next update I'll try to move back to 256.[/QUOTE]

Somehow I got the program to work by reinstalling a fresh debian testing and amdgpu-pro

It is running right now

SELROC 2018-03-02 17:22

[QUOTE=SELROC;481366]Somehow I got the program to work by reinstalling a fresh debian testing and amdgpu-pro

It is running right now[/QUOTE]



gpuOwL v2.0--mod GPU Mersenne primality checker
Ellesmere-36x1360-@4:0.0 Radeon RX 580 Series
Note: using short, fused carry and fused tail kernels
OpenCL compilation in 628 ms, with " -DEXP=84701459u -I. -cl-fast-relaxed-math -cl-kernel-arg-info "
PRP-3: FFT 5000K (625 * 4096 * 2) of 84701459 (16.54 bits/word) [2018-03-02 18:05:01 CET]
Starting at iteration 142500
OK 142500 / 84701459 [ 0.17%], 0.00 ms/it [0.00, 0.00], check 3.73s; ETA 0d 00:00; 821d3202550d3c23 [18:05:06]
OK 143000 / 84701459 [ 0.17%], 5.29 ms/it [5.29, 5.29], check 3.33s; ETA 5d 04:21; 71d0fd7863001c6d [18:05:11]
OK 144000 / 84701459 [ 0.17%], 5.00 ms/it [4.70, 5.29], check 3.32s; ETA 4d 21:24; 5f059bf89d226260 [18:05:20]
OK 145000 / 84701459 [ 0.17%], 5.01 ms/it [4.71, 5.30], check 3.32s; ETA 4d 21:36; 5055eab68c2355f2 [18:05:28]
OK 150000 / 84701459 [ 0.18%], 4.77 ms/it [4.71, 5.31], check 3.33s; ETA 4d 16:01; 6dc264064ce6830b [18:05:55]
OK 160000 / 84701459 [ 0.19%], 4.84 ms/it [4.72, 6.13], check 3.37s; ETA 4d 17:45; 7bd9b48b95f55663 [18:06:47]
OK 170000 / 84701459 [ 0.20%], 4.79 ms/it [4.72, 5.47], check 3.37s; ETA 4d 16:34; 4504231bf5b0b0af [18:07:38]
OK 180000 / 84701459 [ 0.21%], 4.80 ms/it [4.73, 5.49], check 3.37s; ETA 4d 16:46; 9500155d9891ec05 [18:08:30]
OK 200000 / 84701459 [ 0.24%], 4.79 ms/it [4.73, 5.38], check 3.40s; ETA 4d 16:20; 1cfb4f382ad7729b [18:10:09]
OK 220000 / 84701459 [ 0.26%], 4.79 ms/it [4.73, 5.40], check 3.38s; ETA 4d 16:21; 0d2bb5f1d3f7f8f0 [18:11:48]
OK 240000 / 84701459 [ 0.28%], 4.79 ms/it [4.73, 5.40], check 3.37s; ETA 4d 16:20; 009378cb131480c5 [18:13:27]
OK 260000 / 84701459 [ 0.31%], 4.82 ms/it [4.74, 5.41], check 3.38s; ETA 4d 17:04; 3c8ca29ba50ac4f6 [18:15:07]
OK 300000 / 84701459 [ 0.35%], 4.79 ms/it [4.73, 5.52], check 3.38s; ETA 4d 16:23; 2cef54e6f19b57f3 [18:18:22]
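The ETA column in the log above can be roughly cross-checked from the iteration count and the ms/it figure. This is only a sketch of the arithmetic, assuming the ETA is simply remaining iterations times the average cost per iteration; how gpuowl actually computes and rounds it is not shown in this thread, and the function name `eta` is made up.

```python
def eta(total_iters: int, done_iters: int, ms_per_iter: float) -> str:
    """Format the remaining time as 'Nd HH:MM' from a per-iteration cost in ms."""
    seconds = int((total_iters - done_iters) * ms_per_iter / 1000)
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    return f"{days}d {hours:02d}:{rest // 60:02d}"


# Values taken from the last log line: iteration 300000 of 84701459 at 4.79 ms/it.
print(eta(84701459, 300000, 4.79))  # 4d 16:18
```

The result is within a few minutes of the logged "4d 16:23"; the small gap is consistent with the displayed ms/it being rounded to two decimals.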

SELROC 2018-03-03 09:22

One thing I notice with two instances of gpuowl running: one instance gets stuck and the only way to stop it is reboot

SELROC 2018-03-10 15:02

[QUOTE=SELROC;481451]One thing I notice with two instances of gpuowl running: one instance gets stuck and the only way to stop it is reboot[/QUOTE]

Hello Mihai, have you attempted yet to reproduce the error ?

I have reinstalled debian-testing with amdgpu-pro and still getting the same error: if two instances of gpuowl are launched, the first remains in a blocked state and we can only reboot to stop it.

However, the normal reboot will not work (with a message: "watchdog did not stop") and we can only switch off the power to reboot.

My GPU hardware is Radeon RX 580

kriesel 2018-03-10 17:11

[QUOTE=SELROC;481995]Hello Mihai, have you attempted yet to reproduce the error ?

I have reinstalled debian-testing with amdgpu-pro and still getting the same error: if two instances of gpuowl are launched, the first remains in a blocked state and we can only reboot to stop it.

However, the normal reboot will not work (with a message: "watchdog did not stop") and we can only switch off the power to reboot.

My GPU hardware is Radeon RX 580[/QUOTE]

Yikes. Not a problem on Windows 7. If gpuowl and mfakto are accidentally run on the same RX 550 gpu at the same time, the system eventually reboots itself. First, gpuowl v1.9 is simply stalled a while.

SELROC 2018-03-10 18:49

[QUOTE=kriesel;482005]Yikes. Not a problem on Windows 7. If gpuowl and mfakto are accidentally run on the same RX 550 gpu at the same time, the system eventually reboots itself. First, gpuowl v1.9 is simply stalled a while.[/QUOTE]

I run two instances with -device parameter set differently. One instance gpu 0 and other instance gpu 1.

The first instance will stall, and cannot be stopped with ^C

kriesel 2018-03-10 21:35

[QUOTE=SELROC;482012]I run two instances with -device parameter set differently. One instance gpu 0 and other instance gpu 1.

The first instance will stall, and cannot be stopped with ^C[/QUOTE]

Right now I'm running mfakto, cllucas, and gpuowl on 3 RX550s in the same system. That's using "in" loosely, since due to PCIe slot spacing one is perched atop the tower case and connected via a 1x-16x extender. I see about a 3% speed penalty in gpuOwL v1.9 with the extender.

I have seen occasional gpuOwL stalls; the gpu load is displayed as 100% via GPU-Z, and the progress in the console and log has stopped. I don't recall if that was v1.8 or 1.9.

Are you able to read the gpu sensor values ok on linux? Gpu core clock, gpu memory clock, and temperature get disabled on Windows 7 in GPU-Z on the RX550s when using Windows Remote Desktop, but not when using the local console or VNC remote access.

preda 2018-03-11 02:17

[QUOTE=SELROC;481995]Hello Mihai, have you attempted yet to reproduce the error ?

I have reinstalled debian-testing with amdgpu-pro and still getting the same error: if two instances of gpuowl are launched, the first remains in a blocked state and we can only reboot to stop it.

However, the normal reboot will not work (with a message: "watchdog did not stop") and we can only switch off the power to reboot.

My GPU hardware is Radeon RX 580[/QUOTE]
No I haven't looked into this yet, sorry. It seems to be a driver problem. When this happens, if you're on Linux, could you look at the system error log with "dmesg", optionally filtering it like "dmesg | grep -i amd"? If you see errors there, that likely confirms a driver problem.

SELROC 2018-03-11 09:37

1 Attachment(s)
[QUOTE=preda;482049]No I haven't looked into this yet, sorry. It seems to be a driver problem. When this happens, if you're on Linux, could you look at the system error log with "dmesg", optionally filtering it like "dmesg | grep -i amd"? If you see errors there, that likely confirms a driver problem.[/QUOTE]

I have looked at the dmesg output, only a couple of lines report something that looks like an error "kfd not supported on this ASIC", the rest of the lines look pretty normal. But just in case I attach the output.
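The filtering step suggested above ("dmesg | grep -i amd") can also be done from Python when the log has already been captured as text. This is a minimal sketch; the function name `driver_lines` and the default pattern are assumptions, and the sample lines are made up for illustration (only the "kfd not supported on this ASIC" wording comes from this thread).

```python
import re
from typing import List


def driver_lines(dmesg_text: str, pattern: str = r"amd|kfd") -> List[str]:
    """Return the lines that match the pattern, ignoring case (like grep -i)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [line for line in dmesg_text.splitlines() if rx.search(line)]


# Illustrative input only; timestamps and prefixes are invented.
sample = "\n".join([
    "[    1.000000] usb 1-1: new high-speed USB device",
    "[    2.000000] [drm] amdgpu kernel modesetting enabled.",
    "[    3.000000] kfd kfd: kfd not supported on this ASIC",
])

for line in driver_lines(sample):
    print(line)
```

To run it against the live log, the text could be obtained with `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`, which may need root on some systems.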

preda 2018-03-11 11:09

Yes it looks fine. And you don't get anything new in dmesg when the lock happens..? Interesting. I still can't really imagine how the OpenCL app can lock the process, much less the whole OS, unless some problem happens in deeper layers (kernel/driver).

[QUOTE=SELROC;482059]I have looked at the dmesg output, only a couple of lines report something that looks like an error "kfd not supported on this ASIC", the rest of the lines look pretty normal. But just in case I attach the output.[/QUOTE]

SELROC 2018-03-11 11:26

[QUOTE=preda;482061]Yes it looks fine. And you don't get anything new in dmesg when the lock happens..? Interesting. I still can't really imagine how the OpenCL app can lock the process, much less the whole OS, unless some problem happens in deeper layers (kernel/driver).[/QUOTE]

I am doing another test and waiting for more time before shutting down the system, to see if any other messages are generated in dmesg.

kladner 2018-03-11 11:34

Please forgive me if I am misunderstanding your description, but are you running two instances on a single GPU? I am not running AMD hardware, but I do run two mfaktc instances on an Nvidia card. In my circumstances,[U] the instances run from different directories, but both are using -d 0.[/U] If I used -d 1 for one instance, it would run on my other GPU. I think with a single GPU, calling -d 1 would cause an error.

Could this be causing the difficulty?

SELROC 2018-03-11 11:53

[QUOTE=kladner;482064]Please forgive me if I am misunderstanding your description, but are you running two instances on a single GPU? I am not running AMD hardware, but I do run two mfaktc instances on an Nvidia card. In my circumstances,[U] the instances run from different directories, but both are using -d 0.[/U] If I used -d 1 for one instance, it would run on my other GPU. I think with a single GPU, calling -d 1 would cause an error.

Could this be causing the difficulty?[/QUOTE]


I have 2 GPU, 0 and 1

SELROC 2018-03-11 13:40

[QUOTE=SELROC;482063]I am doing another test and waiting for more time before shutting down the system, to see if any other messages are generated in dmesg.[/QUOTE]

Well, nothing else showed up in dmesg

Investigating further...

kladner 2018-03-11 21:16

[QUOTE=SELROC;482065]I have 2 GPU, 0 and 1[/QUOTE]
Sorry. I did not catch the 2 GPU part. I guess, out of habit, that I use "instance" in this context when speaking of more than one on the same card, though I know this is too limited a meaning.

preda 2018-03-12 00:03

Did you consider trying ROCm?
[url]https://github.com/RadeonOpenCompute/ROCm[/url]

(in my opinion it's at least on par with amdgpu-pro performance-wise)

It may be interesting to see if it encounters the problem in the same way.
(OTOH this may be too much trouble just to debug this issue).

[QUOTE=SELROC;482063]I am doing another test and waiting for more time before shutting down the system, to see if any other messages are generated in dmesg.[/QUOTE]

SELROC 2018-03-12 09:16

[QUOTE=preda;482110]Did you consider trying ROCm?
[URL]https://github.com/RadeonOpenCompute/ROCm[/URL]

(in my opinion it's at least on par with amdgpu-pro performance-wise)

It may be interesting to see if it encounters the problem in the same way.
(OTOH this may be too much trouble just to debug this issue).[/QUOTE]

I am doing a test with amdgpu-pro (which includes ROCm in the packages). The problem seems to be in the BIOS settings on the motherboard, so I apologize for raising an issue which maybe isn't in the software area...

Later on I will post my findings

SELROC 2018-03-13 09:40

[QUOTE=SELROC;482137]I am doing a test with amdgpu-pro (which includes ROCm in the packages). The problem seems to be in the BIOS settings on the motherboard, so I apologize for raising an issue which maybe isn't in the software area...

Later on I will post my findings[/QUOTE]

I went ahead and started another test from scratch, this time with ROCm from the official repo...

SELROC 2018-03-13 17:32

[QUOTE=SELROC;482221]I went ahead and started another test from scratch, this time with ROCm from the official repo...[/QUOTE]

with ROCm I get an error -30 and gpuowl does not run

the clinfo utility gives a message: NULL platform behavior

still trying to figure out which opencl package is good to install with ROCm

..I'm back to trying with amdgpu-pro

SELROC 2018-03-14 16:17

[QUOTE=SELROC;482257]with ROCm I get an error -30 and gpuowl does not run

the clinfo utility gives a message: NULL platform behavior

still trying to figure out which opencl package is good to install with ROCm

..I'm back to trying with amdgpu-pro[/QUOTE]


I have resolved it. It was a hardware issue.

SELROC 2018-09-22 07:23

[QUOTE=SELROC;481451]One thing I notice with two instances of gpuowl running: one instance gets stuck and the only way to stop it is reboot[/QUOTE]


Two issues have been isolated here:


1) the GPU riser cards must occupy contiguous PCIe slots, with no empty slots in between,


2) the processor C-states BIOS setting must be disabled to avoid low-power states.

preda 2018-09-22 10:18

[QUOTE=SELROC;496555]Two issues have been isolated here:

1) the GPU riser cards must occupy contiguous PCIe slots, with no empty slots in between,

2) the processor C-states BIOS setting must be disabled to avoid low-power states.[/QUOTE]

You may want to report these, if they only happen with ROCm but not with amdgpu-pro -- then maybe report to "rocm issues" on github, so that they know about it.

SELROC 2018-09-22 16:09

[QUOTE=preda;496559]You may want to report these, if they only happen with ROCm but not with amdgpu-pro -- then maybe report to "rocm issues" on github, so that they know about it.[/QUOTE]


I was expecting more collaboration from the ROCm team. They just said that mine is "unsupported hardware" and closed the issue.

moebius 2020-08-30 23:25

I have now finally managed to create an executable binary for colab-Ubuntu from the preda repository. Unfortunately, this error occurs while running. What is wrong?

[B][SIZE="1"]2020-08-30 23:04:28 gpuowl v6.11-380-g79ea0cc-dirty
2020-08-30 23:04:28 Note: not found 'config.txt'
2020-08-30 23:04:28 device 0, unique id ''
2020-08-30 23:04:28 Tesla P100-PCIE-16GB-0 104930401 FFT: 5.50M 1K:11:256 (18.19 bpw)
2020-08-30 23:04:28 Tesla P100-PCIE-16GB-0 Expected maximum carry32: 50950000
2020-08-30 23:04:29 Tesla P100-PCIE-16GB-0 OpenCL args "-DEXP=104930401u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DPM1=0 -DMM2_CHAIN=1u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.7ee28e7ec46ep-1 -DIWEIGHT_STEP_MINUS_1=-0x1.b620c8c81195dp-2 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-08-30 23:04:29 Tesla P100-PCIE-16GB-0

2020-08-30 23:04:29 Tesla P100-PCIE-16GB-0 OpenCL compilation in 0.00 s
2020-08-30 23:04:29 Tesla P100-PCIE-16GB-0 Exception gpu_error: INVALID_KERNEL clSetKernelArg(k, pos, sizeof(value), &value) at clwrap.h:77 setArg
2020-08-30 23:04:29 Tesla P100-PCIE-16GB-0 Bye[/SIZE][/B]

moebius 2020-08-31 09:04

[QUOTE=preda;496559.].....[/QUOTE]
Maybe preda can say something about it, I used the notebook from Kriesel to build

ATH 2020-08-31 10:12

Are your worktodo.txt lines of this format?:
PRP=<AID>,1,2,<exponent>,-1,75,0
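To check a worktodo.txt line against that shape before starting a run, something like the following could be used. It is only a sketch: the function and class names are made up, the field names k, b, n, c are my reading of the usual k*b^n+c convention rather than anything stated in this thread, the trailing fields are kept uninterpreted, and the assignment ID in the example is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class PrpLine:
    aid: str
    k: int
    b: int
    exponent: int
    c: int
    extra: tuple  # trailing fields ("75" and "0" in the example), left as strings


def parse_prp(line: str) -> PrpLine:
    """Parse a 'PRP=<AID>,1,2,<exponent>,-1,...' line; raise ValueError otherwise."""
    kind, _, rest = line.strip().partition("=")
    if kind != "PRP":
        raise ValueError(f"not a PRP line: {line!r}")
    fields = rest.split(",")
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")
    aid, k, b, n, c = fields[:5]
    return PrpLine(aid, int(k), int(b), int(n), int(c), tuple(fields[5:]))


# Placeholder AID; the exponent is the one from the log earlier in the thread.
print(parse_prp("PRP=0123456789ABCDEF0123456789ABCDEF,1,2,104930401,-1,75,0"))
```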


You can also try this notebook which compiles gmp-6.2.0 and gpuowl with gcc9 in /root folder and copies the executable "gpuowl" to the root of the Google Drive, I just tested it:


[CODE]
import subprocess
import os

from google.colab import drive

if not os.path.exists('/content/drive/My Drive'):
    drive.mount('/content/drive')

%cd ~
!sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test
!sudo apt install -y gcc-9
!sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 800
!sudo apt install -y g++-9
!sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-9 800
!sudo apt-get -y install lzip
!sudo apt-get -y install m4
!sudo apt-get -y install libtool
!sudo apt-get -y install subversion
!sudo apt-get -y install make
!sudo apt-get -y install autoconf
!sudo apt-get -y install automake
!sudo wget https://gmplib.org/download/gmp/gmp-6.2.0.tar.lz
!sudo tar --lzip -xvf gmp-6.2.0.tar.lz
%cd gmp-6.2.0
!./configure ABI=64 CC=gcc CFLAGS="-O3 -m64 -mavx -mavx2" --build=x86_64-pc-linux-gnu --enable-cxx --enable-static --disable-shared
!make
!sudo make install

%cd ..
!git clone https://github.com/preda/gpuowl
%cd gpuowl
!make gpuowl
!cp gpuowl '/content/drive/My Drive/'

!cp /usr/lib/x86_64-linux-gnu/libstdc* '/content/drive/My Drive/'

[/CODE]

Last line will copy the 2 libstdc* files to your Google Drive.
Each time you need to run gpuowl without compiling it, you will need this line first (after connecting to Google Drive) to copy them back to the correct folder:

[CODE]!cp /content/drive/My\ Drive/libstdc* /usr/lib/x86_64-linux-gnu/[/CODE]

moebius 2020-08-31 11:14

[SIZE="1"][QUOTE=ATH;555507] I just tested it:[/QUOTE][/SIZE]
Unfortunately then this error message comes up,

[COLOR="Red"]/content/drive/My Drive/gpuowl-master/gpuowl.exe: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /content/drive/My Drive/gpuowl-master/gpuowl.exe)[/COLOR]

[B]I tried the following without success, Maybe someone has a binary that works in general, I would be happy with it[/B]

!sudo add-apt-repository ppa:ubuntu-toolchain-r/test
!sudo apt-get update
!sudo apt-get install gcc-4.9 g++-4.9
!sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 60 --slave /usr/bin/g++ g++ /usr/bin/g++-4.9

preda 2020-08-31 12:13

[QUOTE=moebius;555490]2020-08-30 23:04:29 Tesla P100-PCIE-16GB-0 OpenCL compilation in 0.00 s
2020-08-30 23:04:29 Tesla P100-PCIE-16GB-0 Exception gpu_error: INVALID_KERNEL clSetKernelArg(k, pos, sizeof(value), &value) at clwrap.h:77 setArg
2020-08-30 23:04:29 Tesla P100-PCIE-16GB-0 Bye[/QUOTE]

Sorry I don't really know how to help there. The time of the OpenCL compilation (0.00s) is suspiciously fast. Maybe something didn't go well with the OpenCL compilation, but there is no error reported at that point. Later on, when we try to do something with a kernel, it is not found. I personally don't have experience either with the platform or with the particular GPU.

ATH 2020-08-31 12:35

[QUOTE=moebius;555511][SIZE="1"][/SIZE]
Unfortunately then this error message comes up,

[COLOR="Red"]/content/drive/My Drive/gpuowl-master/gpuowl.exe: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /content/drive/My Drive/gpuowl-master/gpuowl.exe)[/COLOR]

[B]I tried the following without success, Maybe someone has a binary that works in general, I would be happy with it[/B]

!sudo add-apt-repository ppa:ubuntu-toolchain-r/test
!sudo apt-get update
!sudo apt-get install gcc-4.9 g++-4.9
!sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 60 --slave /usr/bin/g++ g++ /usr/bin/g++-4.9[/QUOTE]

Sorry I forgot about those files. On the instance you compiled it on (or compile it again) add a line after compiling:
!cp /usr/lib/x86_64-linux-gnu/libstdc* '/content/drive/My Drive/'

Now those 2 files are in the root of your Google Drive, so each time you run gpuowl without compiling it, you need to add the lines:
%cd '/content/drive/My Drive'
!cp libstdc* /usr/lib/x86_64-linux-gnu/

to copy the files back to that folder before starting gpuowl.
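To see whether a given libstdc++.so.6 actually carries the version the loader is asking for, one can scan the file for the version strings, much like running `strings` on it and filtering for GLIBCXX. The sketch below assumes that finding the string in the file means the version is provided, which is a heuristic rather than a real ELF parse; the function names are made up.

```python
import os
import re


def glibcxx_versions(path: str) -> set:
    """Collect every GLIBCXX_x.y[.z] string found in the file's bytes."""
    with open(path, "rb") as f:
        data = f.read()
    return {m.decode("ascii") for m in re.findall(rb"GLIBCXX_\d+(?:\.\d+)+", data)}


def provides(path: str, wanted: str = "GLIBCXX_3.4.26") -> bool:
    """Heuristic check that the library mentions the wanted version string."""
    return wanted in glibcxx_versions(path)


lib = "/usr/lib/x86_64-linux-gnu/libstdc++.so.6"
if os.path.exists(lib):
    print(provides(lib))
```

If this prints False for the system copy but True for the copy saved on Google Drive, the saved files have not been copied back yet.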

moebius 2020-08-31 16:20

[QUOTE=ATH;555519]Sorry I forgot about those files. [/QUOTE]

Thanks, I rebuilt it and the binary seems to be good now.

[SIZE="1"][B]/content/drive/My Drive
cp: cannot stat 'libstdc*': No such file or directory

2020-08-31 16:15:20 gpuowl v6.11-380-g79ea0cc
2020-08-31 16:15:20 Note: not found 'config.txt'
2020-08-31 16:15:20 device 0, unique id ''
2020-08-31 16:15:20 Tesla P100-PCIE-16GB-0 104930401 FFT: 5.50M 1K:11:256 (18.19 bpw)
2020-08-31 16:15:20 Tesla P100-PCIE-16GB-0 Expected maximum carry32: 50950000
2020-08-31 16:15:21 Tesla P100-PCIE-16GB-0 OpenCL args "-DEXP=104930401u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DPM1=0 -DMM2_CHAIN=1u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.7ee28e7ec46ep-1 -DIWEIGHT_STEP_MINUS_1=-0x1.b620c8c81195dp-2 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0

2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 OpenCL compilation in 1.77 s
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 104930401 OK 96702000 loaded: blockSize 400, 180c05fad6c9c2cb
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 validating proof residues for power 8
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 Can't open './104930401/proof/409885' (mode 'rb')
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 validating proof residues for power 9
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 Can't open './104930401/proof/204943' (mode 'rb')
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 validating proof residues for power 8
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 Can't open './104930401/proof/409885' (mode 'rb')
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 validating proof residues for power 7
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 Can't open './104930401/proof/819769' (mode 'rb')
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 validating proof residues for power 6
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 Can't open './104930401/proof/1639538' (mode 'rb')
2020-08-31 16:15:23 Tesla P100-PCIE-16GB-0 Proof disabled because of missing checkpoints
2020-08-31 16:15:25 Tesla P100-PCIE-16GB-0 104930401 OK 96702800 92.16%; 1009 us/it; ETA 0d 02:18; aa440da56e62c525 (check 0.56s)
2020-08-31 16:17:04 Tesla P100-PCIE-16GB-0 104930401 OK 96800000 92.25%; 1011 us/it; ETA 0d 02:17; cb6f78c6b55a7c0f (check 0.57s)
2020-08-31 16:17:38 Tesla P100-PCIE-16GB-0 Stopping, please wait..
2020-08-31 16:17:39 Tesla P100-PCIE-16GB-0 104930401 OK 96834400 92.28%; 1011 us/it; ETA 0d 02:16; 9e337d1cc0d9aff1 (check 0.56s)
2020-08-31 16:17:39 Tesla P100-PCIE-16GB-0 Exiting because "stop requested"
2020-08-31 16:17:39 Tesla P100-PCIE-16GB-0 Bye[/B][/SIZE]

frmky 2020-09-07 03:57

So you did get this to work with nVidia? I tried compiling using g++-8 and clang++-9 on Ubuntu 18.04 and the CUDA 11 OpenCL library, but whenever I run it on a Tesla K20 gpu I get the same error, INVALID_KERNEL clSetKernelArg(k, pos, sizeof(value), &value) at clwrap.h:77 setArg. I think I'm about ready to give up.

preda 2020-09-07 06:34

[QUOTE=frmky;556303]So you did get this to work with nVidia? I tried compiling using g++-8 and clang++-9 on Ubuntu 18.04 and the CUDA 11 OpenCL library, but whenever I run it on a Tesla K20 gpu I get the same error, INVALID_KERNEL clSetKernelArg(k, pos, sizeof(value), &value) at clwrap.h:77 setArg. I think I'm about ready to give up.[/QUOTE]

moebius' post #37 [url]https://mersenneforum.org/showpost.php?p=555557&postcount=37[/url] indicates that he got it to work, yes. The problem likely was something to do with the generated files "gpuowl-wrap.cpp" or "gpuowl-expanded.cl" that are produced during build (see Makefile) and contain the kernels source. Probably some of these files were empty (no kernel), which explains the instantaneous OpenCL compilation and the subsequent "kernel not found" error.
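A quick way to test that hypothesis before running the binary is to look for missing or zero-length generated files in the build directory. This is a minimal sketch using only the two file names mentioned in the post above; the function name is made up and nothing here is part of the gpuowl Makefile.

```python
import os

# File names as given in the post above.
GENERATED = ("gpuowl-wrap.cpp", "gpuowl-expanded.cl")


def suspicious_files(build_dir: str, names=GENERATED) -> list:
    """Return the generated files that are missing or empty in build_dir."""
    bad = []
    for name in names:
        path = os.path.join(build_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            bad.append(name)
    return bad


print(suspicious_files("."))
```

Any name in the returned list is a candidate for deletion followed by a rebuild.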

moebius 2020-09-07 08:31

1 Attachment(s)
[QUOTE=frmky;556303]So you did get this to work with nVidia?[/QUOTE]
[B][SIZE="2"]Here is the executable for colab with the libstdc files [/SIZE][/B]

frmky 2020-09-07 09:12

[QUOTE=moebius;556311][B][SIZE="2"]Here is the executable for colab with the libstdc files [/SIZE][/B][/QUOTE]

Thank you! That works on Ubuntu 18.04. :smile:

frmky 2020-09-07 09:26

[QUOTE=preda;556308]The problem likely was something to do with the generated files "gpuowl-wrap.cpp" or "gpuowl-expanded.cl" that are produced during build (see Makefile) and contain the kernels source. Probably some of these files were empty (no kernel), which explains the instantaneous OpenCL compilation and the subsequent "kernel not found" error.[/QUOTE]
Yes, that was in fact the problem. gpuowl-expanded.cl was in fact empty. I deleted these files and recompiled, and it now works fine. Perhaps a make clean should delete these? Thanks!

And in the double-check range, gpuowl appears faster than CUDALucas on the K20. With a 54M exponent, I'm getting 2.6 ms/iter with gpuowl and 3.2 ms/iter with CUDALucas.

petrw1 2020-09-17 05:28

[QUOTE=ATH;555507]Are your worktodo.txt lines of this format?:
PRP=<AID>,1,2,<exponent>,-1,75,0


You can also try this notebook which compiles gmp-6.2.0 and gpuowl with gcc9 in /root folder and copies the executable "gpuowl" to the root of the Google Drive, I just tested it:


[CODE]
import subprocess
import os

from google.colab import drive

if not os.path.exists('/content/drive/My Drive'):
    drive.mount('/content/drive')

%cd ~
!sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test
!sudo apt install -y gcc-9
!sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 800
!sudo apt install -y g++-9
!sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-9 800
!sudo apt-get -y install lzip
!sudo apt-get -y install m4
!sudo apt-get -y install libtool
!sudo apt-get -y install subversion
!sudo apt-get -y install make
!sudo apt-get -y install autoconf
!sudo apt-get -y install automake
!sudo wget https://gmplib.org/download/gmp/gmp-6.2.0.tar.lz
!sudo tar --lzip -xvf gmp-6.2.0.tar.lz
%cd gmp-6.2.0
!./configure ABI=64 CC=gcc CFLAGS="-O3 -m64 -mavx -mavx2" --build=x86_64-pc-linux-gnu --enable-cxx --enable-static --disable-shared
!make
!sudo make install

%cd ..
!git clone https://github.com/preda/gpuowl
%cd gpuowl
!make gpuowl
!cp gpuowl '/content/drive/My Drive/'

!cp /usr/lib/x86_64-linux-gnu/libstdc* '/content/drive/My Drive/'

[/CODE][/QUOTE]

I get this error....is it familiar?

./gpuowl.exe: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./gpuowl.exe)

moebius 2020-09-17 05:49

[QUOTE=petrw1;557195]I get this error....is it familiar?
./gpuowl.exe: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./gpuowl.exe)[/QUOTE]

Read post #36 in this thread.
[URL="https://mersenneforum.org/showthread.php?p=555511#post555511"]https://mersenneforum.org/showthread.php?p=555511#post555511[/URL]

petrw1 2020-09-17 06:00

[QUOTE=moebius;557198]Read post #36 in this thread.
[URL="https://mersenneforum.org/showthread.php?p=555511#post555511"]https://mersenneforum.org/showthread.php?p=555511#post555511[/URL][/QUOTE]

Your compiled version is a few posts later.
I tried that one first and got the same error.

Just now I tried adding the files as ATH suggested and still same error.

Must be an ID-10-T error on my part.

I put all of this into a subfolder to reduce clutter from previous attempts. This is what I have:

/content/drive/My Drive/HOOT
ls -l ./
total 4691
-rw------- 1 root root 49 Sep 17 04:49 config.txt
-rwx------ 1 root root 988648 Sep 17 05:11 gpuowl.exe
-rw------- 1 root root 416 Sep 17 04:55 gpuowl.log
-rw------- 1 root root 870 Sep 17 05:57 gpuowllog.txt
-rw------- 1 root root 1903624 Sep 17 05:55 libstdc++.so.6
-rw------- 1 root root 1903624 Sep 17 05:55 libstdc++.so.6.0.28
drwx------ 2 root root 4096 Sep 17 04:55 proof
-rw------- 1 root root 0 Sep 17 04:55 results.txt
-rwx------ 1 root root 50 Sep 17 05:10 worktodo.txt

petrw1 2020-09-17 06:05

[QUOTE=petrw1;557199]Your compiled version is a few posts later.
I tried that one first and got the same error.

Just now I tried adding the files as ATH suggested and still same error.

Must be an ID-10-T error on my part.

I put all of this into a subfolder to reduce clutter from previous attempts. This is what I have:

/content/drive/My Drive/HOOT
ls -l ./
total 4691
-rw------- 1 root root 49 Sep 17 04:49 config.txt
-rwx------ 1 root root 988648 Sep 17 05:11 gpuowl.exe
-rw------- 1 root root 416 Sep 17 04:55 gpuowl.log
-rw------- 1 root root 870 Sep 17 05:57 gpuowllog.txt
-rw------- 1 root root 1903624 Sep 17 05:55 libstdc++.so.6
-rw------- 1 root root 1903624 Sep 17 05:55 libstdc++.so.6.0.28
drwx------ 2 root root 4096 Sep 17 04:55 proof
-rw------- 1 root root 0 Sep 17 04:55 results.txt
-rwx------ 1 root root 50 Sep 17 05:10 worktodo.txt[/QUOTE]


Hmmm ... seems to be running after all... YIPPEE

moebius 2020-09-17 07:13

If you use several Google accounts on Colab and the new gpuowl version that generates .proof files, you have to transfer all temporary files in the proof folder.

To do this, I zip the entire gpuowl directory for download using the following command.

[B]!zip -r '/content/drive/My Drive/Directory.zip' '/content/drive/My Drive/gpuowl-master'[/B]

and then the following to unzip the files back into the correct directory after uploading

[B]!unzip -o -d '/' '/content/drive/My Drive/Directory.zip'[/B]
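The same round trip can be sketched with Python's standard library instead of the zip and unzip commands. The helper names `pack` and `unpack` are made up; the archive stores paths relative to `root_dir`, which mirrors how the commands above store the full path and restore it under '/'.

```python
import shutil


def pack(root_dir: str, rel_dir: str, archive_base: str) -> str:
    """Zip root_dir/rel_dir with paths relative to root_dir; return the .zip path."""
    return shutil.make_archive(archive_base, "zip", root_dir=root_dir, base_dir=rel_dir)


def unpack(archive: str, root_dir: str) -> None:
    """Extract the archive under root_dir, overwriting existing files."""
    shutil.unpack_archive(archive, root_dir, "zip")
```

Under those assumptions, `pack("/", "content/drive/My Drive/gpuowl-master", "/content/drive/My Drive/Directory")` followed later by `unpack("/content/drive/My Drive/Directory.zip", "/")` would correspond to the two commands above.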

petrw1 2020-09-18 07:05

P-1 Error
 
Stage 2 of:
[CODE]B1=1500000,B2=30000000;PFactor=0,1,2,40370521,-1,74,2[/CODE]
config.txt:
[CODE]-user petrw1 -cpu colab -device 0 -maxAlloc 30000[/CODE]

[CODE]2020-09-18 06:37:35 colab P-1 (B1=1500000, B2=30000000, D=30030): primes 1743704, expanded 1767682, doubles 282837 (left 1187489), singles 1178030, total 1460867 (84%)
2020-09-18 06:37:35 colab 40370521 P2 using blocks [50 - 999] to cover 1460867 primes
2020-09-18 06:37:36 colab 40370521 P2 using 1440 buffers of 18.0 MB each
2020-09-18 06:37:51 colab Exception gpu_error: MEM_OBJECT_ALLOCATION_FAILURE clEnqueueCopyBuffer(queue, src, dst, 0, 0, size, 0, NULL, NULL) at clwrap.cpp:339 copyBuf[/CODE]

I increased maxAlloc to 100000; same error

paulunderwood 2020-09-18 09:56

[QUOTE=petrw1;557272]

I increased maxAlloc to 100000; same error[/QUOTE]

I think you should have decreased it.
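The numbers in the log make the point. The sketch below just multiplies out the buffer plan that gpuowl printed and compares it with the configured limit and with an assumed card size; the 16 GB figure is an assumption for illustration, since the post does not say which GPU Colab assigned.

```python
buffers = 1440              # "P2 using 1440 buffers" in the log
buffer_mb = 18.0            # "of 18.0 MB each"
max_alloc_mb = 30000        # -maxAlloc from config.txt
assumed_gpu_mb = 16 * 1024  # ASSUMPTION: a 16 GB card; the actual GPU is not stated

planned_mb = buffers * buffer_mb
print(planned_mb)                    # 25920.0
print(planned_mb <= max_alloc_mb)    # True: within the configured limit
print(planned_mb <= assumed_gpu_mb)  # False: more than a 16 GB card can hold
```

So the plan respects -maxAlloc but not the physical memory, and raising -maxAlloc further cannot help.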

kriesel 2020-09-18 13:13

[QUOTE=petrw1;557272]Stage 2 of:
[CODE]B1=1500000,B2=30000000;PFactor=0,1,2,40370521,-1,74,2[/CODE]config.txt:
[CODE]-user petrw1 -cpu colab -device 0 -maxAlloc 30000[/CODE][CODE]2020-09-18 06:37:35 colab P-1 (B1=1500000, B2=30000000, D=30030): primes 1743704, expanded 1767682, doubles 282837 (left 1187489), singles 1178030, total 1460867 (84%)
2020-09-18 06:37:35 colab 40370521 P2 using blocks [50 - 999] to cover 1460867 primes
2020-09-18 06:37:36 colab 40370521 P2 using 1440 buffers of 18.0 MB each
2020-09-18 06:37:51 colab Exception gpu_error: MEM_OBJECT_ALLOCATION_FAILURE clEnqueueCopyBuffer(queue, src, dst, 0, 0, size, 0, NULL, NULL) at clwrap.cpp:339 copyBuf[/CODE]I increased maxAlloc to 100000; same error[/QUOTE]From the program's help output:[CODE]-maxAlloc : limit GPU memory usage to this value in MB (needed on non-AMD GPUs)
[/CODE]
Aim for a -maxAlloc value somewhat less than the memory installed on the Colab GPU you get; at least a gigabyte less. It can't allocate what's not there, or what is used for other things. See [URL]https://www.mersenneforum.org/showpost.php?p=533245&postcount=15[/URL]
(If anyone has data on V100 or any other model encountered, PM me and I'll add it.)
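
The "at least a gigabyte less" rule of thumb can be sketched as a small Python helper for a Colab cell. This is only an illustration: the function name and the default 1024 margin are assumptions, and it treats MB and MiB as interchangeable.

```python
def suggest_max_alloc_mb(total_mib: int, margin_mib: int = 1024) -> int:
    """Suggest a -maxAlloc value that leaves `margin_mib` of headroom.

    `total_mib` is the memory the GPU reports; `margin_mib` is a
    hypothetical safety margin (about one gigabyte by default).
    """
    if total_mib <= margin_mib:
        raise ValueError("GPU memory is not larger than the requested margin")
    return total_mib - margin_mib


# Example: a GPU reporting 7611 MiB would get 7611 - 1024 = 6587
print(suggest_max_alloc_mb(7611))
```

The result would go after -maxAlloc in config.txt. A larger margin may be needed if other processes are using the GPU.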

petrw1 2020-09-18 16:01

[QUOTE=kriesel;557287]From the program's help output:[CODE]-maxAlloc : limit GPU memory usage to this value in MB (needed on non-AMD GPUs)
[/CODE]
Aim for -maxAlloc somewhat less than what the Colab gpu you get has installed; at least a gigabyte less. It can't allocate what's not there, or used for other things. See [URL]https://www.mersenneforum.org/showpost.php?p=533245&postcount=15[/URL]
(If anyone has data on V100 or any other model encountered, PM me and I'll add it.)[/QUOTE]

Thanks... how can I tell which GPU I got? And more importantly, I need to set this parameter before I know what I'm going to get, yes/no?

moebius 2020-09-18 16:10

Use this to see which GPU you got for the current session:
[B]!nvidia-smi -L[/B]
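
If the memory size is wanted as well, nvidia-smi can print it in CSV form, and the line can be parsed in Python. A minimal sketch, assuming the first listed GPU is the one in use; the function names are made up for illustration, and the sample line in the comment is only an example of the output shape.

```python
import subprocess
from typing import Tuple


def parse_gpu_line(line: str) -> Tuple[str, int]:
    """Split one CSV line such as 'Tesla T4, 15109' into (name, MiB)."""
    name, mem = line.rsplit(",", 1)
    return name.strip(), int(mem.strip())


def query_first_gpu() -> Tuple[str, int]:
    """Ask nvidia-smi for the name and total memory (MiB) of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_line(out.splitlines()[0])


# In a Colab cell with a GPU attached:
#   name, total_mib = query_first_gpu()
```

Since this runs at the start of each session, the value for -maxAlloc can be chosen after the GPU is known rather than before.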

kriesel 2020-09-18 16:47

[QUOTE=petrw1;557293]Thanks...how can I tell which GPU I got. And more importantly i need to set this parm before I know what I'm going to get, yes/no?[/QUOTE]You could code for the worst case, or for multiple cases as in the attachment of [URL="https://www.mersenneforum.org/showpost.php?p=537155&postcount=16"]https://www.mersenneforum.org/showpost.php?p=537155&postcount=16[/URL], or reject GPUs that don't match your preference.

petrw1 2020-09-18 16:53

[QUOTE=petrw1;557293]Thanks...how can I tell which GPU I got. And more importantly i need to set this parm before I know what I'm going to get, yes/no?[/QUOTE]

If the smallest MiB is 7611 then I could use MaxAlloc=7500 to be safe?

However, with 7500 I couldn't do stage 2 of a smaller test; error was:
[CODE]FFT size too large for exponent.[/CODE]

Increasing MaxAlloc allowed that Stage 2 to run.

preda 2020-09-18 21:51

[QUOTE=petrw1;557272]
I increased maxAlloc to 100000; same error[/QUOTE]

maxAlloc is in Megabytes, so 100'000 indicates 100GB.

Maybe you should start with a conservatively small value, such as 3000 or 7000, if you expect GPUs with at least 4GB or at least 8GB of RAM, respectively. Once that's working, you can move up.

Xyzzy 2020-09-18 23:42

4096MB seems to be fine. You don't get that much faster with more memory. Especially if you stall the program!

:mike:

petrw1 2020-09-19 04:19

YAY... and Thanks ... and FAST!!!
 
It's running and it's a lot faster than a CPU.

A 6.7566 GHz-day P-1 in under 30 minutes with a P100 GPU.

This would take 1 core of my 7820X 18 hours.
All 8 cores of my CPU could do just over 10 per day; this GPU could do almost 50 per day.
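
The per-day figures follow from simple arithmetic, sketched below in Python. It assumes the 8 cores scale linearly with one run per core, and it takes the GPU run as exactly 30 minutes, so the GPU number is a lower bound.

```python
HOURS_PER_DAY = 24.0


def runs_per_day(hours_per_run: float, parallel_workers: int = 1) -> float:
    """Number of P-1 runs completed per day by `parallel_workers` workers."""
    return parallel_workers * HOURS_PER_DAY / hours_per_run


cpu = runs_per_day(18.0, parallel_workers=8)  # 8 cores at 18 hours per run
gpu = runs_per_day(0.5)                       # one GPU at 30 minutes per run

print(f"CPU: {cpu:.2f} per day, GPU: {gpu:.0f} per day")
```

That gives about 10.67 per day for the CPU and 48 per day for the GPU, consistent with "just over 10" and "almost 50".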

Observation: I specified B1=1,500,000 B2=30,000,000
I got Stage 1=2,164,271; Stage 2=1,460,867 primes and 2,880 (classes???)

PS I added the commas for readability

Observation #2: If I leave off the ,2 at the end of this worktodo.txt line
B1=1500000,B2=30000000;PFactor=0,1,2,40371047,-1,74,2
I get the error:

2020-09-19 04:21:44 colab 1 FFT: 128K 256:1:256 (0.00 bpw)
2020-09-19 04:21:44 colab FFT size too large for exponent (0.00 bits/word).
2020-09-19 04:21:44 colab Exiting because "FFT size too large"

petrw1 2020-10-01 23:39

Why does it ignore worktodo lines?
 
2020-10-01 23:37:32 colab worktodo.txt line ignored: "B1=1500000,B2=30000000;PFactor=0,1,2,40382929,-1,74,2"


Thanks

Never mind, I got the new build which I understand no longer does P1 alone.

moebius 2020-10-02 01:56

[QUOTE=petrw1;558558]Never mind, I got the new build which I understand no longer does P1 alone.[/QUOTE]
[URL="https://mersenneforum.org/showpost.php?p=557928&postcount=1"]https://mersenneforum.org/showpost.php?p=557928&postcount=1[/URL]

kriesel 2020-10-02 03:56

[QUOTE=petrw1;558558]2020-10-01 23:37:32 colab worktodo.txt line ignored: "B1=1500000,B2=30000000;PFactor=0,1,2,40382929,-1,74,2"

Thanks

Never mind, I got the new build which I understand no longer does P1 alone.[/QUOTE]
Gpuowl 7.0 is not ready yet; don't use it, per Mihai: [url]https://www.mersenneforum.org/showpost.php?p=558460&postcount=44[/url]

kriesel 2021-09-27 23:15

[QUOTE=moebius;556311][B][SIZE=2]Here is the executable for colab with the libstdc files [/SIZE][/B][/QUOTE]
Tried that on a new colab account, got the following as entire output of the run attempt:
./gpuowl.exe: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./gpuowl.exe)


(edit: and I see others have hit that error too. Will try some of the posted resolution approaches at the next opportunities. If moebius or whoever were to post an updated archive file with the needed file included, that might help.)

chalsall 2021-09-27 23:28

[QUOTE=kriesel;588873]Tried that on a new colab account, got the following as entire output of the run attempt:
./gpuowl.exe: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./gpuowl.exe)[/QUOTE]

Quoting a thread almost exactly a year ago. Why?

I was drilling down to ensure I hadn't said anything stupid in my recent posts. I came across [URL="https://www.youtube.com/watch?v=YzJlI4P7i0Y"]this example[/URL] of someone well trained being able to reproduce excellence (once they understood the theory of operation).

We're all actually on the same side.

Correct?

Dr Sardonicus 2021-09-28 00:45

[QUOTE=chalsall;588875]Quoting a thread almost exactly a year ago. Why?[/QUOTE]

From my reading, I would guess it's because (my emphasis)[QUOTE=kriesel;588873]Tried that [the executable for colab from the quoted post] [color=red]on a [b]new[/b] colab account[/color], got the following as entire output of the run attempt:
./gpuowl.exe: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./gpuowl.exe)[/QUOTE]

moebius 2021-09-28 04:14

[QUOTE=kriesel;588873]If moebius or whoever were to post an updated archive file with the needed file included, that might help.)[/QUOTE]
Read this post. Did you copy the libstdc files to your working directory? I can make a new binary if you tell me the path to the old repository.
[URL="https://mersenneforum.org/showpost.php?p=555507&postcount=33"]https://mersenneforum.org/showpost.php?p=555507&postcount=33[/URL]

kriesel 2021-09-28 15:06

[QUOTE=moebius;588881]did You copy the libstdc files to your working directory?[/QUOTE]Yes. A missing libstdc... file is not the issue; the GLIBCXX version is, in that case.

I tried 3 different Linux builds for Colab and hit different issues in each. I finally got Fan Ming's going, although it blocks mprime on the CPU due to excessive CPU usage in gpuowl. I think that is ~V6.11-329, based on dates; it does not give a version when it runs. The purpose of the exercise was not only to get that account going, but also to document what's needed to clone to other accounts' Google Drives without building anew on each, or at every new launch that gets a different Colab VM.

From my "building with msys2" notes, to get an old gpuowl commit, find the hash corresponding to the version desired, by matching the leading digits of the hash to the trailing digits of the gpuowl version.
For V6.11-380-g[B]79ea0cc[/B] that's [B]79ea0cc[/B]29184237b24018e9396df271ec2754e97. [URL]https://github.com/preda/gpuowl/commit/79ea0cc29184237b24018e9396df271ec2754e97[/URL]

Then do something similar to[CODE]git clone --branch v6 https://github.com/preda/gpuowl
cd gpuowl
git checkout 79ea0cc29184237b24018e9396df271ec2754e97
# or perhaps: git reset --hard 79ea0cc29184237b24018e9396df271ec2754e97
[/CODE]There's more on my 3 attempts with downloaded builds in the second half of [URL]https://www.mersenneforum.org/showpost.php?p=527930&postcount=7[/URL]; use browser search for "build anew" to find the start of the relevant section.
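
The hash-matching rule above (trailing digits of the version equal leading digits of the commit hash) can be sketched in Python. This is an illustration only: the function names are hypothetical, and it assumes the version string ends in "-g" followed by the abbreviated hash, as in V6.11-380-g79ea0cc.

```python
import re


def short_hash_from_version(version: str) -> str:
    """Extract the abbreviated commit hash from e.g. 'V6.11-380-g79ea0cc'."""
    match = re.search(r"-g([0-9a-f]{7,40})$", version.strip())
    if match is None:
        raise ValueError(f"no '-g<hash>' suffix in {version!r}")
    return match.group(1)


def version_matches_commit(version: str, full_hash: str) -> bool:
    """True if the full commit hash begins with the version's short hash."""
    return full_hash.lower().startswith(short_hash_from_version(version))


full = "79ea0cc29184237b24018e9396df271ec2754e97"
print(short_hash_from_version("V6.11-380-g79ea0cc"))       # 79ea0cc
print(version_matches_commit("V6.11-380-g79ea0cc", full))  # True
```

A version string with anything after the hash would need the pattern adjusted.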

moebius 2021-09-28 21:17

GLIBCXX_3.4.26 comes with GCC 9.1.0. The gpuOwl 6.11.380 binary at [URL="https://mersenneforum.org/showpost.php?p=556311&postcount=40"]https://mersenneforum.org/showpost.php?p=556311&postcount=40[/URL] does not work for you, because the gcc version is not the same. I compiled it with gcc 9.1.0 [URL="https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html"]https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html[/URL]
It should work if you install gcc 9.1.0 and then copy the needed files with a notebook [URL="https://mersenneforum.org/showpost.php?p=555507&postcount=33"]https://mersenneforum.org/showpost.php?p=555507&postcount=33[/URL]
Are you sure it isn't the same issue as in this thread? [URL="https://mersenneforum.org/showthread.php?p=555511#post555511"]https://mersenneforum.org/showthread.php?p=555511#post555511[/URL]
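
GLIBCXX_3.4.26 is a version tag embedded inside libstdc++.so.6, not a separate file, so one way to check whether a given copy of the library provides it is to scan the file for those tags. A rough Python sketch, comparable to running strings on the library and grepping for GLIBCXX; the function names are made up for illustration.

```python
import re
from typing import List, Tuple

# Version tags look like GLIBCXX_3.4 or GLIBCXX_3.4.26 inside the binary.
_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(blob: bytes) -> List[Tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX versions found in `blob`."""
    found = {
        tuple(int(part) for part in m.group(1).decode().split("."))
        for m in _TAG.finditer(blob)
    }
    return sorted(found)


def provides(blob: bytes, wanted: str = "3.4.26") -> bool:
    """True if the library bytes contain the tag GLIBCXX_<wanted>."""
    target = tuple(int(part) for part in wanted.split("."))
    return target in glibcxx_versions(blob)


# In a Colab cell:
#   with open("/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "rb") as f:
#       print(provides(f.read()))
```

If this prints False for the system library, the binary built with gcc 9.1.0 will not load against it, whatever other files are present.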

kriesel 2021-09-28 22:54

[QUOTE=moebius;588932]GLIBCXX_3.4.26 comes with GCC 9.1.0. The gpuOwl 6.11.380 binary at [URL]https://mersenneforum.org/showpost.php?p=556311&postcount=40[/URL] does not work for you, because the gcc version is not the same. I compiled it with gcc 9.1.0 [URL]https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html[/URL]
It should work if you install gcc 9.1.0 and then copy the needed files with a notebook [URL]https://mersenneforum.org/showpost.php?p=555507&postcount=33[/URL]
Sure it isn't the same issue like in this thread? [URL]https://mersenneforum.org/showthread.php?p=555511#post555511[/URL][/QUOTE]
Thanks for responding. But this is not working for me. I already have the libstdc stuff in the gpuowl working directory on Google Drive, which is all you cp in the posted script at that link. It's complaining about GLIBCXX_3.4.26, which that post's script does not cp. And if I attempt
[CODE]!sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test
!sudo apt install -y gcc-9
!sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 800
!ls /usr/lib/x86_64-linux-gnu/GLIBCXX*
!cp /usr/lib/x86_64-linux-gnu/GLIBCXX_3.4.* '/content/drive/My Drive/gpuowl/K80'[/CODE]it installs gcc 9.[B]4[/B].0, and then gives
[CODE]update-alternatives: using /usr/bin/gcc-9 to provide /usr/bin/gcc (gcc) in auto mode
ls: cannot access '/usr/lib/x86_64-linux-gnu/GLIBCXX*': No such file or directory
cp: cannot stat '/usr/lib/x86_64-linux-gnu/GLIBCXX_3.4.*': No such file or directory[/CODE]So, I'll cp in more libstdc*, anyway, and see what the next K80 instance shows. All[CODE]!cp /usr/lib/x86_64-linux-gnu/libstdc* '/content/drive/My Drive/gpuowl/K80/'[/CODE]appears to have accomplished is to freshen the time stamps on libstdc++.so.6 and libstdc++.so.6.0.25 in /content/drive/My Drive/gpuowl/K80; no sign of GLIBCXX*. /content/drive/My Drive/gpuowl/K80/libstdc++.so.6.0.28 is untouched.
Ok, got a K80. And using the v6.11-380 Gpuowl.exe, the error persists:[CODE]./gpuowl.exe: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./gpuowl.exe)[/CODE]So, reverting to Fan Ming's version for the session.
And something about that process messed with the permissions on Fan Ming's executable again. Which Google Drive does not show, so one is flying blind.

Flaukrotist 2021-09-29 06:51

It works for me.
 
With my very limited knowledge of gpuowl and Linux, I can only guess where your error could be. But I can at least say that I got it working with the help of the zips and code blocks provided in this very thread, which moebius has linked to again in his latest post here.

One potential source of error I see is in your code block about copying the files. The magic is not in copying them to your Google Drive (that is only to save them there once; I saved them there from a download in this thread and avoided this step). The magic is that you copy them back to the correct Linux folder every time you want to run gpuowl. Look at the last sentence and the last separate code block here: [URL]https://mersenneforum.org/showpost.php?p=555507&postcount=33[/URL]

kriesel 2021-09-29 09:18

[QUOTE=Flaukrotist;588963]I can only guess where your error could be with my very limited knowledge of gpuowl and linux. But I can at least tell, that I could make it work with the help of the zips and code blocks provided in this very thread that moebius has again linked to in his latest post here.

One potential source of error I see from your code block about copying the files. The magic is not to copy them to your google drive (actually that is only to save them there once. I saved them there from some download here in this thread and avoided this step). The magic is that you copy them back to the correct linux folder every time you want to run gpuowl. Look at the last sentence and last separate code block here: [URL]https://mersenneforum.org/showpost.php?p=555507&postcount=33[/URL][/QUOTE]But it was in the same VM session where I did the GCC install that moebius's build subsequently did not work. I don't understand how copying them from a to b and back to a, in the same VM session that installs them, can help at all, or why moebius's build would require such a maneuver when Fan Ming's does not, in the same VM session. Maybe Fan Ming's has it all statically linked?

moebius 2021-09-29 10:07

[QUOTE=kriesel;588970] I don't understand how copying them from a to b and back to a in the same VM session that installs them can help at all.[/QUOTE] I can only tell you that I had to run the complete 'make notebook' from ATH once for every new Google account I used. Then the gpuOwl binary worked properly.


All times are UTC.