mersenneforum.org

gpuowl: runtime error (https://www.mersenneforum.org/showthread.php?t=23117)

petrw1 2020-09-17 06:00

[QUOTE=moebius;557198]Read post #36 in this thread.
[URL="https://mersenneforum.org/showthread.php?p=555511#post555511"]https://mersenneforum.org/showthread.php?p=555511#post555511[/URL][/QUOTE]

Your compiled version is a few posts later.
I tried that one first and got the same error.

Just now I tried adding the files as ATH suggested and still same error.

Must be an ID-10-T error on my part.

I put all of this into a subfolder to reduce clutter from previous attempts. This is what I have:

[CODE]/content/drive/My Drive/HOOT
ls -l ./
total 4691
-rw------- 1 root root 49 Sep 17 04:49 config.txt
-rwx------ 1 root root 988648 Sep 17 05:11 gpuowl.exe
-rw------- 1 root root 416 Sep 17 04:55 gpuowl.log
-rw------- 1 root root 870 Sep 17 05:57 gpuowllog.txt
-rw------- 1 root root 1903624 Sep 17 05:55 libstdc++.so.6
-rw------- 1 root root 1903624 Sep 17 05:55 libstdc++.so.6.0.28
drwx------ 2 root root 4096 Sep 17 04:55 proof
-rw------- 1 root root 0 Sep 17 04:55 results.txt
-rwx------ 1 root root 50 Sep 17 05:10 worktodo.txt[/CODE]

petrw1 2020-09-17 06:05

Hmmm ... seems to be running after all... YIPPEE

moebius 2020-09-17 07:13

If you use several Google accounts on Colab with the new gpuowl version that generates .proof files, you have to transfer all the temporary files in the proof folder between accounts.

To do this, I zip the entire gpuowl directory for download with this command:

[B]!zip -r '/content/drive/My Drive/Directory.zip' '/content/drive/My Drive/gpuowl-master'[/B]

After uploading, I use the following to unzip the files back into the correct directory:

[B]!unzip -o -d '/' '/content/drive/My Drive/Directory.zip'[/B]
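The same round trip can be sketched in Python with the standard library, for anyone who prefers not to shell out from the notebook. This is only a sketch under assumptions: the function names `archive_dir` and `restore_dir` are made up for illustration, the archive path is assumed to end in `.zip`, and the archive is assumed to live outside the directory being zipped. Like the `zip`/`unzip` pair above, it stores the directory under its full path so that extracting at `/` puts everything back in place, including the proof folder.

```python
import shutil
from pathlib import Path


def archive_dir(src_dir, zip_path):
    """Zip src_dir, storing entries under its full path relative to the root.

    Hypothetical helper: mirrors `zip -r archive.zip /abs/path/to/dir`.
    """
    src = Path(src_dir).resolve()
    # make_archive appends ".zip" itself, so strip the suffix first.
    base_name = str(Path(zip_path).with_suffix(""))
    return shutil.make_archive(
        base_name,
        "zip",
        root_dir=src.anchor,                         # "/" on Linux
        base_dir=str(src.relative_to(src.anchor)),   # e.g. content/drive/...
    )


def restore_dir(zip_path, dest_root="/"):
    """Extract the archive below dest_root, overwriting existing files.

    Hypothetical helper: mirrors `unzip -o -d / archive.zip`.
    """
    shutil.unpack_archive(str(zip_path), extract_dir=str(dest_root), format="zip")
```

In a Colab cell this might be called as `archive_dir('/content/drive/My Drive/gpuowl-master', '/content/drive/My Drive/Directory.zip')` on the first account and `restore_dir('/content/drive/My Drive/Directory.zip')` on the second.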

petrw1 2020-09-18 07:05

P-1 Error
 
Stage 2 of:
[CODE]B1=1500000,B2=30000000;PFactor=0,1,2,40370521,-1,74,2[/CODE]
config.txt:
[CODE]-user petrw1 -cpu colab -device 0 -maxAlloc 30000[/CODE]

[CODE]2020-09-18 06:37:35 colab P-1 (B1=1500000, B2=30000000, D=30030): primes 1743704, expanded 1767682, doubles 282837 (left 1187489), singles 1178030, total 1460867 (84%)
2020-09-18 06:37:35 colab 40370521 P2 using blocks [50 - 999] to cover 1460867 primes
2020-09-18 06:37:36 colab 40370521 P2 using 1440 buffers of 18.0 MB each
2020-09-18 06:37:51 colab Exception gpu_error: MEM_OBJECT_ALLOCATION_FAILURE clEnqueueCopyBuffer(queue, src, dst, 0, 0, size, 0, NULL, NULL) at clwrap.cpp:339 copyBuf[/CODE]

I increased maxAlloc to 100000; same error
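A rough arithmetic check of the log above may help explain the failure. The buffer count and size are copied from the "P2 using 1440 buffers of 18.0 MB each" line, and the limit from config.txt. The GPU size is an assumption: 7611 MiB is the smallest figure mentioned later in this thread, and MB and MiB are treated as interchangeable for this estimate.

```python
# Figures copied from the log line "P2 using 1440 buffers of 18.0 MB each".
buffers = 1440
buffer_mb = 18.0
requested_mb = buffers * buffer_mb  # total stage 2 buffer memory

max_alloc_mb = 30000     # -maxAlloc value from config.txt
assumed_gpu_mb = 7611    # ASSUMPTION: smallest GPU size quoted later in the thread

print(f"stage 2 buffers: {requested_mb:.0f} MB")
print(f"within -maxAlloc limit: {requested_mb <= max_alloc_mb}")
print(f"within assumed GPU memory: {requested_mb <= assumed_gpu_mb}")
```

Under these assumptions the buffers total 25920 MB: inside the 30000 MB limit the program was given, but well beyond what such a card could physically hold, which is consistent with the MEM_OBJECT_ALLOCATION_FAILURE. Raising the limit further would not change that.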

paulunderwood 2020-09-18 09:56

[QUOTE=petrw1;557272]

I increased maxAlloc to 100000; same error[/QUOTE]

I think you should have decreased it.

kriesel 2020-09-18 13:13

[QUOTE=petrw1;557272]Stage 2 of:
[CODE]B1=1500000,B2=30000000;PFactor=0,1,2,40370521,-1,74,2[/CODE]
config.txt:
[CODE]-user petrw1 -cpu colab -device 0 -maxAlloc 30000[/CODE]
[CODE]2020-09-18 06:37:35 colab P-1 (B1=1500000, B2=30000000, D=30030): primes 1743704, expanded 1767682, doubles 282837 (left 1187489), singles 1178030, total 1460867 (84%)
2020-09-18 06:37:35 colab 40370521 P2 using blocks [50 - 999] to cover 1460867 primes
2020-09-18 06:37:36 colab 40370521 P2 using 1440 buffers of 18.0 MB each
2020-09-18 06:37:51 colab Exception gpu_error: MEM_OBJECT_ALLOCATION_FAILURE clEnqueueCopyBuffer(queue, src, dst, 0, 0, size, 0, NULL, NULL) at clwrap.cpp:339 copyBuf[/CODE]
I increased maxAlloc to 100000; same error[/QUOTE]

From the program's help output:
[CODE]-maxAlloc : limit GPU memory usage to this value in MB (needed on non-AMD GPUs)[/CODE]
Aim for a -maxAlloc value somewhat less than the memory installed on the Colab GPU you get; at least a gigabyte less. The program can't allocate memory that isn't there or that is already in use for other things. See [URL]https://www.mersenneforum.org/showpost.php?p=533245&postcount=15[/URL]
(If anyone has data on V100 or any other model encountered, PM me and I'll add it.)
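The "at least a gigabyte less" rule of thumb can be written down as a small helper. This is a minimal sketch: the function names and the 1024 MB default headroom are illustrative choices, not part of gpuowl, and the 16384 MB card is a hypothetical example.

```python
def suggest_max_alloc(installed_mb, headroom_mb=1024):
    """Return a -maxAlloc value (in MB) that leaves headroom_mb unused."""
    if installed_mb <= headroom_mb:
        raise ValueError("GPU memory is not larger than the requested headroom")
    return installed_mb - headroom_mb


def exceeds_installed(max_alloc_mb, installed_mb):
    """True when a -maxAlloc setting is larger than the memory the GPU has."""
    return max_alloc_mb > installed_mb


installed = 16384  # ASSUMPTION: a hypothetical card with 16384 MB installed
print(suggest_max_alloc(installed))          # 15360
print(exceeds_installed(100000, installed))  # True
```

The second helper flags settings such as 100000, which ask for more memory than any single card in this example could provide.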

petrw1 2020-09-18 16:01

[QUOTE=kriesel;557287]From the program's help output:[CODE]-maxAlloc : limit GPU memory usage to this value in MB (needed on non-AMD GPUs)
[/CODE]
Aim for -maxAlloc somewhat less than what the Colab gpu you get has installed; at least a gigabyte less. It can't allocate what's not there, or used for other things. See [URL]https://www.mersenneforum.org/showpost.php?p=533245&postcount=15[/URL]
(If anyone has data on V100 or any other model encountered, PM me and I'll add it.)[/QUOTE]

Thanks... how can I tell which GPU I got? And more importantly, I need to set this parameter before I know what I'm going to get, yes/no?

moebius 2020-09-18 16:10

Use this to see which GPU you got for the current session:
[B]!nvidia-smi -L[/B]
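If the memory size is needed as a number (for example, to pick a -maxAlloc value), `nvidia-smi` can also print it in CSV form through its `--query-gpu` option. The sketch below wraps that call from Python; the helper names `parse_gpu_line` and `query_gpus` are made up for illustration, and it assumes `nvidia-smi` is on the PATH, as it normally is on a Colab GPU runtime.

```python
import subprocess


def parse_gpu_line(line):
    """Split one 'name, memory' CSV line into (name, memory_in_MiB)."""
    name, mem = line.rsplit(",", 1)
    return name.strip(), int(mem.strip())


def query_gpus():
    """Return a list of (name, total_memory_MiB) for each visible NVIDIA GPU."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]
```

In a notebook cell, `print(query_gpus())` would then list the assigned GPU together with its total memory in MiB.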

kriesel 2020-09-18 16:47

[QUOTE=petrw1;557293]Thanks... how can I tell which GPU I got? And more importantly, I need to set this parameter before I know what I'm going to get, yes/no?[/QUOTE]

You could code for the worst case, or for multiple cases as in the attachment to [URL="https://www.mersenneforum.org/showpost.php?p=537155&postcount=16"]https://www.mersenneforum.org/showpost.php?p=537155&postcount=16[/URL], or reject GPUs that don't match your preference.
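The three options (worst case, per-GPU cases, or rejecting a GPU) could look roughly like this in a notebook. This is not the script from the linked attachment, only a sketch: the GPU names and MB values in the table are placeholders rather than measured data, and the helper names are hypothetical. The config line format is the one from config.txt earlier in the thread.

```python
import re

# ASSUMPTION: placeholder names and values, to be replaced with real data.
MAX_ALLOC_BY_GPU = {"Example GPU A": 7000, "Example GPU B": 15000}


def choose_max_alloc(gpu_name, table, worst_case_mb=None):
    """Pick -maxAlloc for gpu_name; fall back to worst_case_mb or reject."""
    if gpu_name in table:
        return table[gpu_name]
    if worst_case_mb is None:
        raise RuntimeError(f"rejecting unlisted GPU: {gpu_name}")
    return worst_case_mb


def set_max_alloc(config_line, max_alloc_mb):
    """Replace (or append) the -maxAlloc option in a config.txt line."""
    option = f"-maxAlloc {max_alloc_mb}"
    if re.search(r"-maxAlloc\s+\d+", config_line):
        return re.sub(r"-maxAlloc\s+\d+", option, config_line)
    return f"{config_line.strip()} {option}"


line = "-user petrw1 -cpu colab -device 0 -maxAlloc 30000"
value = choose_max_alloc("Example GPU A", MAX_ALLOC_BY_GPU, worst_case_mb=3000)
print(set_max_alloc(line, value))
```

Passing `worst_case_mb=None` turns the fallback into a rejection, so the notebook stops instead of running on a GPU it has no setting for.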

petrw1 2020-09-18 16:53

[QUOTE=petrw1;557293]Thanks... how can I tell which GPU I got? And more importantly, I need to set this parameter before I know what I'm going to get, yes/no?[/QUOTE]

If the smallest MiB is 7611 then I could use MaxAlloc=7500 to be safe?

However, with 7500 I couldn't do stage 2 of a smaller test; error was:
[CODE]FFT size too large for exponent.[/CODE]

Increasing MaxAlloc allowed that Stage 2 to run.

preda 2020-09-18 21:51

[QUOTE=petrw1;557272]
I increased maxAlloc to 100000; same error[/QUOTE]

maxAlloc is in Megabytes, so 100'000 indicates 100GB.

Maybe you should start with a conservatively small value, such as 3000 or 7000, if you expect GPUs with at least 4GB or at least 8GB of RAM respectively. Once that's working, you can move up.

