#166
Aug 2020
37 Posts
Different computer, with a Vega 64 instead of a Radeon VII, and the same problem using v7.2. Raising the FFT size to 6.5M avoids the problem, but it then runs at 2730 µs/iter instead of 2020 µs/iter.
Code:
2020-11-04 07:12:53 Rig02-RadeonVega64-02 109004201 OpenCL args "-DEXP=109004201u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=12u -DAMDGPU=1 -DCARRY64=1 -DCARRYM64=1 -DWEIGHT_STEP_MINUS_1=0x9.8841a10b5e2bp-4 -DIWEIGHT_STEP_MINUS_1=-0xb.f26a11911bbp-5 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-11-04 07:12:53 Rig02-RadeonVega64-02 109004201 ASM compilation failed, retrying compilation using NO_ASM
2020-11-04 07:12:55 Rig02-RadeonVega64-02 109004201 OpenCL compilation in 2.25 s
2020-11-04 07:12:56 Rig02-RadeonVega64-02 109004201 maxAlloc: 6.5 GB
2020-11-04 07:12:56 Rig02-RadeonVega64-02 109004201 P1(5.5M) 7935851 bits
2020-11-04 07:12:56 Rig02-RadeonVega64-02 109004201 PRP starting from beginning
2020-11-04 07:12:56 Rig02-RadeonVega64-02 109004201 Acquired memory lock 'c:\gpuowl\pool\memlock-1'
2020-11-04 07:12:56 Rig02-RadeonVega64-02 109004201 P1(5.5M) using 258 buffers
2020-11-04 07:12:58 Rig02-RadeonVega64-02 109004201 [0] 36500ec1 != fffffffb
2020-11-04 07:12:58 Rig02-RadeonVega64-02 109004201 [1] 8cf00cca != 00000019
2020-11-04 07:12:58 Rig02-RadeonVega64-02 109004201 [2] 7aff4181 != ffffff83
2020-11-04 07:12:58 Rig02-RadeonVega64-02 109004201 [3] 3003737c != 00000271
2020-11-04 07:12:58 Rig02-RadeonVega64-02 109004201 [4] 7003a3e7 != fffff3cb
2020-11-04 07:12:58 Rig02-RadeonVega64-02 109004201 [5] 7bae3f3f != 00003d09
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [6] f648cd5e != fffeced3
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [7] 290e228d != 0005f5e1
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [8] 89e2769b != ffe2329b
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [9] 628fd07c != 009502f9
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [10] 4518a126 != fd16f123
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [11] 8ea0fa4a != 0e8d4a51
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [12] a9f40f61 != b73d8c6b
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [13] 7fd856fb != 6bcc41e9
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [14] 78e2a243 != e502b673
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [15] 14740e82 != 86f26fc1
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [16] 9d46583b != 5d43d13b
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [17] 7daf7a00 != 2dace9d9
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [18] 9a50f044 != 1b9f6ec3
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 [19] 8b0dcfc9 != 75e2d631
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 fold() does not roundtrip
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 P1(5.5M) releasing 258 buffers
2020-11-04 07:12:59 Rig02-RadeonVega64-02 109004201 Released memory lock 'c:\gpuowl\pool\memlock-1'
2020-11-04 07:12:59 Rig02-RadeonVega64-02 Exiting because "fold roundtrip"
2020-11-04 07:12:59 Rig02-RadeonVega64-02 Bye
#167
"Mihai Preda"
Apr 2015
2·23·29 Posts
You did not include the version you're running. I see in the OpenCL args that it uses -cl-unsafe-math-optimizations. This was dropped a few versions back (you now need to run with -unsafeMath to get it). I propose you try a more recent version, just to rule that out as a factor.
#168
"Viliam Furík"
Jul 2018
Martin, Slovakia
2⁴·3·7 Posts
I am almost sure he mentioned it... v7.2
It is in the text above the gpuOwl output.
#169
"Mihai Preda"
Apr 2015
2·23·29 Posts
So are you running with -unsafeMath? Why? Please try without it. Don't use -unsafeMath unless you have a good reason for it.
Last fiddled with by preda on 2020-11-04 at 22:22 Reason: update quote
#170
"Mihai Preda"
Apr 2015
2·23·29 Posts
Quote:
GpuOwl VERSION v7.2-13-g266aed4
The v7.2 is a shorthand that gives only a very approximate indication of which features are present; it is not so useful for bug reproduction.
#171
Aug 2020
100101₂ Posts
Quote:
#172
"mrh"
Oct 2018
Temecula, ca
5·13 Posts
What version of ROCm is recommended for 7.x? I'm using 2.10.0 and am not able to compile, so I'm guessing it is time to upgrade?
#173
"Mihai Preda"
Apr 2015
1334₁₀ Posts
I think ROCm 3.3 is good. Any version after 3.5 should also work fine, but may be slower than 3.3. You can try it: it is now easier to install multiple versions of ROCm OpenCL side by side and choose which one is used with LD_LIBRARY_PATH.
#174
"mrh"
Oct 2018
Temecula, ca
1000001₂ Posts
Quote:
Code:
diff --git a/Gpu.cpp b/Gpu.cpp
index 9e5f09a..3e6739e 100644
--- a/Gpu.cpp
+++ b/Gpu.cpp
@@ -24,6 +24,7 @@
 #include <numeric>
 #include <bitset>
 #include <limits>
+#include <iomanip>
 
 #ifndef M_PIl
 #define M_PIl 3.141592653589793238462643383279502884L
diff --git a/Pm1Plan.cpp b/Pm1Plan.cpp
index fa84b43..afdf461 100644
--- a/Pm1Plan.cpp
+++ b/Pm1Plan.cpp
@@ -41,7 +41,7 @@ u32 reduce(u32 B1, u32 pos) {
   return pos;
 }
 
-constexpr u32 firstMissingFactor(u32 D) {
+u32 firstMissingFactor(u32 D) {
   switch (D) {
   case 210:
   case 420:
#175
Jul 2003
So Cal
4010₈ Posts
Using nVidia OpenCL, I also need to add #include <iomanip> to Gpu.cpp to get it to compile.
#176
"Viliam Furík"
Jul 2018
Martin, Slovakia
520₈ Posts
I have downloaded the version 7.2-13-g266aed4, and when I run a 108M test, it runs at 1250 us/it. The same test runs at 920 us/it when using v6.11-380-g79ea0cc.
I have also noticed it's not saying anything about the duration of the GEC, but I hope it is doing it. While writing, I have noticed it looks like it's doing P-1 on a different FFT size, is that possible?
Code:
2020-11-22 09:39:53 GpuOwl VERSION v7.2-13-g266aed4
2020-11-22 09:39:53 config: -device 1
2020-11-22 09:39:53 config: -proof 8
2020-11-22 09:39:53 config: -nospin
2020-11-22 09:39:53 device 1, unique id ''
2020-11-22 09:39:53 gfx906-1 108850051 FFT: 6M 1K:12:256 (17.30 bpw)
2020-11-22 09:39:53 gfx906-1 108850051 OpenCL args "-DEXP=108850051u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=12u -DAMDGPU=1 -DWEIGHT_STEP_MINUS_1=0.62309825525553619 -DIWEIGHT_STEP_MINUS_1=-0.3838943534305243 -DIWEIGHTS={0,-0.3838943534305243,-0.24082766453041662,-0.064539274795706897,-0.42365736505765839,-0.28982409650658664,-0.12491323160025802,-0.46085410075068395,-0.33565833429543745,-0.18139069701609609,-0.49565018609731404,-0.37853446361658188,-0.23422314777169634,-0.056401114659886273,-0.41864339864529271,-0.28364583046985026,} -cl-std=CL2.0 -cl-finite-math-only "
2020-11-22 09:39:54 gfx906-1 108850051 ASM compilation failed, retrying compilation using NO_ASM
2020-11-22 09:39:58 gfx906-1 108850051 OpenCL compilation in 4.79 s
2020-11-22 09:39:58 gfx906-1 108850051 maxAlloc: 0.0 GB
2020-11-22 09:39:58 gfx906-1 108850051 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
2020-11-22 09:39:58 gfx906-1 108850051 P1(5.5M) 7935851 bits
2020-11-22 09:39:58 gfx906-1 108850051 PRP starting from beginning
2020-11-22 09:39:59 gfx906-1 108850051 Acquired memory lock 'memlock-1'
2020-11-22 09:39:59 gfx906-1 108850051 P1(5.5M) using 112 buffers
2020-11-22 09:40:02 gfx906-1 108850051 OK 0 on-load: blockSize 400, 0000000000000003
2020-11-22 09:40:02 gfx906-1 108850051 validating proof residues for power 8
2020-11-22 09:40:02 gfx906-1 108850051 Proof using power 8
2020-11-22 09:40:15 gfx906-1 108850051 10000 0.01% a834a715c12eb82f 1248 us/it
2020-11-22 09:40:27 gfx906-1 108850051 20000 0.02% 399d28f60cdc9b8e 1251 us/it
2020-11-22 09:40:35 gfx906-1 108850051 Stopping, please wait..
2020-11-22 09:40:36 gfx906-1 108850051 OK 26000 0.02% 64247ab7a49860c3 1251 us/it + check 0.51s + save 0.75s; ETA 1d 13:48 | P1(5.5M) 0.3% ETA 02:45 5b28693878aaf77c
2020-11-22 09:40:36 gfx906-1 108850051 P1(5.5M) releasing 112 buffers
2020-11-22 09:40:36 gfx906-1 108850051 Released memory lock 'memlock-1'
2020-11-22 09:40:36 gfx906-1 Exiting because "stop requested"
2020-11-22 09:40:36 gfx906-1 Bye