mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > GPU Computing > GpuOwl

2020-01-24, 17:10   #1805
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

11157₈ Posts

Quote:
Originally Posted by Prime95 View Post
I think you'll get an error message. Try it.

You could also do "-use FANCY_MIDDLEMUL1,ORIGINAL_TWEAKED" to get fancy middlemul1 for middle=10,11 and original tweaked middle mul1 otherwise.
Yup. Instant trouble, a real showstopper.
Code:
2020-01-23 16:54:19 condorella/rx480 82053239 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 17.39 bits/word
2020-01-23 16:54:19 condorella/rx480 OpenCL args "-DEXP=82053239u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -DWEIGHT_STEP=0xc.373107b1f3e78p-3 -DIWEIGHT_STEP=0xa.7a792f1683b7p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DAMDGPU=1 -DCARRY32=1 -DCHEBYSHEV_MIDDLEMUL2=1 -DMERGED_MIDDLE=1 -DMORE_SQUARES_MIDDLEMUL1=1 -DNEW_SLOWTRIG=1 -DNO_ASM=1 -DT2_SHUFFLE_HEIGHT=1 -DT2_SHUFFLE_WIDTH=1 -DUNROLL_HEIGHT=1 -DUNROLL_WIDTH=1 -DWORKINGIN1=1 -DWORKINGOUT1=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-01-23 16:54:22 condorella/rx480 OpenCL compilation in 2.65 s
2020-01-23 16:54:24 condorella/rx480 82053239 EE        0 loaded: blockSize 400, a8c3b11429b46cbf (expected 0000000000000003)
2020-01-23 16:54:24 condorella/rx480 Exiting because "error on load"
2020-01-23 16:54:24 condorella/rx480 Bye
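
As a quick sanity check on the first log line, the reported FFT length and bits/word seem to follow from the other fields. The relation used below (words = width x height x middle x 2) is inferred from the numbers in this log, not taken from the gpuowl source, so treat it as an assumption; the function names are made up for illustration.

```python
# Sketch: reproduce the "FFT 4608K ... bits/word" figures from the log line.
# Assumption (inferred from the log numbers, not from gpuowl's source):
#   FFT words = WIDTH * SMALL_HEIGHT * MIDDLE * 2

def fft_words(width: int, small_height: int, middle: int) -> int:
    """Number of words in the FFT under the assumed relation."""
    return width * small_height * middle * 2

def bits_per_word(exponent: int, words: int) -> float:
    """Average number of exponent bits carried by each FFT word."""
    return exponent / words

# "Width 256x4, Height 64x4, Middle 9" for exponent 82053239
words = fft_words(256 * 4, 64 * 4, 9)
print(words // 1024, "K words")
print(round(bits_per_word(82053239, words), 2), "bits/word")
```

The same arithmetic also matches the -DWIDTH=1024u, -DSMALL_HEIGHT=256u and -DMIDDLE=9u values in the OpenCL args line.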
Next step: coding in gpuowl to apply -use options only when they are legal and nonfatal for the applicable fft length? Or support for fft-length conditionals in config.txt?
This is a PRP-DC at 4608K fft length, to which the -use options that were optimal for 5M fft length were applied from config.txt, with fatal result.
I'm not looking forward to tuning a long list of -use options on an fft length by fft length basis for numerous gpu models and swapping them out manually when exponents change, or to having such crashes discard 18 hours of gpu time instead of making progress.
At a 3-5% speedup on many models, it takes a long time to pay that back.
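
Until something like that exists in gpuowl, the selection could be done outside it. The following is only a sketch of the idea: the table, the fallback, and the function names are hypothetical and not a gpuowl feature. The 5M entry is the option list visible in the OpenCL args above (leaving out AMDGPU, which is not assumed to be a -use option), and the fallback is plain NO_ASM, which is assumed here to be safe at other fft lengths.

```python
# Hypothetical wrapper logic: pick a -use list by FFT length (in K words)
# before launching gpuowl, instead of keeping one global list in config.txt.
# Nothing here is part of gpuowl; names and the table layout are made up.

FIVE_M_OPTIONS = (
    "CARRY32,CHEBYSHEV_MIDDLEMUL2,MERGED_MIDDLE,MORE_SQUARES_MIDDLEMUL1,"
    "NEW_SLOWTRIG,NO_ASM,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_WIDTH,"
    "UNROLL_HEIGHT,UNROLL_WIDTH,WORKINGIN1,WORKINGOUT1"
)

# Only FFT lengths that have actually been tuned get an entry.
USE_BY_FFT_K = {
    5120: FIVE_M_OPTIONS,  # 5M
}

# Assumption: an untuned length is better run with a plain baseline
# than with options tuned for a different length.
SAFE_DEFAULT = "NO_ASM"

def use_options_for(fft_k: int) -> str:
    """Return the tuned -use list for this FFT length, else the baseline."""
    return USE_BY_FFT_K.get(fft_k, SAFE_DEFAULT)

def gpuowl_args(fft_k: int) -> list:
    """Extra command-line arguments to pass for this FFT length."""
    return ["-use", use_options_for(fft_k)]

print(gpuowl_args(5120))
print(gpuowl_args(4608))  # untuned length falls back to the baseline
```

The point of the fallback is that an untuned fft length costs a few percent of speed rather than a failed load.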

Last fiddled with by kriesel on 2020-01-24 at 17:17

2020-01-24, 22:59   #1806
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4719₁₀ Posts
Gpuowl -use options tune on RX480 for 4.5M fft length

Likes a somewhat different combination than for 5M
Quote:
gpuowl v6.11-134-g1e0ce1d
RX480 8GB
Win7 x64
exponent 82053239 PRP
4.5M fft
-iters 10000 -time
all timings below are in microsec/iteration


NO_ASM 3021
NO_ASM 3022
NO_ASM,UNROLL_ALL 3010 *
NO_ASM,UNROLL_NONE 3039
NO_ASM,UNROLL_WIDTH 3035
NO_ASM,UNROLL_HEIGHT 3038
NO_ASM,UNROLL_MIDDLEMUL1 3036
NO_ASM,UNROLL_MIDDLEMUL2 3027

NO_ASM,UNROLL_WIDTH,UNROLL_MIDDLEMUL1 3028
NO_ASM,UNROLL_WIDTH,UNROLL_MIDDLEMUL2 3019, 3028
NO_ASM,NO_ASM,UNROLL_MIDDLEMUL2,UNROLL_MIDDLEMUL1 2989 *
NO_ASM,UNROLL_WIDTH,UNROLL_MIDDLEMUL1,UNROLL_MIDDLEMUL2 2996

NO_ASM,MERGED_MIDDLE,WORKINGIN 5309
NO_ASM,MERGED_MIDDLE,WORKINGIN 5306
NO_ASM,MERGED_MIDDLE,WORKINGIN1 3032
NO_ASM,MERGED_MIDDLE,WORKINGIN1A 3052
NO_ASM,MERGED_MIDDLE,WORKINGIN2 3111
NO_ASM,MERGED_MIDDLE,WORKINGIN3 3133
NO_ASM,MERGED_MIDDLE,WORKINGIN4 3454
NO_ASM,MERGED_MIDDLE,WORKINGIN5 2995 *

NO_ASM,MERGED_MIDDLE,WORKINGOUT 5224
NO_ASM,MERGED_MIDDLE,WORKINGOUT0 4036
NO_ASM,MERGED_MIDDLE,WORKINGOUT1 2984 *
NO_ASM,MERGED_MIDDLE,WORKINGOUT1A 3012/2982
NO_ASM,MERGED_MIDDLE,WORKINGOUT2 3353
NO_ASM,MERGED_MIDDLE,WORKINGOUT3 2986
NO_ASM,MERGED_MIDDLE,WORKINGOUT4 3137
NO_ASM,MERGED_MIDDLE,WORKINGOUT5 2995

NO_ASM,MERGED_MIDDLE,%wkgin%,%wkgout% 2973
NO_ASM,MERGED_MIDDLE,%wkgin%,%wkgout%,T2_SHUFFLE_WIDTH 2957 *
NO_ASM,MERGED_MIDDLE,%wgkin%,%wkgout%,T2_SHUFFLE_MIDDLE 3026
NO_ASM,MERGED_MIDDLE,%wkgin%,%wkgout%,T2_SHUFFLE_HEIGHT 2966
NO_ASM,MERGED_MIDDLE,%wkgin%,%wkgout%,T2_SHUFFLE_REVERSELINE 2972
NO_ASM,MERGED_MIDDLE,%wkgin%,%wkgout%,T2_SHUFFLE 2992

set allotheroptions=NO_ASM,WORKINGIN5,WORKINGOUT1,UNROLL_MIDDLEMUL2,UNROLL_MIDDLEMUL1
%allotheroptions%,T2_SHUFFLE_WIDTH,T2_SHUFFLE_HEIGHT 2938 *
%allotheroptions%,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_MIDDLE,T2_SHUFFLE_WIDTH 2989
%allotheroptions%,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_MIDDLE,T2_SHUFFLE_WIDTH,T2_SHUFFLE_REVERSELINE 2987

set allotheroptions=NO_ASM,MERGED_MIDDLE,UNROLL_HEIGHT,UNROLL_WIDTH,WORKINGIN1,WORKINGOUT1,T2_SHUFFLE_WIDTH,T2_SHUFFLE_HEIGHT
%allotheroptions%,CARRY32 2940 *
%allotheroptions%,CARRY64 3054

set allotheroptions=NO_ASM,MERGED_MIDDLE,WORKINGIN5,WORKINGOUT1,T2_SHUFFLE_WIDTH,T2_SHUFFLE_HEIGHT,UNROLL_MIDDLEMUL2,UNROLL_MIDDLEMUL1,CARRY32
%allotheroptions%,FANCY_MIDDLEMUL1 "error: Clang front-end compilation failed!"
%allotheroptions%,MORE_SQUARES_MIDDLEMUL1 2985
%allotheroptions%,CHEBYSHEV_METHOD 2919
%allotheroptions%,CHEBYSHEV_METHOD_FMA 2911 *
%allotheroptions%,ORIGINAL_METHOD 2942
%allotheroptions%,ORIGINAL_TWEAKED 2937

set allotheroptions=NO_ASM,MERGED_MIDDLE,WORKINGIN5,WORKINGOUT1,T2_SHUFFLE_WIDTH,T2_SHUFFLE_HEIGHT,UNROLL_MIDDLEMUL2,UNROLL_MIDDLEMUL1,CARRY32,CHEBYSHEV_METHOD_FMA
%allotheroptions%,ORIG_MIDDLEMUL2 2926
%allotheroptions%,CHEBYSHEV_MIDDLEMUL2 2916 *

%allotheroptions%,ORIG_SLOWTRIG 3058
%allotheroptions%,NEW_SLOWTRIG 2910
%allotheroptions%,MORE_ACCURATE 2921
%allotheroptions%,LESS_ACCURATE 2909 *

NO_ASM,MERGED_MIDDLE,WORKINGIN5,WORKINGOUT1,T2_SHUFFLE_WIDTH,T2_SHUFFLE_HEIGHT,UNROLL_MIDDLEMUL2,UNROLL_MIDDLEMUL1,CARRY32,CHEBYSHEV_METHOD_FMA,CHEBYSHEV_MIDDLEMUL2,LESS_ACCURATE

base 3021.5
repeatability +-1.5/5307.5 =~ +-0.028%
best 2909
ratio 3021.5/2909 = 1.039
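
For anyone redoing this summary with their own numbers, the arithmetic is small enough to script. This is a sketch only: the function names are invented, and the payback estimate assumes the 18 hours of lost gpu time mentioned in post #1805 and a constant speedup.

```python
# Sketch: summarize a tuning pass from timings in microseconds/iteration.
from statistics import mean

def summarize(base_runs, best):
    """Return (mean baseline, speed ratio, percent of time saved per iteration)."""
    base = mean(base_runs)
    ratio = base / best
    saved_pct = (1 - best / base) * 100
    return base, ratio, saved_pct

def payback_hours(lost_hours, ratio):
    """Hours of running at `ratio` speedup needed to regain `lost_hours` of work."""
    return lost_hours / (ratio - 1)

# Two plain NO_ASM runs as baseline, best tuned combination at 2909 us/it.
base, ratio, saved_pct = summarize([3021, 3022], 2909)
print(f"base {base} us/it, ratio {ratio:.3f}, {saved_pct:.2f}% less time per iteration")
print(f"payback for 18 lost hours: {payback_hours(18, ratio):.0f} hours")
```

At this ratio the payback comes out to several hundred hours of run time, which is the "long time to pay that back" point made earlier.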

2020-02-01, 20:46   #1807
JCoveiro
"Jorge Coveiro"
Nov 2006
Moura, Portugal

26₁₀ Posts
GTX1660 -use options

Hi!

Can someone help me?
I have an Nvidia GTX1660 running gpuowl at around 8250 us/it (FFT 5632K).
With some overclocking I can get under 8000 us/it, but I'm not sure how to test the GPU more thoroughly for errors, or how to tune it with -use options. Can someone help me out?

More questions:
1. I'm considering buying 2x Radeon VII; or should I wait for Big Navi?
2. Does anyone have AMD 5700 XT benchmarks to compare with Radeon VII?
3. CUDALucas seems to run slower than gpuowl. Are there any other options?

Thanks!

2020-02-01, 21:18   #1808
preda
"Mihai Preda"
Apr 2015

2⁴×83 Posts

Quote:
Originally Posted by JCoveiro
1. I'm considering buying 2x Radeon VII; or should I wait for Big Navi?
My expectation is that Radeon VII will still be better than "big navi" because it has such a good DP (FP64) throughput. Also the memory is both large and fast. In addition to that, the prices for Radeon VII moved down a bit.

2020-02-01, 22:22   #1809
xx005fs
"Eric"
Jan 2018
USA

211 Posts

Quote:
Originally Posted by JCoveiro
Hi!

Can someone help me?
I have an Nvidia GTX1660 running gpuowl at around 8250 us/it (FFT 5632K).
With some overclocking I can get under 8000 us/it, but I'm not sure how to test the GPU more thoroughly for errors, or how to tune it with -use options. Can someone help me out?

More questions:
1. I'm considering buying 2x Radeon VII; or should I wait for Big Navi?
2. Does anyone have AMD 5700 XT benchmarks to compare with Radeon VII?
3. CUDALucas seems to run slower than gpuowl. Are there any other options?

Thanks!
You have an Nvidia Turing GPU, which is amazing for trial factoring, and the 1660 is particularly efficient at that workload thanks to its 1080 Ti-like performance at significantly lower power draw. For trial factoring, overclocking the core will help, while overclocking the memory won't change anything except waste more power. Go ahead and try that out if you want to factor some numbers.

A1: Definitely buy 2 Radeon VII over Big Navi. I seriously doubt AMD will put FP64 performance on Big Navi, since the norm right now for gaming GPUs is to cut down FP64 as much as possible to save die space for ray tracing or shaders.
A2: I think OpenCL is still broken on Navi GPUs, and it runs much more stably on GCN GPUs. Even if it isn't broken, I assume the 5700 XT would perform slightly better than a stock Vega 56 in PRP, so around 3000us/it for 5632K FFT. A Radeon VII should get close to 1000us/it (I don't own one myself, but that is what I remember from other owners' benchmarks).
A3: gpuowl is already the fastest option for primality tests. Maybe future optimizations will make it even faster, but for now it is way faster than CUDALucas on memory-bound GPUs such as Titan V or Radeon VII (the latter can't run CUDALucas at all, and on Titan V gpuowl is 2x faster). Either way, if you own a modern Nvidia GPU (supporting OpenCL 2.0 and above) or an AMD GPU, you should always run gpuowl over CUDALucas or CLLucas because of its superior error checking algorithm, which could potentially eliminate the need for double checking.

Last fiddled with by xx005fs on 2020-02-01 at 22:24

2020-02-01, 22:36   #1810
JCoveiro
"Jorge Coveiro"
Nov 2006
Moura, Portugal

11010₂ Posts

Quote:
Originally Posted by preda View Post
My expectation is that Radeon VII will still be better than "big navi" because it has such a good DP (FP64) throughput. Also the memory is both large and fast. In addition to that, the prices for Radeon VII moved down a bit.
Radeon VII prices are still high here, at around 800€ each.
But I think they're a good investment anyway (for this kind of project).
I hope they come down a bit more, since AMD has discontinued them.

2020-02-01, 22:52   #1811
JCoveiro
"Jorge Coveiro"
Nov 2006
Moura, Portugal

2×13 Posts

Quote:
Originally Posted by xx005fs
You have an Nvidia Turing GPU, which is amazing for trial factoring, and the 1660 is particularly efficient at that workload thanks to its 1080 Ti-like performance at significantly lower power draw. For trial factoring, overclocking the core will help, while overclocking the memory won't change anything except waste more power. Go ahead and try that out if you want to factor some numbers.

A1: Definitely buy 2 Radeon VII over Big Navi. I seriously doubt AMD will put FP64 performance on Big Navi, since the norm right now for gaming GPUs is to cut down FP64 as much as possible to save die space for ray tracing or shaders.
A2: I think OpenCL is still broken on Navi GPUs, and it runs much more stably on GCN GPUs. Even if it isn't broken, I assume the 5700 XT would perform slightly better than a stock Vega 56 in PRP, so around 3000us/it for 5632K FFT. A Radeon VII should get close to 1000us/it (I don't own one myself, but that is what I remember from other owners' benchmarks).
A3: gpuowl is already the fastest option for primality tests. Maybe future optimizations will make it even faster, but for now it is way faster than CUDALucas on memory-bound GPUs such as Titan V or Radeon VII (the latter can't run CUDALucas at all, and on Titan V gpuowl is 2x faster). Either way, if you own a modern Nvidia GPU (supporting OpenCL 2.0 and above) or an AMD GPU, you should always run gpuowl over CUDALucas or CLLucas because of its superior error checking algorithm, which could potentially eliminate the need for double checking.
Thanks for the answers!

Well... the AMD 5700 XT is a lot cheaper than the Radeon VII.
It's almost half the price of the Radeon VII.
Also, 3000us/it for the 5700 XT is still good, but 1000us/it for the Radeon VII is awesome!
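
To put those per-iteration figures in perspective, they can be turned into rough wall-clock time per test. A minimal sketch, assuming a PRP test of 2^p - 1 needs about p iterations and ignoring error-check overhead; the exponent 99753809 is the one from my GTX1660 runs at FFT 5632K, and the 3000 and 1000 us/it values are the estimates quoted above, not measurements.

```python
# Sketch: convert microseconds/iteration into days per PRP test.
# Assumption: a PRP test of 2^p - 1 takes roughly p squaring iterations.

SECONDS_PER_DAY = 86_400

def days_per_test(exponent: int, us_per_iter: float) -> float:
    """Approximate run time in days for one test at a steady iteration time."""
    return exponent * us_per_iter * 1e-6 / SECONDS_PER_DAY

EXPONENT = 99753809  # uses FFT 5632K

for label, us in [
    ("GTX1660 (measured)", 8250),
    ("5700 XT (estimate)", 3000),
    ("Radeon VII (estimate)", 1000),
]:
    print(f"{label}: {days_per_test(EXPONENT, us):.1f} days")
```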

2020-02-01, 22:58   #1812
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

11157₈ Posts

Quote:
Originally Posted by JCoveiro
Hi!

Can someone help me?
I have an Nvidia GTX1660 running gpuowl at around 8250 us/it (FFT 5632K).
With some overclocking I can get under 8000 us/it, but I'm not sure how to test the GPU more thoroughly for errors, or how to tune it with -use options. Can someone help me out?

More questions:
1. I'm considering buying 2x Radeon VII; or should I wait for Big Navi?
2. Does anyone have AMD 5700 XT benchmarks to compare with Radeon VII?
3. CUDALucas seems to run slower than gpuowl. Are there any other options?

Thanks!
A GTX1660 is so much better at TF, relatively speaking, that it's probably a waste to run it on gpuowl instead, even though gpuowl is excellent. But your kit, your choice. CUDALucas has not had any significant development in years, so it has naturally fallen behind. Preda, Prime95, and others have done a great job on gpuowl speed and other improvements.
"More questions" has been covered pretty well already by others.

For gpuowl -use option timing and tuning, I use the Windows batch file attached.
Pass zero and one run together; other passes individually. Edit the gotos and sets from one pass to the next, to change the control flow and -use options in effect, respectively.
That is what I did to produce my previous posts of tuning results. See the comments at both ends of the file, for more info. (Had to zip it, the forum won't accept a .bat file.)
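
For those not on Windows, the same pass structure can be sketched in a few lines of Python. This is not the contents of the attached batch file, just an illustration of the idea: hold a base option list fixed, vary one candidate at a time, and time each with -time -iters 10000. The executable name "gpuowl" and the function name are assumptions; the exponent is assumed to come from the usual work file.

```python
# Sketch: build the gpuowl command lines for one tuning pass.
# Assumptions: the executable is called "gpuowl" and the work (exponent)
# is supplied the usual way; only -time, -iters and -use are set here.

def tuning_commands(base_options, candidates, exe="gpuowl", iters=10000):
    """One command per candidate option, each added to the fixed base list."""
    commands = []
    for candidate in candidates:
        use = ",".join(list(base_options) + [candidate])
        commands.append([exe, "-time", "-iters", str(iters), "-use", use])
    return commands

# Example pass: compare the unroll variants on top of NO_ASM.
unroll_pass = tuning_commands(
    ["NO_ASM"],
    ["UNROLL_NONE", "UNROLL_WIDTH", "UNROLL_HEIGHT",
     "UNROLL_MIDDLEMUL1", "UNROLL_MIDDLEMUL2"],
)
for cmd in unroll_pass:
    print(" ".join(cmd))
```

Each list can be handed to subprocess.run as is; the winner of one pass then joins the base list for the next pass.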

Please post your tuning results.
Attached Files
File Type: 7z gwtime.7z (1.8 KB, 40 views)

Last fiddled with by kriesel on 2020-02-01 at 23:01

2020-02-02, 00:58   #1813
JCoveiro
"Jorge Coveiro"
Nov 2006
Moura, Portugal

2·13 Posts
Bug!!!

Quote:
Originally Posted by kriesel
A GTX1660 is so much better at TF, relatively speaking, that it's probably a waste to run it on gpuowl instead, even though gpuowl is excellent. But your kit, your choice. CUDALucas has not had any significant development in years, so it has naturally fallen behind. Preda, Prime95, and others have done a great job on gpuowl speed and other improvements.
"More questions" has been covered pretty well already by others.

For gpuowl -use option timing and tuning, I use the Windows batch file attached.
Pass zero and one run together; other passes individually. Edit the gotos and sets from one pass to the next, to change the control flow and -use options in effect, respectively.
That is what I did to produce my previous posts of tuning results. See the comments at both ends of the file, for more info. (Had to zip it, the forum won't accept a .bat file.)

Please post your tuning results.
Thanks!

But first, I just want to say that there is a bug in the program.

I'm using gpuowl v6.11-134-g1e0ce1d.

#####################################

Running the batch outputs the following errors:

Error#1
Running the Windows batch file at:
2020-02-01 23:55:14 config: -time -iters 10000 -use NO_ASM,UNROLL_NONE
outputs some errors, ending with:
2020-02-01 23:55:14 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

Error#2
Running the Windows batch file at:
2020-02-01 23:55:14 config: -time -iters 10000 -use NO_ASM,UNROLL_WIDTH
outputs some errors, ending with:
2020-02-01 23:55:15 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

Error#3
Running the Windows batch file at:
2020-02-01 23:55:15 config: -time -iters 10000 -use NO_ASM,UNROLL_HEIGHT
outputs some errors, ending with:
2020-02-01 23:55:15 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

Error#4
Running the Windows batch file at:
2020-02-01 23:55:15 config: -time -iters 10000 -use NO_ASM,UNROLL_MIDDLEMUL1
outputs some errors, ending with:
2020-02-01 23:55:16 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

Error#5
Running the Windows batch file at:
2020-02-01 23:55:16 config: -time -iters 10000 -use NO_ASM,UNROLL_MIDDLEMUL2
outputs some errors, ending with:
2020-02-01 23:55:16 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

#####################################

Here are some more details on Error#1:

Code:
2020-02-01 23:55:14 config: -time -iters 10000 -use NO_ASM,UNROLL_NONE 
2020-02-01 23:55:14 device 0, unique id ''
2020-02-01 23:55:14 GeForce GTX 1660-0 99753809 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.30 bits/word
2020-02-01 23:55:14 GeForce GTX 1660-0 OpenCL args "-DEXP=99753809u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xd.064531a6f6b48p-3 -DIWEIGHT_STEP=0x9.d3e00e7c301p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DNO_ASM=1 -DUNROLL_NONE=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-02-01 23:55:14 GeForce GTX 1660-0 OpenCL compilation error -11 (args -DEXP=99753809u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xd.064531a6f6b48p-3 -DIWEIGHT_STEP=0x9.d3e00e7c301p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DNO_ASM=1 -DUNROLL_NONE=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0 -DNO_ASM=1)
2020-02-01 23:55:14 GeForce GTX 1660-0 <kernel>:1386:3: error: expected identifier or '('
  for (i32 s = 4; s >= 0; s -= 2) {
  ^
<kernel>:1394:3: error: expected identifier or '('
  for (i32 s = 4; s >= 0; s -= 2) {
  ^
<kernel>:1404:3: error: expected identifier or '('
  for (i32 s = 3; s >= 0; s -= 3) {
  ^
<kernel>:1412:3: error: expected identifier or '('
  for (i32 s = 3; s >= 0; s -= 3) {
  ^
<kernel>:1422:3: error: expected identifier or '('
  for (i32 s = 6; s >= 0; s -= 2) {
  ^
<kernel>:1430:3: error: expected identifier or '('
  for (i32 s = 6; s >= 0; s -= 2) {
  ^
<kernel>:1440:3: error: expected identifier or '('
  for (i32 s = 6; s >= 0; s -= 3) {
  ^
<kernel>:1448:3: error: expected identifier or '('
  for (i32 s = 6; s >= 0; s -= 3) {
  ^
<kernel>:1458:3: error: expected identifier or '('
  for (i32 s = 5; s >= 2; s -= 3) {
  ^
<kernel>:1502:3: error: expected identifier or '('
  for (i32 s = 5; s >= 2; s -= 3) {
  ^
<kernel>:2478:3: error: expected identifier or '('
  for (i32 i = 0; i < MIDDLE; ++i) {
  ^

2020-02-01 23:55:14 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build
2020-02-01 23:55:14 GeForce GTX 1660-0 Bye
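
Since the five failures produce long, nearly identical output, a small script helps to compare them. A sketch, assuming the compiler diagnostics keep the `<kernel>:LINE:COL: error: MESSAGE` shape seen above; the function name and the shortened sample are only for illustration.

```python
# Sketch: pull the OpenCL compiler diagnostics out of a gpuowl log so that
# failing runs can be compared by kernel line number and message.
import re
from collections import Counter

# Matches e.g. "<kernel>:1386:3: error: expected identifier or '('"
KERNEL_ERROR = re.compile(r"<kernel>:(\d+):(\d+): error: (.*)")

def kernel_errors(log_text: str):
    """Return a list of (line, column, message) for each compiler error."""
    return [(int(line), int(col), msg.strip())
            for line, col, msg in KERNEL_ERROR.findall(log_text)]

# Shortened sample taken from the log above.
sample = """2020-02-01 23:55:14 GeForce GTX 1660-0 <kernel>:1386:3: error: expected identifier or '('
  for (i32 s = 4; s >= 0; s -= 2) {
  ^
<kernel>:2478:3: error: expected identifier or '('
  for (i32 i = 0; i < MIDDLE; ++i) {
  ^
"""

errors = kernel_errors(sample)
print([line for line, _, _ in errors])          # kernel line numbers
print(Counter(msg for _, _, msg in errors))     # how often each message occurs
```

In the full log every error has the same message and points at a `for` statement, which suggests one common cause rather than eleven separate problems.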

Last fiddled with by JCoveiro on 2020-02-02 at 01:15

2020-02-02, 01:15   #1814
xx005fs
"Eric"
Jan 2018
USA

211 Posts

Quote:
Originally Posted by kriesel View Post
Please post your tuning results.
I ran this file on my Titan V to try out the most recent update, but I got consistently slower results (657us/it vs 632us/it) compared to version 6.11-113-g6ecd9a2, which I am currently running. It seems the default Nvidia optimization settings don't play well with the Titan V.

Last fiddled with by xx005fs on 2020-02-02 at 01:15

2020-02-02, 01:41   #1815
JCoveiro
"Jorge Coveiro"
Nov 2006
Moura, Portugal

2·13 Posts
Bug#2

I have found another bug, while trying to test M47 (a lower exponent).

Code:
2020-02-02 01:36:38 gpuowl v6.11-134-g1e0ce1d
2020-02-02 01:36:38 Note: not found 'config.txt'
2020-02-02 01:36:38 config: -use UNROLL_ALL,WORKINGIN4,WORKINGOUT4,T2_SHUFFLE,CARRY64,FANCYMIDDLEMUL1,LESS_ACCURATE
2020-02-02 01:36:38 device 0, unique id ''
2020-02-02 01:36:38 GeForce GTX 1660-0 43112609 FFT 2304K: Width 8x8, Height 256x8, Middle 9; 18.27 bits/word
2020-02-02 01:36:39 GeForce GTX 1660-0 OpenCL args "-DEXP=43112609u -DWIDTH=64u -DSMALL_HEIGHT=2048u -DMIDDLE=9u -DWEIGHT_STEP=0xd.3ca600d8f455p-3 -DIWEIGHT_STEP=0x9.ab80a96f8aeap-4 -DWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-3 -DIWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-4 -DCARRY64=1 -DFANCYMIDDLEMUL1=1 -DLESS_ACCURATE=1 -DT2_SHUFFLE=1 -DUNROLL_ALL=1 -DWORKINGIN4=1 -DWORKINGOUT4=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-02-02 01:36:39 GeForce GTX 1660-0 OpenCL compilation error -11 (args -DEXP=43112609u -DWIDTH=64u -DSMALL_HEIGHT=2048u -DMIDDLE=9u -DWEIGHT_STEP=0xd.3ca600d8f455p-3 -DIWEIGHT_STEP=0x9.ab80a96f8aeap-4 -DWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-3 -DIWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-4 -DCARRY64=1 -DFANCYMIDDLEMUL1=1 -DLESS_ACCURATE=1 -DT2_SHUFFLE=1 -DUNROLL_ALL=1 -DWORKINGIN4=1 -DWORKINGOUT4=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0 -DNO_ASM=1)
2020-02-02 01:36:39 GeForce GTX 1660-0 <kernel>:2009:2: error: WORKINGOUT4 not compatible with this FFT size
#error WORKINGOUT4 not compatible with this FFT size
 ^

2020-02-02 01:36:39 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build
2020-02-02 01:36:39 GeForce GTX 1660-0 Bye
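
Two things stand out in this log. The compile error itself says WORKINGOUT4 is not compatible with this FFT size, and the option is spelled FANCYMIDDLEMUL1 here while earlier posts spell it FANCY_MIDDLEMUL1. A pre-flight check for misspelled names could look like the sketch below. The list of known names is only what appears in this thread, not the authoritative list from the gpuowl source, and the function name is made up.

```python
# Sketch: flag -use option names that do not match any name seen in this
# thread, and suggest the closest known spelling.
# Assumption: KNOWN_IN_THREAD is incomplete; it is not gpuowl's real list.
import difflib

KNOWN_IN_THREAD = {
    "NO_ASM", "UNROLL_ALL", "UNROLL_NONE", "UNROLL_WIDTH", "UNROLL_HEIGHT",
    "UNROLL_MIDDLEMUL1", "UNROLL_MIDDLEMUL2", "MERGED_MIDDLE",
    "WORKINGIN", "WORKINGIN1", "WORKINGIN1A", "WORKINGIN2", "WORKINGIN3",
    "WORKINGIN4", "WORKINGIN5",
    "WORKINGOUT", "WORKINGOUT0", "WORKINGOUT1", "WORKINGOUT1A", "WORKINGOUT2",
    "WORKINGOUT3", "WORKINGOUT4", "WORKINGOUT5",
    "T2_SHUFFLE", "T2_SHUFFLE_WIDTH", "T2_SHUFFLE_MIDDLE", "T2_SHUFFLE_HEIGHT",
    "T2_SHUFFLE_REVERSELINE", "CARRY32", "CARRY64",
    "FANCY_MIDDLEMUL1", "MORE_SQUARES_MIDDLEMUL1", "CHEBYSHEV_METHOD",
    "CHEBYSHEV_METHOD_FMA", "ORIGINAL_METHOD", "ORIGINAL_TWEAKED",
    "ORIG_MIDDLEMUL2", "CHEBYSHEV_MIDDLEMUL2", "ORIG_SLOWTRIG", "NEW_SLOWTRIG",
    "MORE_ACCURATE", "LESS_ACCURATE",
}

def check_use(use: str) -> dict:
    """Map each unrecognized option name to its closest known spelling(s)."""
    report = {}
    for name in use.split(","):
        if name not in KNOWN_IN_THREAD:
            report[name] = difflib.get_close_matches(
                name, sorted(KNOWN_IN_THREAD), n=1)
    return report

print(check_use("UNROLL_ALL,WORKINGIN4,WORKINGOUT4,T2_SHUFFLE,CARRY64,"
                "FANCYMIDDLEMUL1,LESS_ACCURATE"))
```

A check like this only catches spelling; it cannot tell whether an option is legal for a given FFT size, which is what the #error line reports.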

Last fiddled with by JCoveiro on 2020-02-02 at 01:42

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.