mersenneforum.org  

Old 2020-10-21, 17:23   #2520
Viliam Furik
 
Jul 2018
Martin, Slovakia

11111111₂ Posts
Default

Quote:
Originally Posted by kriesel View Post
I don't think so. Consider that the source code for mfaktc contains lots of uint32, occasional uint or uint64, and no float declarations in gpusieve.c, my_types.h; a mix of int and float in the kernel code. We want good performance in int 24 and 32, and float.
Yeah, I remember somebody telling me that. But then why is the RTX 2080Ti TF throughput consistent with its FP32 TFLOPS?

The FP32 throughput of the RTX 2080 Ti is 11.75 TFLOPS, which translates to 5875 GHz-D/D, which really is the most I can observe at stock settings.
Old 2020-10-22, 01:38   #2521
moebius
 
Jul 2009
Germany

5·7·13 Posts
Default

This one has similar speed to a GeForce GTX 980 Ti... (Once I have all the comparison values together, I should create a Top 100 ranking list.)
Code:
2020-10-21 23:33:27 Tesla T4-0 OpenCL compilation in 1.81 s
2020-10-21 23:33:29 Tesla T4-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2020-10-21 23:33:29 Tesla T4-0 validating proof residues for power 8
2020-10-21 23:33:29 Tesla T4-0 Proof using power 8
2020-10-21 23:33:34 Tesla T4-0 77936867 OK      800   0.00%; 4247 us/it; ETA 3d 19:57; 1579c241dc63eca6 (check 1.82s)
2020-10-21 23:47:52 Tesla T4-0 77936867 OK   200000   0.26%; 4299 us/it; ETA 3d 20:50; f0b04b45b0855bd2 (check 1.85s)
2020-10-22 00:02:15 Tesla T4-0 77936867 OK   400000   0.51%; 4304 us/it; ETA 3d 20:43; c03f94396a5aa29e (check 1.85s)
2020-10-22 00:16:37 Tesla T4-0 77936867 OK   600000   0.77%; 4300 us/it; ETA 3d 20:22; b9decd65ca71b629 (check 1.84s)

Last fiddled with by moebius on 2020-10-22 at 02:05
Old 2020-10-22, 04:35   #2522
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2914₁₀ Posts
Default

Quote:
Originally Posted by Viliam Furik View Post
Yeah, I remember somebody telling me that. But then why is the RTX 2080Ti TF throughput consistent with its FP32 TFLOPS?

The FP32 throughput of the RTX 2080 Ti is 11.75 TFLOPS, which translates to 5875 GHz-D/D, which really is the most I can observe at stock settings.
The RTX 20xx series has an equal number of FP32 and INT32 cores.

The RTX 30xx series is the same, but its INT32 cores can also do FP32, so it can deliver up to double the FP32 performance of the RTX 20xx series, but only equivalent INT32 performance for the same number of cores at the same frequency.
Old 2020-10-22, 19:14   #2523
Viliam Furik
 
Jul 2018
Martin, Slovakia

3×5×17 Posts
Default

Quote:
Originally Posted by Mark Rose View Post
The RTX 20xx series has an equal number of FP32 and INT32 cores.

The RTX 30xx series is the same, but its INT32 cores can also do FP32, so it can deliver up to double the FP32 performance of the RTX 20xx series, but only equivalent INT32 performance for the same number of cores at the same frequency.
Alright, sorry for my repeated false statement.

But shouldn't the code then be reworked to work with FP32? It seems like it should work, since FP32 has a much higher maximum value, and thus could potentially extend the range for the maximal exponent. (If so, please remove the minimal limit, too.)

The above is my view of how it could work; I may be completely wrong.

If it were successfully reworked, and the DPbySP experiment also turned out to be successful, GIMPS would buy out all RTX 3080s and RTX 3090s (maybe not those, they are very expensive) within a few days.
Old 2020-10-22, 22:36   #2524
M344587487
 
"Composite as Heck"
Oct 2017

2×349 Posts
Default

There is potential. It's been discussed a little on the forum, but from the sounds of it, it's not straightforward. There's no rush to buy or to experiment with an implementation: it's not like the R7, which may only have had a production run measured in tens of thousands; there will eventually be millions of the 30 series.


You may be mildly overestimating the buying power of GIMPSters ;)
Old 2020-10-23, 02:41   #2525
xx005fs
 
"Eric"
Jan 2018
USA

211 Posts
Default

Quote:
Originally Posted by moebius View Post
The AMD RX Vega 64 remains the second winner in the consumer card category.

Benchmarks of 77936867 for the following graphics cards would be interesting.
AMD Radeon RX 6900/6800/6700 XT when available
Nvidia GeForce RTX 3080/3070, 3060 TI, 2080 Ti
Nvidia Titan RTX, Titan V
Nvidia A100 SXM4

Stock: 1600MHz Core, 850MHz HBM2 memory, 250W
Code:
gpuowl-win -prp 77936867 -maxAlloc 8192 -nospin
2020-10-22 19:30:10 gpuowl v7.0-66-gebe49cc
2020-10-22 19:30:10 Note: not found 'config.txt'
2020-10-22 19:30:10 config: -prp 77936867 -maxAlloc 8192 -nospin
2020-10-22 19:30:10 device 0, unique id ''
2020-10-22 19:30:10 TITAN V-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-10-22 19:30:10 TITAN V-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DCARRY64=1 -DCARRYM64=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-10-22 19:30:10 TITAN V-0 77936867

2020-10-22 19:30:10 TITAN V-0 77936867 OpenCL compilation in 0.01 s
2020-10-22 19:30:10 TITAN V-0 77936867 maxAlloc: 8.0 GB
2020-10-22 19:30:10 TITAN V-0 77936867 P1(0) 0 bits
2020-10-22 19:30:10 TITAN V-0 77936867 PRP starting from beginning
2020-10-22 19:30:10 TITAN V-0 77936867 OK         0 loaded: blockSize 400, 0000000000000003
2020-10-22 19:30:10 TITAN V-0 77936867 validating proof residues for power 8
2020-10-22 19:30:10 TITAN V-0 77936867 Proof using power 8
2020-10-22 19:30:11 TITAN V-0 77936867 OK       800   0.00% 1579c241dc63eca6  596 us/it + check 0.27s + save 0.11s; ETA 12:54
2020-10-22 19:30:16 TITAN V-0 77936867        10000   0.01% fc4f135f7cf4ad29  588 us/it
2020-10-22 19:30:22 TITAN V-0 77936867        20000   0.03% 3cd1bd9d5e09cbc5  589 us/it
2020-10-22 19:30:28 TITAN V-0 77936867        30000   0.04% c4e0ff35e3290d98  590 us/it
2020-10-22 19:30:34 TITAN V-0 77936867        40000   0.05% dffe1b1b0d748128  590 us/it
2020-10-22 19:30:40 TITAN V-0 77936867        50000   0.06% 52e286945371ed29  590 us/it
2020-10-22 19:30:46 TITAN V-0 77936867        60000   0.08% 0945da4dc08bdd95  590 us/it
2020-10-22 19:30:52 TITAN V-0 77936867        70000   0.09% 7131fa4eb77f4bb2  590 us/it
2020-10-22 19:30:58 TITAN V-0 77936867        80000   0.10% 8d76071d27ee4221  591 us/it
2020-10-22 19:31:04 TITAN V-0 77936867        90000   0.12% 0bacff453b2f470e  590 us/it
2020-10-22 19:31:10 TITAN V-0 77936867       100000   0.13% 6d7296b9e2830f50  591 us/it
2020-10-22 19:31:12 TITAN V-0 77936867 Stopping, please wait..
2020-10-22 19:31:13 TITAN V-0 77936867 OK    104400   0.13% 587552d3b9350467  592 us/it + check 0.27s + save 0.11s; ETA 12:48
2020-10-22 19:31:13 TITAN V-0 Exiting because "stop requested"
2020-10-22 19:31:13 TITAN V-0 Bye
Tuned: 1400MHz Core, 1040MHz HBM2, 170W
Code:
gpuowl-win -prp 77936867 -maxAlloc 8192 -nospin
2020-10-22 19:34:11 gpuowl v7.0-66-gebe49cc
2020-10-22 19:34:11 Note: not found 'config.txt'
2020-10-22 19:34:11 config: -prp 77936867 -maxAlloc 8192 -nospin
2020-10-22 19:34:11 device 0, unique id ''
2020-10-22 19:34:11 TITAN V-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-10-22 19:34:11 TITAN V-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DCARRY64=1 -DCARRYM64=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-10-22 19:34:11 TITAN V-0 77936867

2020-10-22 19:34:11 TITAN V-0 77936867 OpenCL compilation in 0.01 s
2020-10-22 19:34:11 TITAN V-0 77936867 maxAlloc: 8.0 GB
2020-10-22 19:34:11 TITAN V-0 77936867 P1(0) 0 bits
2020-10-22 19:34:11 TITAN V-0 77936867 PRP starting from beginning
2020-10-22 19:34:12 TITAN V-0 77936867 OK         0 loaded: blockSize 400, 0000000000000003
2020-10-22 19:34:12 TITAN V-0 77936867 validating proof residues for power 8
2020-10-22 19:34:12 TITAN V-0 77936867 Proof using power 8
2020-10-22 19:34:12 TITAN V-0 77936867 OK       800   0.00% 1579c241dc63eca6  500 us/it + check 0.23s + save 0.11s; ETA 10:49
2020-10-22 19:34:17 TITAN V-0 77936867        10000   0.01% fc4f135f7cf4ad29  494 us/it
2020-10-22 19:34:22 TITAN V-0 77936867        20000   0.03% 3cd1bd9d5e09cbc5  495 us/it
2020-10-22 19:34:27 TITAN V-0 77936867        30000   0.04% c4e0ff35e3290d98  496 us/it
2020-10-22 19:34:32 TITAN V-0 77936867        40000   0.05% dffe1b1b0d748128  497 us/it
2020-10-22 19:34:37 TITAN V-0 77936867        50000   0.06% 52e286945371ed29  497 us/it
2020-10-22 19:34:42 TITAN V-0 77936867        60000   0.08% 0945da4dc08bdd95  498 us/it
2020-10-22 19:34:47 TITAN V-0 77936867        70000   0.09% 7131fa4eb77f4bb2  499 us/it
2020-10-22 19:34:52 TITAN V-0 77936867        80000   0.10% 8d76071d27ee4221  499 us/it
2020-10-22 19:34:57 TITAN V-0 77936867        90000   0.12% 0bacff453b2f470e  500 us/it
2020-10-22 19:35:02 TITAN V-0 77936867       100000   0.13% 6d7296b9e2830f50  500 us/it
2020-10-22 19:35:07 TITAN V-0 77936867       110000   0.14% 8cbfd4435622bda7  500 us/it
2020-10-22 19:35:08 TITAN V-0 77936867 Stopping, please wait..
2020-10-22 19:35:09 TITAN V-0 77936867 OK    113600   0.15% fb675f1fc2063c9b  501 us/it + check 0.23s + save 0.11s; ETA 10:50
2020-10-22 19:35:09 TITAN V-0 Exiting because "stop requested"
2020-10-22 19:35:09 TITAN V-0 Bye

It seems that the new version doesn't let me use CARRY32, which the older 6.11 version did and which appears to run faster. Here's the result for 6.11 on the same exponent:
Code:
gpuowl -device 0 -carry short -use CARRY32,ORIG_SLOWTRIG,IN_WG=128,IN_SIZEX=16,IN_SPACING=4,OUT_WG=128,OUT_SIZEX=16,OUT_SPACING=4 -nospin -block 100 -maxAlloc 10000 -B1 750000 -rB2 20 -prp 77936867
2020-10-22 19:36:40 gpuowl v6.11-364-g36f4e2a
2020-10-22 19:36:40 Note: not found 'config.txt'
2020-10-22 19:36:40 config: -device 0 -carry short -use CARRY32,ORIG_SLOWTRIG,IN_WG=128,IN_SIZEX=16,IN_SPACING=4,OUT_WG=128,OUT_SIZEX=16,OUT_SPACING=4 -nospin -block 100 -maxAlloc 10000 -B1 750000 -rB2 20 -prp 77936867
2020-10-22 19:36:40 device 0, unique id ''
2020-10-22 19:36:40 TITAN V-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-10-22 19:36:40 TITAN V-0 Expected maximum carry32: 583B0000
2020-10-22 19:36:40 TITAN V-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5 -DCARRY32=1 -DIN_SIZEX=16 -DIN_SPACING=4 -DIN_WG=128 -DORIG_SLOWTRIG=1 -DOUT_SIZEX=16 -DOUT_SPACING=4 -DOUT_WG=128  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-10-22 19:36:40 TITAN V-0

2020-10-22 19:36:40 TITAN V-0 OpenCL compilation in 0.01 s
2020-10-22 19:36:40 TITAN V-0 77936867 OK        0 loaded: blockSize 100, 0000000000000003
2020-10-22 19:36:40 TITAN V-0 validating proof residues for power 8
2020-10-22 19:36:40 TITAN V-0 Proof using power 8
2020-10-22 19:36:41 TITAN V-0 77936867 OK      200   0.00%;  502 us/it; ETA 0d 10:53; 2619e0f0cb78fe50 (check 0.09s)
2020-10-22 19:38:16 TITAN V-0 77936867 OK   200000   0.26%;  478 us/it; ETA 0d 10:19; f0b04b45b0855bd2 (check 0.20s)
2020-10-22 19:39:52 TITAN V-0 77936867 OK   400000   0.51%;  480 us/it; ETA 0d 10:21; c03f94396a5aa29e (check 0.09s)
2020-10-22 19:40:50 TITAN V-0 Stopping, please wait..
2020-10-22 19:40:50 TITAN V-0 77936867 OK   519700   0.67%;  480 us/it; ETA 0d 10:20; 19d648e17333ad91 (check 0.09s)
2020-10-22 19:40:50 TITAN V-0 Exiting because "stop requested"
2020-10-22 19:40:50 TITAN V-0 Bye

Last fiddled with by xx005fs on 2020-10-23 at 02:47
Old 2020-10-23, 03:55   #2526
moebius
 
Jul 2009
Germany

5×7×13 Posts
Default

Quote:
Originally Posted by xx005fs View Post
2020-10-22 19:38:16 TITAN V-0 77936867 OK 200000 0.26%; 478 us/it; ETA 0d 10:19; f0b04b45b0855bd2 (check 0.20s)
Thank you very much. As I expected, the Titan V is so far the second best at 478 us/it, compared with 442 us/it for a Tesla V100-SXM2-16GB. I'm already working on an application-oriented top list for gpuowl, which I will publish here in the forum.
Old 2020-10-23, 04:43   #2527
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7158₁₀ Posts
Default

Quote:
Originally Posted by moebius View Post
Thank you very much. As I expected, the Titan V is so far the second best at 478 us/it, compared with 442 us/it for a Tesla V100-SXM2-16GB. I'm already working on an application-oriented top list for gpuowl, which I will publish here in the forum.
For comparison, one of my Radeon VIIs:

undervolted, underclocked to sclk=3, mem overclocked to 1200:

Code:
2020-10-23 04:23:47 gfx906+sram-ecc-0 77936867 OK      800   0.00%;  556 us/it; ETA 0d 12:02; 1579c241dc63eca6 (check 0.39s)
2020-10-23 04:24:04 gfx906+sram-ecc-0 77936867 OK    30000   0.04%;  561 us/it; ETA 0d 12:08; c4e0ff35e3290d98 (check 0.39s)
2020-10-23 04:24:21 gfx906+sram-ecc-0 77936867 OK    60000   0.08%;  560 us/it; ETA 0d 12:07; 0945da4dc08bdd95 (check 0.39s)
two instances:

Code:
2020-10-23 04:30:52 gfx906+sram-ecc-0 77936867 OK   270000   0.35%;  985 us/it; ETA 0d 21:15; dc349756c5f05abf (check 0.57s)
2020-10-23 04:31:01 gfx906+sram-ecc-0 77936867 OK   270000   0.35%;  986 us/it; ETA 0d 21:16; dc349756c5f05abf (check 0.57s)
2020-10-23 04:32:22 gfx906+sram-ecc-0 77936867 OK   360000   0.46%;  985 us/it; ETA 0d 21:14; 992df79b843f90de (check 0.57s)
2020-10-23 04:32:32 gfx906+sram-ecc-0 77936867 OK   360000   0.46%;  985 us/it; ETA 0d 21:14; 992df79b843f90de (check 0.57s)
for an average of 493 us/it.

undervolted, underclocked (slightly) to sclk=4, mem overclocked to 1200:

Code:
2020-10-23 04:26:43 gfx906+sram-ecc-0 77936867 OK    90000   0.12%;  526 us/it; ETA 0d 11:22; 0bacff453b2f470e (check 0.38s)
2020-10-23 04:26:47 gfx906+sram-ecc-0 77936867 OK    97200   0.12%;  525 us/it; ETA 0d 11:22; ddaaad369befab47 (check 0.36s)
two instances:

Code:
2020-10-23 04:27:51 gfx906+sram-ecc-0 77936867 OK   150000   0.19%;  920 us/it; ETA 0d 19:53; 127631386c6a9b17 (check 0.55s)
2020-10-23 04:28:01 gfx906+sram-ecc-0 77936867 OK   150000   0.19%;  920 us/it; ETA 0d 19:53; 127631386c6a9b17 (check 0.54s)
2020-10-23 04:28:19 gfx906+sram-ecc-0 77936867 OK   180000   0.23%;  920 us/it; ETA 0d 19:53; 6bee5d054f770861 (check 0.54s)
2020-10-23 04:28:29 gfx906+sram-ecc-0 77936867 OK   180000   0.23%;  920 us/it; ETA 0d 19:53; 6bee5d054f770861 (check 0.56s)
for an average of 460 us/it.
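
The averages quoted above are simple arithmetic: when n instances share one GPU and each reports t us/it, the card as a whole completes one iteration every t/n us. Here is a minimal sketch of that calculation. The function names are illustrative and not part of gpuowl, and the ETA helper assumes that a PRP test needs roughly as many iterations as the exponent, which seems to match the ETAs printed in the logs.

```python
def effective_us_per_it(per_instance_us, instances):
    """Combined time per iteration when several instances share one GPU."""
    return per_instance_us / instances

def eta_hours(exponent, us_per_it, done=0):
    """Rough remaining time, assuming about one iteration per exponent bit."""
    return (exponent - done) * us_per_it * 1e-6 / 3600

print(effective_us_per_it(985, 2))   # 492.5, quoted above as 493 us/it
print(effective_us_per_it(920, 2))   # 460.0
print(round(eta_hours(77936867, 525), 2), "hours for one instance at 525 us/it")
```

By this measure, two instances at sclk=4 (460 us/it combined) beat a single instance at the same settings (525 us/it), at the cost of each individual test taking longer.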
Old 2020-10-23, 05:00   #2528
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

2²·7·11·29 Posts
Default

Quote:
Originally Posted by Viliam Furik View Post
But shouldn't the code then be reworked to work with FP32? It seems like it should work, since FP32 has a much higher maximum value.
Yep, but things are not exactly so. Having a "lot higher" maximum value comes with a penalty in accuracy. It can't store all the numbers in between, but only a (very VERY) small part of them.

Say, for example, you want to rewrite mfaktc (which uses int32) to use FP32, to speed it up on cards which have "pure FP32" hardware. On most cards, the same units do either integer or fp32 processing, so you won't gain anything, but some gaming cards have dedicated fp32 cores inside, which suck at integer arithmetic, and you may get a speedup by doing so. But...

A 32-bit register can only hold one of 2^32 different values, regardless of how you "see" this register (i.e. regardless of the encoding you associate with it). In the "unsigned int32" encoding, you can put there any number from 0 to 2^32-1 exactly, i.e. losslessly, without losing information. It means that when you write 89, you read back 89.

In the "fp32" encoding, you can only put there a much smaller share of the numbers from this range losslessly. Actually, only the integers up to 2^24, about 0.4% of the range, are all stored exactly; above that, only every 2nd, then every 4th, and so on up to every 256th integer survives. For all the other "larger" numbers (or those smaller than 1, fractional, by the way), you write "x", but when you read back, you read "x+epsilon" or "x-epsilon". The encoding is not "exact". It is the same idea as when you count to a hundred: you do it one by one, but when you get higher, you say "a few hundred", or "a few thousand", or "the budget of this project is about five and a half million"; you are no longer interested in the exact value, and look only at the most significant digits, as many as you can remember (store in the "space" in your brain). That's not useful for integer arithmetic: you will need to use two FP32 registers to store the same information as you store in one int32 register, and that is worth it only if you can achieve double the speed (well, about, in rough terms; things are more complex than that).
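
To make the round-trip loss concrete, here is a small sketch that pushes integers through an IEEE 754 binary32 value using Python's standard `struct` module. The helper names `to_fp32` and `exact_in_fp32` are made up for this illustration. It also counts how many of the 2^32 unsigned values survive: every integer below 2^24 is exact (about 0.4% of the range), and counting the sparse exact values above that brings the total to just under 2%.

```python
import struct

def to_fp32(x):
    """Round a number to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

def exact_in_fp32(n):
    """True if the integer n survives a round trip through fp32."""
    return to_fp32(n) == n

print(exact_in_fp32(89))           # True: small integers are stored exactly
print(int(to_fp32(4000000001)))    # 4000000000: "x - epsilon" comes back

# Between 2^24 and 2^25 only every second integer is representable.
window = range(2**24, 2**24 + 1024)
print(sum(exact_in_fp32(n) for n in window))   # 512

# All integers below 2^24, plus 2^23 values in each of the 8 ranges
# [2^24, 2^25) ... [2^31, 2^32).
exact_count = 2**24 + 8 * 2**23
print(exact_count / 2**32)   # 0.01953125
print(2**24 / 2**32)         # 0.00390625
```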

The whole issue is that, in 32-bit floats, numbers are represented as "sign*1.fraction*2^exponent", where the sign, fraction, and exponent are all stored inside the 32-bit register, so they take 32 bits in total, and their positions and sizes are fixed. As the sign is 1 bit, you have only 8 bits for the exponent and 23 bits for the fraction. Therefore, you can represent a very large number, like 618970019642690137449562112 (which is 2^89), by setting the exponent to 89 and the fraction to zero, but you will not be able to store most of the numbers in between, like for example 16777217 (which is 2^24+1), which is just a 25 bit number.

If you google "the smallest positive integer that can't be stored in fp32" (or just go to wikipedia and read the theory), you will find out a lot of interesting things.
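
For those who would rather compute than google, here is a short sketch that splits an fp32 value into its three fields and then looks for the first integer that does not survive. The helper names are hypothetical. The search relies on the fact that the gap between neighbouring fp32 values doubles at each power of two, so the first failure has to be of the form 2^k + 1.

```python
import struct

def fp32_fields(x):
    """Return (sign, unbiased exponent, 23-bit fraction) of x rounded to fp32."""
    bits = struct.unpack("<I", struct.pack("<f", float(x)))[0]
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127   # remove the IEEE 754 bias of 127
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

def roundtrip(x):
    """Store x as fp32 and read it back."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

print(fp32_fields(2**89))   # (0, 89, 0): exponent 89, fraction zero

# Find the first power of two after which consecutive integers stop fitting.
k = 0
while roundtrip(2**k + 1) == 2**k + 1:
    k += 1
smallest_unstorable = 2**k + 1
print(k, smallest_unstorable)             # 24 16777217
print(int(roundtrip(smallest_unstorable)))  # 16777216 is what you read back
```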

For a smaller scale, imagine you have a 3 bit register. You can store inside a number between 000 binary (decimal zero) and 111 binary (decimal 7). You can see this as an "unsigned integer on 3 bits", and then the information inside represents a number between 0 and 7, in order in binary: 000=0, 001=1, 010=2, 011=3, 100=4, 101=5, 110=6, 111=7. No other possibility.

You can also consider this as a "signed integer on 3 bits", and in that case you need a bit to store the sign. Let's say the first bit is for the sign; then the largest integer you can store there will be 3 (using the two remaining bits), and your values will be, in order: 100=-4, 101=-3, 110=-2, 111=-1, 000=0, 001=1, 010=2, 011=3. There is no other possibility (and yes, there is a reason to put them in that order: it makes additions and multiplications work properly, without changing the addition and multiplication rules).
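
The signed table above is what is usually called two's complement. A minimal sketch (the function name `signed3` is just for illustration) that reproduces the table and shows why the ordering lets ordinary wrap-around addition keep working:

```python
def signed3(bits):
    """Interpret a 3-bit pattern (0..7) as a two's complement integer."""
    return bits - 8 if bits & 0b100 else bits

for pattern in range(8):
    print(f"{pattern:03b} = {signed3(pattern)}")

# Adding the raw bit patterns and keeping 3 bits gives the right signed sum.
a, b = 0b110, 0b011                  # -2 and 3
print(signed3((a + b) & 0b111))      # 1
```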

You could also see the 3 bits as an "unsigned float on 3 bits", and in that case, the information inside will represent (I use letters for decimal numbers to avoid confusion with 0 and 1 binary): 000=100=zero*, 110=0.25, 111=0.5, 001=one, 010=3, 011=7. The advantage is that you can store "higher numbers", as well as numbers which are not integers, but you lose accuracy, as you can't store all the numbers in between. To store the integer 5 exactly, you will need two of these "3 bit registers".

So, here you can store a "larger" number (as well as a smaller, fractional one) compared with the unsigned integer, but every time you write a 4, you will read back a 3, and every time you write a 6, you will read back a 7. But yes, you can store a "larger" number, for sure.
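
The toy float can be simulated directly. The sketch below uses only the values listed in the table above and assumes that storing a number picks the nearest representable value. That rounding rule is an assumption made here for illustration, but it agrees with the 4 and 6 examples.

```python
# Values of the 3-bit toy float exactly as listed in the post
# (pattern 101 is not listed there, so it is left out here).
TOY_FLOAT = {
    "000": 0.0, "100": 0.0,
    "110": 0.25, "111": 0.5,
    "001": 1.0, "010": 3.0, "011": 7.0,
}

def store(x):
    """Return the bit pattern whose value is nearest to x (assumed rounding)."""
    return min(TOY_FLOAT, key=lambda code: abs(TOY_FLOAT[code] - x))

def load(code):
    """Read back the value that a bit pattern stands for."""
    return TOY_FLOAT[code]

for x in (1, 3, 4, 6, 7):
    code = store(x)
    print(f"write {x} -> {code} -> read {load(code)}")
```

Writing 1, 3 or 7 reads back unchanged, while 4 comes back as 3 and 6 comes back as 7, exactly the loss described above.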
--------
*Edit: note that here you have 2 possibilities to store the value "zero"; this is deliberate, because floats, in theory, NEVER represent exact values, so you may consider zero as being an infinitesimally small value, and it makes sense to have a positive and a negative one (like an "epsilon", in math, or even in programming).

Last fiddled with by LaurV on 2020-10-23 at 07:55
Old 2020-10-23, 05:19   #2529
moebius
 
Jul 2009
Germany

5·7·13 Posts
Default

Quote:
Originally Posted by Prime95 View Post
For comparison, one of my Radeon VIIs:
Code:
2020-10-23 04:26:47 gfx906+sram-ecc-0 77936867 OK    97200   0.12%;  525 us/it; ETA 0d 11:22; ddaaad369befab47 (check 0.36s)
Thanks for the trouble. I'll take the best value for one instance, because it should be a fair comparison. The only important thing is that gpuowl runs stably without errors with the selected settings.

Last fiddled with by moebius on 2020-10-23 at 05:22
Old 2020-10-23, 08:54   #2530
aheeffer
 
Aug 2020

100000₂ Posts
Default

Quote:
Originally Posted by Prime95 View Post
The second proof failure is here: 108979853

I do not know if Bruno Victal is a forum member and can comment on what may have happened.
This is weird! Just finished a PRP test for

108980089

and the result was refused, though it was assigned to me through Primenet. As I have seen the name of the PRP tester mentioned before, did this proof certification succeed?

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.