2019-06-23, 16:01   #1255
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×3²×7×47 Posts

Quote:
Originally Posted by Prime95
I thought Mihai had agreed to make type-1 residues gpuowl's default. However, it is still producing type-4.
Gpuowl implemented LL through v0.6, then PRP with residue type 4 initially (v0.7 to at least v1.1). It switched to type 1 by v1.5 and kept that through at least v3.9, then went back to type 4 when PRP-1 was added in v4. When P-1 was separated in v6.0, PRP3 remained type 4, through at least v6.5.
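
For quick lookup, the history above can be written as a small table in code. This is only a sketch of what is stated in this post: the function names are made up for illustration, the "at least" bounds are treated as the last version I can vouch for, and anything in a gap (such as v1.2 to v1.4, or after v6.5) is reported as unknown rather than guessed.

```python
def parse_version(text):
    """Turn 'v6.5' or '6.5' into a comparable tuple such as (6, 5)."""
    major, minor = text.lower().lstrip("v").split(".")[:2]
    return int(major), int(minor)


def residue_info(version):
    """Return (test kind, residue type) for a gpuowl version string.

    Ranges follow the post only; gaps and later versions return 'unknown'.
    """
    v = parse_version(version)
    if v <= (0, 6):
        return ("LL", None)        # LL only, no PRP residue type
    if (0, 7) <= v <= (1, 1):
        return ("PRP", 4)          # type 4 initially
    if (1, 5) <= v <= (3, 9):
        return ("PRP", 1)          # type 1 by v1.5, through at least v3.9
    if (4, 0) <= v < (6, 0):
        return ("PRP-1", 4)        # back to type 4 when PRP-1 was added
    if (6, 0) <= v <= (6, 5):
        return ("PRP", 4)          # P-1 separated, PRP3 still type 4
    return ("unknown", None)       # not covered by the post


for ver in ("v0.5", "v1.9", "v3.8", "v6.5", "v1.3"):
    print(ver, residue_info(ver))
```

The point of returning "unknown" for the gaps is to avoid claiming a residue type for a version that has not been checked.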

I've started a reference table, available at https://www.mersenneforum.org/showpo...3&postcount=15, that also covers a couple of other variables (such as when nonzero offset was available in gpuowl, or when the Jacobi check was available in the LL flavors). It's incomplete and a work in progress. I haven't tested, built, downloaded, or even identified the commits for all the 0.1 increment versions yet.

Some useful versions in my opinion are:
v0.5 LL with pseudorandom offset, no Jacobi check; most efficient near the upper limit of the 4M fft ~70-77M exponent; useful for helping DC past LL first tests

v0.6 LL with Jacobi check for helping DC past LL first tests done with nonzero offset; most efficient near the upper limit of the 4M fft ~70-77M exponent; I think zero offset only

v1.9 PRP DC, 4M is fast, limited to zero offset, type 1 residues. (2M, 4M, 8M: the fastest times for each that I've seen in testing on an RX480, although the driver updates necessary for v2.0 support caused a 5% slowdown that affected those timings.)

v3.8 PRP, 8M for ~150M exponents is fast; type 1 residues, zero offset limitation

v6.2-6.5 PRP, type 4 residues, many fft lengths; the speeds I've checked are competitive with the best of the previous versions; latest and greatest; limited to zero offset; separate P-1 (which runs for some people, but I've had crashes with the P-1 in every attempt)

Iteration timing benchmarks vs. a variety of gpuowl versions and fft lengths run on the same system and RX480 gpu are available at https://www.mersenneforum.org/showpo...35&postcount=2

Switching between versions and supporting multiple versions is easy. I have dozens on one system with 2 AMD gpus. I use a separate directory for each, shortcuts to get there, and simple batch files containing the executable name and the usual command line options (this is on Windows 7 or 10 typically). For example, g65.bat for V6.5 is
Code:
gpuowl-win -device 0 -carry short -fft +0 -use ORIG_X2

:dev 0 rx480, 1 rx550
:  -carry long -fft +0  -carry short -use FMA_X2  -use ORIG_X2
I find it handy to keep a reminder in comments there of which gpu model is which device number on each system, especially with 3 or more gpus per system, and to keep alternative options there in comments for fast, convenient copy/paste into the command on line one.
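
The same idea can be sketched in Python for anyone who prefers a script to a batch file. This is an illustration only, not something I run: the DEVICES table and the build_command helper are hypothetical names, the device list is the one from the comments in g65.bat, and the options are just those shown in that batch file. Other gpuowl versions may not accept the same options.

```python
# Reminder of which device number is which gpu model on this system
# (same information as the ":dev 0 rx480, 1 rx550" comment in g65.bat).
DEVICES = {0: "rx480", 1: "rx550"}


def build_command(device, carry="short", fft="+0", use="ORIG_X2",
                  exe="gpuowl-win"):
    """Return the argument list equivalent to line one of g65.bat."""
    if device not in DEVICES:
        raise ValueError(f"unknown device {device}; known: {DEVICES}")
    return [exe, "-device", str(device),
            "-carry", carry, "-fft", fft, "-use", use]


# Defaults reproduce the command in g65.bat for device 0.
print(" ".join(build_command(0)))
# Alternative options from the batch file comments, on the second gpu.
print(" ".join(build_command(1, carry="long", use="FMA_X2")))
```

The resulting list could be handed to subprocess.run from inside the directory for that version, which keeps the one-directory-per-version layout described above.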

Last fiddled with by kriesel on 2019-06-23 at 16:41