mersenneforum.org  

Old 2020-09-28, 23:19   #23
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4734₁₀ Posts

Quote:
Originally Posted by storm5510
I think I will keep my 6x until I see how this all shakes out.
There is no need to get rid of a working installation to run a different version. Just use separate folders and shortcuts.
Old 2020-09-29, 01:46   #24
preda
 
"Mihai Preda"
Apr 2015

2⁴·83 Posts

Quote:
Originally Posted by storm5510
I think I will keep my 6x until I see how this all shakes out.
That's very wise.
Old 2020-09-29, 01:50   #25
preda
 
"Mihai Preda"
Apr 2015

2460₈ Posts
Savefile transition from 6.x

Quote:
Originally Posted by preda
The savefiles are now named like this:
86059247-000014500.prp
Please either finish the exponent on 6.x and start a fresh one on 7.x, or:

carefully rename the 6.x savefile (.owl) by hand to the new numbered .prp name format:

<exp>-<iteration>.prp
where the iteration is zero-padded to 9 digits, as above. Feel free to make a backup beforehand.
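A minimal sketch of the naming rule, in Python. The helper name `prp_savefile_name` is hypothetical, and you still need to know the iteration count stored in your 6.x savefile; this only formats the new name.

```python
def prp_savefile_name(exponent: int, iteration: int) -> str:
    """Build a name of the form <exp>-<iteration>.prp, iteration padded to 9 digits."""
    if exponent <= 0:
        raise ValueError("exponent must be positive")
    if not 0 <= iteration < 10**9:
        raise ValueError("iteration must fit in 9 digits")
    return f"{exponent}-{iteration:09d}.prp"


# Reproduces the example name quoted above.
print(prp_savefile_name(86059247, 14500))  # 86059247-000014500.prp
```

Copying the old file to the new name (rather than moving it) keeps the original as the backup.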

BTW, 7.x is not ready for general use yet; there are still plenty of rough edges that I'm working on.
Old 2020-09-29, 01:59   #26
preda
 
"Mihai Preda"
Apr 2015

530₁₆ Posts
New P-1 testing

If you feel adventurous, please help with the new P-1 testing.

The bug: the most important bug we want to trigger is one where the candidate *has a factor* that should be detected according to the B1/B2 bounds, but is not detected.

How to trigger this bug:

1. Choose exponents of various sizes, each with a known factor. You should know the minimum B1/B2 required to find that factor (these follow from the factorization of factor-1).

2. Repeatedly run PRP on each exponent while changing:
- FFT size
- -maxAlloc (e.g. use 3 values: the maximum allowed on the GPU, a very small one such as 1GB or 800M, and something in between like 3GB or 7GB)
- anything else you feel like changing (-carry long, etc.)

3. Run with different bounds:
- First set a B1 large enough that it should find the factor by itself (in first stage). Run first stage. Feel free to interrupt it repeatedly (Ctrl-C) and reload, etc.

- Next set a B1 that is not large enough, and check detection in second stage. Again, interrupt/reload at will.

4. Use your imagination to torment P-1 in other ways. In general, though, always run with bounds that should detect the factor, and if it is ever *not detected*, report the bug.
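For step 1, the minimum bounds can be read off the factorization of factor-1. The sketch below uses the textbook P-1 model, which is an assumption and not necessarily what GpuOwl does internally: a prime factor q of 2^p - 1 has the form q = 2*k*p + 1, stage 1 covers every prime power up to B1, and stage 2 covers one extra prime between B1 and B2. The function names are hypothetical, and the trial division is only meant for small examples.

```python
def prime_power_factors(n: int) -> dict[int, int]:
    """Trial-division factorization of n, returned as {prime: exponent}."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def minimal_pm1_bounds(p: int, q: int) -> tuple[int, int]:
    """Smallest (B1, B2) that find factor q of 2^p - 1 under the textbook model."""
    if pow(2, p, q) != 1 or (q - 1) % (2 * p) != 0:
        raise ValueError("q is not a factor of 2^p - 1 of the form 2*k*p + 1")
    k = (q - 1) // (2 * p)
    f = prime_power_factors(k)
    if not f:                      # k == 1: nothing beyond 2*p is needed
        return 1, 1
    largest = max(f)
    if f[largest] == 1:
        # The largest prime occurs once, so stage 2 can pick it up.
        rest = [prime**e for prime, e in f.items() if prime != largest]
        b1 = max(rest, default=1)
        return b1, max(b1, largest)
    # Otherwise stage 1 has to cover every prime power on its own.
    b1 = max(prime**e for prime, e in f.items())
    return b1, b1


# Cole's factor of M67: k = 2^2 * 3^3 * 5 * 2677
print(minimal_pm1_bounds(67, 193707721))  # (27, 2677)
```

A B1 at or above the first value with a B2 at or above the second should find the factor in stage 2; a B1 at or above the second value should find it in stage 1 alone.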

If you do find a bug, try to reproduce it yourself. This helps identify the conditions that trigger it.

In general, before anybody does actual P-1 work, please run at least a few such tests. Otherwise P-1 may be broken (blind, not finding factors) and we would just be wasting cycles while imagining we're doing P-1.

thanks

PS: this testing is useful for both P-1 stages, not only first-stage, as second-stage changed too.

PPS: a good approach is to start with a simple test case, verify that it works correctly, then complicate it a bit, and repeat. Don't start with something fancy when something trivially simple might break it just as well.

Last fiddled with by preda on 2020-09-29 at 02:10
Old 2020-09-29, 02:17   #27
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

127E₁₆ Posts

There's a list of P-1 test candidates with known factors and required bounds here.
I'll add more coverage of the mersenne.org exponent range as they are found.

Last fiddled with by kriesel on 2020-09-29 at 02:25
Old 2020-09-29, 12:25   #28
preda
 
"Mihai Preda"
Apr 2015

2⁴×83 Posts

Quote:
Originally Posted by kriesel
There's a list of P-1 test candidates with known factors and required bounds here.
I'll add more coverage of the mersenne.org exponent range as they are found.
Another P-1 list can be found in GpuOwl source code
https://github.com/preda/gpuowl/blob...st-pm1/pm1.txt

For testing it's a good idea to use exponents with lower B1, B2 values as they'll complete faster.
Old 2020-09-29, 12:31   #29
firejuggler
 
Apr 2010
Over the rainbow

3²·5²·11 Posts

so, smooth factor found on mersenne.ca?
Old 2020-09-29, 12:40   #30
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×3²×263 Posts

Quote:
Originally Posted by preda
Another P-1 list can be found in GpuOwl source code
https://github.com/preda/gpuowl/blob...st-pm1/pm1.txt

For testing it's a good idea to use exponents with lower B1, B2 values as they'll complete faster.
That's a long list, all within the 86M-87M exponent range. CUDAPm1 practice was to have test exponents using a variety of FFT lengths of current and future interest. Bugs affecting only some FFT lengths or bounds could be missed more easily if the test vectors are concentrated in, or limited to, one part of the parameter space.
Old 2020-09-29, 12:51   #31
preda
 
"Mihai Preda"
Apr 2015

2⁴·83 Posts
P2 memory usage

Because I constantly need to specify "first stage" or "second stage" of P-1, from now on I'm going to use this notation: "P1" for the P-1 first stage (FS), and "P2" for the P-1 second stage (SS).

I simplified the P2 implementation; now there are only two cases for memory use in P2, let's call them "low memory" and "high memory".

a) "low memory" uses D=210, where only 24 "big" buffers are allocated for P2 (plus a few auxiliary buffers).
b) "high memory" uses D=2310, where 240 "big" buffers are allocated for P2 (plus the same number of auxiliary buffers as before).

The cost of P2 is dominated by the number "n" of primes between B1 and B2, which require about 0.85 * n muls (this value is pretty much the same for the low- and high-memory variants). On top of that comes an "overhead" for walking in steps of size D from B1 to B2, where each step requires 2 muls. This overhead turns out to be about 2% of the whole P2 in the "high memory" case and about 20% in the "low memory" case, which is why the "high memory" P2 is more efficient (by about 20%) than the "low memory" P2.

Long story short, it is good for P2 to be able to run in the "240 buffers" mode (D=2310).
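The cost model described above can be turned into a small calculation. This is only a sketch of that model (0.85 muls per prime, 2 muls per step of size D), not of GpuOwl's actual code; the function names and the bounds B1=1M, B2=10M are illustrative assumptions, so the percentages will differ somewhat from the figures quoted for real wavefront bounds.

```python
import math


def count_primes_between(lo: int, hi: int) -> int:
    """Count primes p with lo < p <= hi using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (hi + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(hi) + 1):
        if sieve[i]:
            # Clear all multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, hi + 1, i)))
    return sum(sieve[lo + 1 :])


def p2_cost(b1: int, b2: int, d: int,
            muls_per_prime: float = 0.85, muls_per_step: int = 2):
    """Return (total muls, fraction spent on stepping) under the model above."""
    prime_muls = muls_per_prime * count_primes_between(b1, b2)
    step_muls = muls_per_step * math.ceil((b2 - b1) / d)
    total = prime_muls + step_muls
    return total, step_muls / total


for d in (210, 2310):
    total, overhead = p2_cost(1_000_000, 10_000_000, d)
    print(f"D={d}: about {total:.0f} muls, stepping overhead {overhead:.1%}")
```

Because the per-prime cost is the same in both variants, the whole difference comes from taking eleven times fewer steps with D=2310.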

At the wavefront one "big" buffer is 44MB, so the "high memory" case (240 buffers, about 10.6GB, plus the auxiliary buffers) would require about -maxAlloc 11G.

Notes:
"D" above is a parameter of P2 -- it indicates the "step" of walking from B1 to B2. The values above are small primorials:
210 = 2*3*5*7
2310 = 2*3*5*7*11

"big" buffer: the buffers used have fixed length N given by the FFT size (e.g. N=5.5M for FFT=5.5M), but contain either 32-bit integers ("small" buffers) or 64-bit FP ("big" buffers).
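The buffer numbers above can be checked with a few lines of arithmetic. Two assumptions are made here that the post does not state: that "M" in the FFT size means 2^20 words, and that the buffer counts 24 and 240 correspond to the residues below D/2 that are coprime to D (which is phi(D)/2 and happens to match both figures). The function names are hypothetical.

```python
from math import gcd


def buffers_for_step(d: int) -> int:
    """Count residues r with 1 <= r < d/2 and gcd(r, d) == 1, i.e. phi(d)/2."""
    return sum(1 for r in range(1, d // 2) if gcd(r, d) == 1)


def big_buffer_bytes(fft_words: int) -> int:
    """A "big" buffer holds one 64-bit (8-byte) floating-point value per word."""
    return fft_words * 8


def p2_big_buffer_memory(d: int, fft_words: int) -> int:
    """Memory for the big P2 buffers only; auxiliary buffers are not counted."""
    return buffers_for_step(d) * big_buffer_bytes(fft_words)


MIB = 2**20
fft_words = int(5.5 * MIB)  # FFT=5.5M, assuming M = 2^20 words
for d in (210, 2310):
    mem = p2_big_buffer_memory(d, fft_words)
    print(f"D={d}: {buffers_for_step(d)} big buffers, {mem / MIB:.0f} MiB")
```

With these assumptions one big buffer comes to 44 MiB, the low-memory case to roughly 1 GiB and the high-memory case to a little over 10 GiB, in line with the -maxAlloc 11G suggestion once auxiliary buffers are added.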

Last fiddled with by preda on 2020-09-29 at 13:04
Old 2020-09-29, 12:58   #32
preda
 
"Mihai Preda"
Apr 2015

2⁴·83 Posts

Quote:
Originally Posted by kriesel
That's a long list, all within the 86M-87M exponent range. CUDAPm1 practice was to have test exponents using a variety of FFT lengths of current and future interest. Bugs affecting only some FFT lengths or bounds could be missed more easily if the test vectors are concentrated in, or limited to, one part of the parameter space.
Yes indeed (that list from gpuowl needs either updating with more varied exponents, or deleting).
Old 2020-09-29, 14:19   #33
masser
 
Jul 2003
wear a mask

5C0₁₆ Posts

Here's a list I gleaned from the mersenne.ca "Factors missed by P-1" list. Maybe try these for testing?

Code:
Pminus1=1,2,10002859,-1,55000,838750,64
Pminus1=1,2,21150827,-1,6133,596857,66
Pminus1=1,2,31919773,-1,1901,84737,66
Pminus1=1,2,48701273,-1,570000,570000,69
Pminus1=1,2,50077721,-1,280000,280000,69
Pminus1=1,2,61684171,-1,13381,2443933,74
Pminus1=1,2,72713617,-1,3847,1047701,81
Pminus1=1,2,89281183,-1,65173,1303669,80
Pminus1=1,2,95675581,-1,4493,1143563,75
Pminus1=1,2,102086261,-1,1733,7253,74
Pminus1=1,2,102227777,-1,10601,14159,74
Pminus1=1,2,102001051,-1,37123,3078469,72
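For scripting test runs over lines like these, a small parser can help. The field meanings below follow the Prime95-style layout `Pminus1=k,b,n,c,B1,B2,bits` (the number is k*b^n+c, and the last field is taken to be the trial-factoring depth in bits); that interpretation is an assumption, as is whether a given GpuOwl version accepts this exact line format. The names `Pminus1Line` and `parse_pminus1` are hypothetical.

```python
from typing import NamedTuple


class Pminus1Line(NamedTuple):
    k: int
    b: int
    n: int        # the exponent, for k=1, b=2, c=-1
    c: int
    b1: int
    b2: int
    tf_bits: int  # assumed: how far the exponent was trial-factored

    @property
    def stage1_only(self) -> bool:
        """B2 == B1 means there is no separate second stage to run."""
        return self.b2 == self.b1


def parse_pminus1(line: str) -> Pminus1Line:
    """Parse one 'Pminus1=k,b,n,c,B1,B2,bits' line with exactly seven fields."""
    key, _, value = line.strip().partition("=")
    if key != "Pminus1":
        raise ValueError(f"not a Pminus1 line: {line!r}")
    fields = [int(x) for x in value.split(",")]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    return Pminus1Line(*fields)


entry = parse_pminus1("Pminus1=1,2,48701273,-1,570000,570000,69")
print(entry.n, entry.b1, entry.b2, entry.stage1_only)
```

Entries with B1 equal to B2, such as 48701273 and 50077721 in the list, only exercise the first stage, so the remaining entries are the ones that test the second stage as well.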

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.