mersenneforum.org  

Old 2021-07-01, 09:24   #1
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7×829 Posts
Default Production (wavefront) P-1

Could we get some more volunteers performing wavefront P-1 to good bounds?
The P-1 crowd are not keeping up with demand, so the P-1 wavefront is essentially the low end of the PRP wavefront, with PRP testers doing a lot of P-1 too. Sometimes to good bounds, sometimes not.
The work distribution map https://www.mersenne.org/primenet/ currently shows 687 assignments in 104M-105M.
175 (25.5%) of those are mine; so 512 others'.
A little more help with P-1 please?
Old 2021-07-01, 09:49   #2
LaurV
Romulan Interpreter
"name field"
Jun 2011
Thailand

9,787 Posts
Default

The number of assignments is not what matters; what matters is how fast you go through them. You may have a hundred assignments and turn in one every hour, or you may have 5 and turn in one every 20 minutes.
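To put numbers on that comparison, here is a minimal sketch. The function name is made up for illustration; the two turnaround times are the ones from the sentence above.

```python
def results_per_day(minutes_per_result: float) -> float:
    """Completed assignments per day for a given turnaround time in minutes."""
    return (24 * 60) / minutes_per_result


# 100 assignments held, one result every hour
slow = results_per_day(60)
# 5 assignments held, one result every 20 minutes
fast = results_per_day(20)

print(f"one per hour:       {slow:.0f} results/day")
print(f"one per 20 minutes: {fast:.0f} results/day")
```

So the holder of 5 assignments clears three times as many exponents per day as the holder of 100, which is why the reservation count alone says little about throughput.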

Anyhow, nitpicking aside, that's a good call and I may join you after I finish the current task (1-2 more days to go), but keep in mind that a lot of P-1 work does not appear as reserved P-1, because new versions of the newer programs do the PRP and P-1 at the same time. So there may be more activity going on there than is visible from the reservation chart.


Edit: also, low category work is not easy to get, I tried and got 106M, 107M, and 110M. Maybe we can corrupt Chris to offer P-1 work for our Colab workers?

Last fiddled with by LaurV on 2021-07-01 at 10:07
Old 2021-07-01, 15:12   #3
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

13253₈ Posts
Default

My prime95 P-1 workers report results immediately via the PrimeNet API and run at a variety of speeds: some take 5 hours, some 14 hours, etc., for ~104M P-1 to good bounds, with RAM settings that are liberal but still reasonable for system responsiveness.
Radeon VIIs and other manual assignments usually get reported daily. A Radeon VII can churn through ~40/day. So yes, there are large disparities in speed.
Number of assignments was the measure available.

Here's an example of how quickly a completed P-1 gets gobbled up: 104614871 waited over 3 months for P-1, then went through PRP assignment and completion and into CERT in 2 days.

P-1 of higher Cat #s is helpful too; it prevents them from getting to Cat 1 or 0 without P-1 done, so it pays off a little later.

Last fiddled with by kriesel on 2021-07-01 at 15:37 Reason: edit long url to [M] form / test links always / don't post exponents before breakfast?
Old 2021-07-01, 15:17   #4
LaurV
Romulan Interpreter
"name field"
Jun 2011
Thailand

9,787 Posts
Default

Quote:
Originally Posted by kriesel
Here's an example of how quickly a completed P-1 gets gobbled up. 10461487
That sounds like a 10M exponent to me
Old 2021-07-01, 21:43   #5
S485122
"Jacob"
Sep 2006
Brussels, Belgium

1745₁₀ Posts
Default

If the current software (I'm too lazy to check right now) does P+1 at the same time, P-1 as a work type is only useful if the exponent gets picked up by an instance that still uses LL and not PRP.

A solution could be to reserve the "P-1'ed" exponents for the LL work requests.

Perhaps P-1 would be more beneficial on the exponents needing a double check, although the fact that only one LL test would be saved implies lowish bounds, so the factoring buffs would redo the factoring attempt with higher bounds.

Ever since P-1 became available as an independent work type, the bounds should have been computed differently for that work type. For P-1 as an independent work type, the bounds should be based on the mean of PrimeNet CPUs, and the work should be available only to workers with sufficient memory. At the moment the bounds are based on the individual machine doing the factoring attempt, as must be the case when an LL test is started on an exponent for which no prior P-1 has been done.

Jacob
Old 2021-07-01, 22:42   #6
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·829 Posts
Default

LL has been retired as a first time primality test, in favor of the PRP/GEC/proof/Cert sequence, saving about 1.02 times a full DC run.

And there's a case for PRP/GEC/proof/CERT being superior, in DC situations, too, even on past LL first test cases; about equally fast overall run, with far lower error rate. (Almost no PRP DC/TC or quad check runs, compared to 2%-4% LLTC cost on average. That's a ~3% speedup, for ~0.4% or less proof/CERT cost depending on proof power.)

PRP and P-1 may be performed in combination at reduced P-1 stage 1 cost, on gpuowl v7.x-y, requiring a GPU with sufficient DP performance to be worthwhile, and a sufficient level of OpenCL support in its driver (at least OpenCL 2.0 atomics).

Factoring runs intended as P+1 are separate and have a low yield for factors, so not advisable as preparation before PRP or LL first tests or DCs. Also in some cases P+1 attempts turn out to effectively have been P-1.

Since there may be some running mprime or prime95 or Mlucas versions not supporting PRP proof generation yet, the two-primality-test bounds target values are still in place. (PRP without proof generation and subsequently verified Certs would get PRP DC, often with proof generation & cert.)

Everyone, please finish upgrading to PRP proof capable software where practical, to help avoid the full cost of a DC.

Compared to running 2-tests-bounds P-1 for a subset of exponents that would be destined for LL, LLDC, LLTC, LLQC etc as needed, and 1-tests-bounds P-1 for a subset that would require PRP/GEC/proof assignment, running 2-tests-bounds P-1 on everything is slightly deeper than optimal, but that additional factoring effort can be considered a gift to the factor-hunters that will eventually follow. It does yield factors at a satisfying rate.
I recently ran 31 100Mdigit exponents that had had TF only to 76 bits, not the recommended 81-bit depth. Three of the 31 were eliminated, with 4 factors found by P-1 at the mersenne.ca GPU72-row bounds; only one of those factors would have been found by TF to 81 bits first. TF to 81 bits on the 28 P-1 no-factor survivors found no additional factors.

A practical consideration is that if the P-1 bounds run are inadequate, the run does not retire the P-1 task in the mersenne.org database. Running GPU72-row-bounds P-1 ahead of all PRP testers ensures that subsequent CPU or GPU PRP assignments will have had sufficient P-1 applied to clear the P-1 task in the database first, giving the later PRP tester just PRP to do, not P-1 first. (If sufficient memory is present and enabled on that PRP tester, whose identity is unknown at P-1 time, then exponents with no factor found get PRP.)

That practical consideration is the case for which this thread was created.

It's not the best choice of hardware and software, but it is possible to run suitable P-1 bounds for the current wavefront and somewhat higher, on CUDAPm1, on even ancient 1GB NVIDIA GPUs, by bumping the trailing tests-saved number in the assignment from 2 to 3. Or by specifying bounds and exponent on the command line.

Recommended sequence for wavefront testing is:
TF to recommended levels. (Since almost all TF is done on GPUs, to the GPU72 TF level.)
P-1 to recommended bounds. (Conservative position is to P-1 to the larger bounds of PrimeNet or GPU72 rows.)
PRP/GEC/proof and Cert.

No P+1, no ECM, no LL. (George has confirmed that P+1 factoring is not worthwhile at p~104M, and retired LL assignments.)

Except LL the heck out of anything that returns PRP result "probably prime" with a successful Cert, while maintaining confidentiality of the new likely prime discovery. Until the press release is out.
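A rough sketch of that sequence as a decision function, for anyone scripting their own work queue. The class, field names, and return strings are made up for illustration; nothing here is a PrimeNet or GPU72 API, and the confirming LL run on a "probably prime" result is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ExponentStatus:
    """Hypothetical bookkeeping record for one wavefront exponent."""
    tf_bits: int             # deepest TF bit level completed so far
    tf_target_bits: int      # recommended TF level (e.g. the GPU72 level)
    pm1_adequate: bool       # P-1 done to bounds that retire the P-1 task
    prp_proof_done: bool     # PRP with GEC finished and proof generated
    cert_done: bool          # proof verified by a Cert run
    factor_found: bool = False


def next_work(status: ExponentStatus) -> str:
    """Return the next work type in the order TF, P-1, PRP/GEC/proof, Cert."""
    if status.factor_found:
        return "none (factored)"
    if status.tf_bits < status.tf_target_bits:
        return "TF"
    if not status.pm1_adequate:
        return "P-1"
    if not status.prp_proof_done:
        return "PRP/GEC/proof"
    if not status.cert_done:
        return "Cert"
    return "none (complete)"


# Example: TF finished to the target level, P-1 still outstanding.
print(next_work(ExponentStatus(76, 76, False, False, False)))
```

Note that P+1, ECM, and LL never appear as return values, matching the "No P+1, no ECM, no LL" line above.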

Last fiddled with by kriesel on 2021-07-01 at 23:04
Old 2021-07-01, 22:54   #7
masser
Jul 2003
wear a mask

7·13·19 Posts
Default

Some potential P-1'ers might like to know what B1/B2 values are considered "good bounds" for production (wavefront) P-1. Could you provide some rough estimates here?

Maybe another way to ask my question: What should x be in the sentence, "Below B1 = x, you are wasting our time (someone will likely have to re-do it soon) by doing independent P-1 on a wavefront exponent."

Last fiddled with by masser on 2021-07-01 at 22:57
Old 2021-07-01, 23:28   #8
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·829 Posts
Default

Typically the recommended bounds are now ~B1=650,000, B2=24,000,000. I'm not sure at what exponent that changes again. But anyone can check at https://www.mersenne.ca/exponent/ followed by the exponent of interest, with nothing after it.
For example, https://www.mersenne.ca/exponent/104500391
Prime95/mprime performs an optimization computation at P-1 "PFactor" assignment start, and I trust George to have gotten it right, or right enough, providing the system has adequate ram present, and enough ram has been enabled in prime95/mprime by the user. Start with prime95's readme file "desirable" guidance and the system specs.
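For checking many exponents, the lookup URL can be built mechanically. A minimal sketch; the function name is made up, and the URL pattern is just the one from the example above.

```python
def exponent_url(exponent: int) -> str:
    """Build the mersenne.ca status page URL for a Mersenne exponent."""
    if exponent <= 0:
        raise ValueError("exponent must be a positive integer")
    return f"https://www.mersenne.ca/exponent/{exponent}"


print(exponent_url(104500391))
```

The page it points to lists the PrimeNet, GPU72, and actual bounds rows for that exponent.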

I would use the GPU72 row, not the PrimeNet row, and definitely not the actual row, currently B1=B2=935,000 in the example exponent above, which was a waste of time (B2=B1) with lower factor probability, only partly compensated by one additional bit of perhaps-excessive TF on that example exponent. (If it was on an extremely high TF/DP ratio GPU such as RTX20xx or RTX30xx or GTX16xx, it may have been justifiable TF.)

See also https://www.mersenneforum.org/showpo...9&postcount=20 which applies generally to prime hunting P-1, and has been available since November 2019; IIRC both RDS and R. Gerbicz were in agreement with the conclusions. Things that people sometimes claim are optimal are not. For P-1 factoring for prime hunting, do the whole job, well enough, once per exponent. Do not run stage1-only. Do not run inadequate bounds.

Getting the exact optimal bounds is not critical. As long as the P-1 task gets cleared, in the no-factor outcome, the slopes (partial derivatives vs. B1, or B2) of probable computation time near a local or global minimum will be small.
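One way to see why the slopes are small, sketched under an assumed cost model that is not taken from the linked post: let C be the cost of the P-1 run, p the probability that it finds a factor, n the number of primality tests saved by a factor, and T the cost of one primality test. All of these symbols are introduced here for illustration only.

```latex
% Assumed expected-cost model (illustrative, symbols defined in the lead-in)
E(B_1, B_2) = C(B_1, B_2) + \bigl(1 - p(B_1, B_2)\bigr)\, n\, T

% At an interior minimum (B_1^*, B_2^*) the first derivatives vanish
\frac{\partial E}{\partial B_1}\bigg|_{(B_1^*, B_2^*)}
  = \frac{\partial E}{\partial B_2}\bigg|_{(B_1^*, B_2^*)} = 0

% so a small deviation \delta = (\delta_1, \delta_2)^{\mathsf T} changes E
% only at second order, with H the Hessian of E at the minimum
E(B_1^* + \delta_1,\, B_2^* + \delta_2)
  \approx E(B_1^*, B_2^*) + \tfrac{1}{2}\, \delta^{\mathsf T} H\, \delta
```

Under that model, being somewhat off in B1 or B2 costs little, provided the bounds are still large enough to clear the P-1 task.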

I used to do gpuowl default B1=1M, B2=30M for lower exponents. But I can get through P-1 of ~35% more exponents per day on the same GPU following the mersenne.ca guidance.

Last fiddled with by kriesel on 2021-07-01 at 23:36
Old 2021-07-01, 23:44   #9
S485122
"Jacob"
Sep 2006
Brussels, Belgium

1745₁₀ Posts
Default

I made a mistake: what I remember having read is that there was a possibility of doing stage 1 P-1 factoring concurrently with the PRP test, at a much lower cost than doing them separately. The latest readme file doesn't mention this and I couldn't find any relevant posts anymore, neither in Ken's reference thread nor via a forum or Internet search. Since I can't find the basis for my argument about P-1 bounds, my message becomes a bit irrelevant.

In other words I should have done what I admonished others to do : not be lazy and research first before posting.

Sorry about that :-(

Jacob
Old 2021-07-02, 00:29   #10
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

16AB₁₆ Posts
Default

Quote:
Originally Posted by S485122
what I remember having read is that there was a possibility of doing stage 1 P-1 factoring concurrently with the PRP test and at a much lower cost than doing them separately.
In Gpuowl 7.x yes, available now, and maybe coming someday in prime95/mprime.
Old 2021-07-02, 01:59   #11
LaurV
Romulan Interpreter
"name field"
Jun 2011
Thailand

9,787 Posts
Default

Quote:
Originally Posted by S485122
I remember <...> that there was a possibility of doing stage 1 P-1 factoring concurrently with the PRP test and at a much lower cost than doing them separately.
That is what I was talking about, in post #2 above.

@kriesel: man, please try to keep your posts shorter. If you really feel compelled to write, then write the full post, read it again, and delete about half of it, the half you think is less important than the other half. Don't take it personally, some of your posts are really REALLY good, like post #6 above, but they are TOO LONG. People do not have the patience to read them.

And yeah, I know I am the wrong person to give such advice, being a long-speech guy myself, and talking rubbish most of the time, but trust me here.
(this is personal opinion, I (as a mod) didn't get any complaint recently about the size or content of your posts, but I have to mobilize myself all the time to go through the whole content of some of your posts - and I am a patient reader, I like to read, and I understand the involved math).

For post #6, the only additions I would make are marked [added: ...] below; as the exponents get higher, the following sequence becomes more efficient:

Quote:
Originally Posted by kriesel
Recommended sequence for wavefront testing is:
TF to recommended levels [added: or to recommended_level_minus_1, depending on the exponent size and the GPU card you have]. (Since almost all TF is done on GPUs, to the GPU72 TF level.)
P-1 to recommended bounds. (Conservative position is to P-1 to the larger bounds of PrimeNet or GPU72 rows.)
[added: TF last bit (this is either the recommended level, or the recommended_level_plus_1, see the observation above).]

PRP/GEC/proof and Cert.

No P+1, no ECM, no LL. (George has confirmed that P+1 factoring is not worthwhile at p~104M, and retired LL assignments.)

[added: LL should be done only by a few selected persons, to confirm a prime found by PRP. (I formulated this more clearly.)]


That is because as the exponents go higher (PRP takes longer), and as most new cards can do some FFT work (i.e. P-1), doing P-1 will find factors faster than doing the last TF bit level. Well... now there is a long discussion about the fact that P-1 gets combined with PRP, which would mean that you have to stop your PRP test after the P-1 is done, do the last TF bit, and then continue the PRP. This is freaking inconvenient, and almost nobody is doing it, but it is still the most profitable way. The "intensive_done_P-1" exponents should only be PRP-ed with non-gpuOwl software, as gpuOwl will do its own P-1 at "saved cost". But in the end, everybody should do whatever floats their boat...
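The ordering argument can be sketched as "run the step with the best factor probability per unit cost first". The function names are made up, the steps are treated as independent, and the probabilities and costs in the example are placeholders invented for illustration, not measured values for any exponent or GPU.

```python
from typing import List, Tuple

# (name, probability of finding a factor, cost in arbitrary time units)
Step = Tuple[str, float, float]


def order_by_yield(steps: List[Step]) -> List[Step]:
    """Sort factoring steps by probability per unit cost, best first."""
    return sorted(steps, key=lambda s: s[1] / s[2], reverse=True)


def expected_cost(steps: List[Step]) -> float:
    """Expected factoring cost when a found factor stops all later steps."""
    total = 0.0
    p_no_factor_yet = 1.0
    for _name, prob, cost in steps:
        total += p_no_factor_yet * cost      # step runs only if no factor so far
        p_no_factor_yet *= 1.0 - prob
    return total


# Placeholder numbers only (assumption): P-1 has the better yield here.
steps = [("TF last bit", 0.012, 3.0), ("P-1", 0.04, 4.0)]
best = order_by_yield(steps)

print([name for name, _, _ in best])
print(f"TF first:  {expected_cost(steps):.3f}")
print(f"P-1 first: {expected_cost(best):.3f}")
```

The cost of the following PRP test is left out because, with independent steps, it is the same in either order. Whether P-1 really wins depends on the exponent and on the card's TF versus FFT throughput.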

Last fiddled with by LaurV on 2021-07-02 at 02:20

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.