2018-12-07, 00:46  #34  
Sep 2003
2×1,289 Posts 
Quote:
Like I said earlier, for any given Mersenne factor, you can figure out what B1 and B2 bounds would have found that factor by P−1 testing. Some large factors are easy to find by P−1, some small ones are hard or impossible to find by P−1, it all depends on the individual factor. Formulate a precise question and you might get a precise answer. 
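Working backward from a known factor is straightforward: if q divides 2^p−1, then q−1 is divisible by 2p, and P−1 finds q when, after removing one factor of p (which stage 1 includes automatically), every prime power in q−1 is at most B1, except possibly one prime between B1 and B2. A minimal sketch of that calculation (function names are mine for illustration, not from any GIMPS code):

```python
def factorize(n):
    """Trial-division factorization: returns {prime: multiplicity}."""
    f = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            f[d] = f.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def p1_bounds_needed(p, q):
    """Roughly minimal B1/B2 that would find factor q of 2^p - 1 by P-1.

    Stage 1 of P-1 on Mersenne numbers includes the exponent p itself,
    so one factor of p is removed from q - 1 before checking smoothness.
    """
    f = factorize(q - 1)
    f[p] -= 1                      # p always divides q - 1 for a factor of M(p)
    if f[p] == 0:
        del f[p]
    largest = max(f)               # largest remaining prime
    if f[largest] == 1 and len(f) > 1:
        # the single large prime can be picked up by stage 2
        b2 = largest
        b1 = max(r**e for r, e in f.items() if r != largest)
    else:
        # everything must be covered by stage 1
        b1 = b2 = max(r**e for r, e in f.items())
    return b1, b2
```

For example, for Cole's famous factor 193707721 of 2^67−1, q−1 = 2^3·3^3·5·67·2677, so the tiny bounds B1=27, B2=2677 would have sufficed; this is the sense in which some large factors are easy for P−1 while some small ones are impossible.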

2018-12-07, 12:42  #35  
"/X\(‘‘)/X\"
Jan 2013
Ͳօɾօղէօ
101011011100_{2} Posts 
Quote:


2018-12-07, 16:27  #36 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
13×263 Posts 
P-1 bounds determination
As far as I can determine, it's not primenet doing the B1, B2, d, e, or NRP determination and dictating it to the applications; it's most applications optimizing the bounds and other parameters, unless they are specified by the user, with the applications afterward telling primenet in the results record what parameters were selected and used.
The applications mprime, prime95, and CUDAPm1 (but not gpuowl v5.0's PRP-1), unless the user specifies otherwise, try to optimize the probable savings in total computing time for the exponent, based on probabilities of finding a P-1 factor computed over combinations of many B1 values and several B2 values, given how far the exponent has already been trial factored and how many primality tests a found factor would save.
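The quantity being optimized can be sketched as net expected time saved; this is my rough rendering of the idea, not prime95's actual code:

```python
def expected_savings(prob_factor, tests_saved, test_cost, p1_cost):
    """Net expected computing time saved by running P-1 with given bounds.

    prob_factor: chance these B1/B2 bounds find a factor
    tests_saved: primality tests avoided if a factor is found (e.g. 2)
    test_cost:   estimated run time of one primality test
    p1_cost:     estimated run time of the P-1 attempt itself
    """
    return prob_factor * tests_saved * test_cost - p1_cost
```

The application evaluates something like this over many candidate B1/B2 pairs and picks the pair with the largest expected savings; raising the bounds raises both prob_factor and p1_cost, so there is an interior optimum.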
From experiments with prime95 at somewhat larger exponents, it appears that the optimization calculation also occurs during prime95 Test, Status output generation, which shows considerable lag for P-1 work compared to other computation types. There appears to be no caching of previously computed optimal P-1 bounds. In my experience, prime95 status output without a stack of P-1 work assignments is essentially instantaneous, while the attached example takes 5 seconds, even immediately after a preceding one. With larger P-1 exponents or more P-1 assignments (deeper work caching, or more complete dedication of a system to P-1 work than the 1/4 in my example) I think that 5 seconds will increase. prime95.log: Code:
Got assignment [aid redacted]: P-1 M89787821
Sending expected completion date for M89787821: Dec 05 2018
...
[Thu Dec 06 09:17:24 2018 - ver 29.4]
Sending result to server: UID: Kriesel/emu, M89787821 completed P-1, B1=730000, B2=14782500, E=12, Wg4: 123E2311, AID: redacted
PrimeNet success code with additional info:
CPU credit is 7.3113 GHz-days.
Code:
Pfactor=[aid],1,2,89794319,1,76,2
It's there to read in the source code also. CUDAPm1 example: worktodo entry from a manual assignment:
Code:
PFactor=[aid],1,2,292000031,1,81,2
Code:
CUDAPm1 v0.20
DEVICE 1
name GeForce GTX 480
Compatibility 2.0
clockRate (MHz) 1401
memClockRate (MHz) 1848
totalGlobalMem zu
totalConstMem zu
l2CacheSize 786432
sharedMemPerBlock zu
regsPerBlock 32768
warpSize 32
memPitch zu
maxThreadsPerBlock 1024
maxThreadsPerMP 1536
multiProcessorCount 15
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 65535,65535,65535
textureAlignment zu
deviceOverlap 1
CUDA reports 1426M of 1536M GPU memory free.
Index 91
Using threads: norm1 256, mult 128, norm2 32.
Using up to 1408M GPU memory.
Selected B1=1830000, B2=9607500, 2.39% chance of finding a factor
Starting stage 1 P-1, M292000031, B1 = 1830000, B2 = 9607500, fft length = 16384K
gpuowl's PRP-1 implementation is a somewhat different approach, and requires user selection of B1. It defaults to B2=p but allows other B2 values to be specified by the user. See https://www.mersenneforum.org/showth...=22204&page=70, posts 765-767, for Preda's description of gpuowl v5.0 P-1 handling. (See posts 694-706 for his earlier B1-only development; https://www.mersenneforum.org/showth...=22204&page=64.) (Code authors are welcome to weigh in re any errors, omissions, nuances, etc.)
Last fiddled with by kriesel on 2018-12-07 at 16:47 
2018-12-08, 06:32  #37 
Jul 2018
2^{2}×7 Posts 
Personally, regarding TF vs. P-1: I find with my hardware that, in terms of maximizing d(probability of finding a factor)/dt, I should not TF to a higher level than around 74 bits. For exponents near 90M, a given one of my cards takes about half an hour to run through 73-74 bits, with success probability ~1.35%. That same card can do a P-1 with about a 3.6% probability of success (using whatever bounds the software defaults to) in an hour and a half, thrice the time. Going to 75 bits would be too much. So if I want to maximize my factors found per unit time in a range near 90M that has already been TF'd to 74 bits or more, then I should do P-1 work. In that sense, it's possible 76 bits is too high... on the other hand, my cards have a lot of memory, which probably pushes the TF/P-1 boundary down somewhat. But d(probability of success under default parameters given available memory)/d(available memory) is also not that big; I don't know enough yet about what the requirements are and how p(success) varies with B1 and B2 to say.
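Plugging in the numbers above, the per-hour comparison is just arithmetic (my sketch of penlu's figures; the 74-to-75 line is an assumption that each extra bit doubles the TF time and has factor probability roughly 1/bits):

```python
def rate_per_hour(prob, hours):
    # expected factors found per hour of work
    return prob / hours

tf_73_74 = rate_per_hour(0.0135, 0.5)   # TF from 73 to 74 bits: 2.7 %/hr
p1_run   = rate_per_hour(0.036, 1.5)    # P-1 with default bounds: 2.4 %/hr
tf_74_75 = rate_per_hour(1 / 75, 1.0)   # assumed: doubled time, prob ~ 1/75
```

So on this hardware TF to 74 bits narrowly beats P-1 per hour, but TF from 74 to 75 bits falls well behind it, which is the crossover being described.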
In terms of optimal work reduction, I think that how many factors TF might find that P-1 would miss is not as important as the per-time probability of finding a factor. I think you could treat this as a multi-armed bandit problem where each action is a pair (factoring method, device) that has some time cost and some factor-probability reward. It's somewhat complicated by the fact that failure to find a factor for a given exponent also returns a small amount of information ("no factors under 2^75"), which influences the future factor-probability estimate for a given (method, device) on that exponent. (Not that this makes the allocation problem easier, but there is a framework one could use to analyze it, at least...) Of course, optimal work reduction isn't the only metric; one might be interested in, e.g., maximizing coverage in a given range, in which case the best strategy might be different, but this modeling approach would probably still find it. One might also be interested in maximizing the rate of Mersenne prime yield, which might also involve admitting "LL" as an action. Hopefully the current "economic crossover point" analysis matches whatever this would come up with. Last fiddled with by penlu on 2018-12-08 at 07:17 
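A first cut at that allocation idea, before any bandit machinery, is pure greedy selection on expected factors per unit time (all names and numbers here are illustrative, and this deliberately ignores the information value of a failed attempt):

```python
def pick_action(actions):
    """Greedy choice among (label, probability_of_factor, hours_of_work).

    Maximizes expected factors found per hour; a bandit formulation
    would additionally track uncertainty in each probability estimate.
    """
    return max(actions, key=lambda a: a[1] / a[2])

choice = pick_action([
    ("TF 74->75", 0.0133, 1.0),          # assumed numbers, per the post above
    ("P-1 default bounds", 0.036, 1.5),
])
```

A proper bandit treatment would replace the fixed probabilities with posterior estimates updated after each success or failure, but the greedy rule already captures the "per-time probability" criterion.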
2018-12-09, 22:27  #38  
If I May
"Chris Halsall"
Sep 2002
Barbados
21170_{8} Posts 
Quote:
You are in a somewhat unique situation, in that you are willing and able to target your "firepower" optimally. Few are as focused on the optimality of deployment of cycles. Primenet and GPU72 are somewhat constrained as to what they can assign for optimal throughput, because each user tends to fetch only a single type of work for each piece of their kit. To put it on the table: we are currently overpowered with (GPU) TF'ing and (CPU and GPU) P-1'ing; we are years ahead of the LL'ers. What will come soon is the time to LL-TF to 77 "bits". But possibly only after a P-1 run. Any advice anyone has with regard to how to optimally manage this would be most welcomed. 
