20200516, 16:55  #1 
"Ed Hall"
Dec 2009
Adirondack Mtns
2^{3}×3^{2}×47 Posts 
Polynomial Selection Parameters Discussion
Note: This initial post is for reference only. The following posts are for discussion.
This thread is intended to allow discussion of the parameters used in Polynomial Selection for CADONFS. The following parameters are avaiable ("tasks.polyselect." preceeds each in the parameters files): Code:
degree threads P admin admax incr nrkeep adrange nq sopteffort ropteffort Extraction from CADONFS params.c90, explaining available parameters: Code:
########################################################################### # Polynomial selection task with Kleinjung's algorithm (2008) ########################################################################### tasks.polyselect.degree = 4 # degree of the algebraic polynomial #tasks.polyselect.threads = 2 # # How many threads to use per polyselect process tasks.polyselect.P = 10000 # choose lc(g) with two prime factors in [P,2P] # Setting a large value will most likely find better polynomials, # but the number of found polynomials will be smaller. # As a rule of thumb, we want at least 100 found polynomials in total # (without norm limitation, see below). # We must have P larger than primes in the SPECIAL_Q[] list. tasks.polyselect.admin = 0 # min value for leading coefficient of f(x) # If not set, the default is 0. # Note: admin is not necessarily a multiple of incr (defined below). tasks.polyselect.admax = 100e3 # max value for leading coefficient of f(x) # In fact the maximal value is admax1, so that we can run for example: # polyselect admin 1000 admax 2000 and polyselect admin 2000 admax 3000 # without duplicating the value 2000. tasks.polyselect.incr = 60 # increment for leading coefficient of f(x) # This factor is usually a smooth number, which forces projective roots in # the algebraic polynomial. 60 is a good start, 210 is popular as well. # Note: admin and admax are not necessarily multiples of incr, for # example polyselect incr 60 admin 0 admax 200 will deal with 60, 120, # and 180, while polyselect incr 60 admin 200 admax 400 will deal with # 240, 300, and 360. # The polynomial selection search time is proportional to the # length of the search interval, i.e., (admaxadmin)/incr. tasks.polyselect.nrkeep = 40 # number of polynomials kept in stage 1 of # polynomial selection (size optimization) tasks.polyselect.adrange = 5000 # size of individual tasks # Polynomial selection is split into several individual tasks. The # complete range from admin to admax has to be covered for the polynomial # selection to complete. The number of individual tasks is # (polsel_admaxpolsel_admin)/polsel_adrange. Each such task is issued as # a workunit to a slave for computation. tasks.polyselect.nq = 256 # Number of small prime combinations in the leading coefficient of the # linear polynomial. Safe to leave at the default value. # Recommended values are powers of the degree, e.g., 625 for degree 5, # or 1296 for degree 6. Here 256 = 4^4 thus the leading coefficient of # the linear polynomial will be q1*q2*q3*q4*p1*p2 where q1,q2,q3,q4 are # small primes, and P <= p1, p2 < 2*P. For a given pair (p1,p2), 256 # different combinations of q1,q2,q3,q4 will be tried. tasks.polyselect.sopteffort = 0 # Indicates how much effort is spent in stage 1 (size optimization) of the # polynomial selection. The default value is 0. The size optimization time # is roughly proportional to 1+sopteffort. tasks.polyselect.ropteffort = 0.2 # Indicates how much effort is spent in stage 2 (rootsieve) of the # polynomial selection. The number of sublattices considered is # proportional to ropteffort, and the rootsieve time is roughly # proportional to ropteffort. This is a floatingpoint value. 
20200516, 16:55  #2 
"Ed Hall"
Dec 2009
Adirondack Mtns
2^{3}·3^{2}·47 Posts 
Now, for my current questions:
 (degree) Is there a "ruleofthumb" for determining what degree?  (admin) Other than to eliminate duplication, is there a reason not to start searching at admin = 0?  (nrkeep) I see 200 for the default number to keep at 190 digit sizes. If there is ranking of polynomials done to determine which 200 to keep, is it really possible that one far down the initial ranking would turn out better than the rest later on? 
20200516, 18:17  #3 
"Ed Hall"
Dec 2009
Adirondack Mtns
2^{3}·3^{2}·47 Posts 
Perhaps I have found the answer, in that the "raw exp_E" score range appears to be rather narrow, resulting in a large number of candidates with identical scores.

20200516, 19:03  #4  
"Curtis"
Feb 2005
Riverside, CA
4398_{10} Posts 
Quote:
While using CADO, I've found the deg 4/5 cutoff to be just under 100 digits, and deg 5/6 to be around 215. For SNFS, poly search obviously doesn't matter, so theory predicts optimal degree more accurately. If I recall, 4/5 is around 110 digits, 5/6 205210, 6/7 340 or 350. Sorry that I'm too lazy to locate the formula for you! There are two reasons to set admin: 1. As you've discovered quite a few times, CADO sometimes gets "stuck" on a small c5 value, delaying poly select quite a lot. 2. As the size of the input rises, it becomes less and less likely that a verysmall c5 value will produce the best poly. Generally, this is because skew gets too high (at least, that's my layman's grasp of what goes on). For 120140 digit inputs, #1 has me setting admin in the 200010000 range to try to avoid those software hiccups. For 180210 digits, I set admin higher because of #2. How many polys to rootopt is a matter of time spent vs chance of reward. CADO has a better sizeopt ranking algo than msieve, so it's not necessary to rootopt hundreds of polys in CADO like we used to in msieve. That said, at large inputs like 190 digits the rootopt phase is a tiny fraction of sizeopt search time, so may as well rootopt a few hundred in case something unusual happens. On small inputs, say <130 digits, the rootopt step is a large fraction of sizeopt time; I've found I get better results by adding extra sizeopt effort and reducing the nrkeep to 48 or lower. The time is better spent generating more candidates than hoping a 50th place poly ends up best after rootopt. You can see this clearly if you mess with the settings on a C100; rootopt takes longer than sizeopt with default settings (if you change to deg 5, at least)! Doubling sizeopt time and halving nrkeep usually yields a better poly. CADO rootopts in blocks of 6 polys, so I make nrkeep a multiple of 6 nearly always. 

20200516, 19:08  #5 
Jun 2012
2872_{10} Posts 
I’ve struggled with this parameter  why hang on to hundreds of likely subpar polys that seemingly take forever to complete rootopt, instead just keep the “cream of crop”. But then if one cuts nrkeep to a much lower value (say 40) has the best poly potentially been filtered out before root opt? Still 200 seems like an awful lot of interim polys to carry along.

20200516, 19:27  #6  
Jun 2012
2^{3}×359 Posts 
Quote:
Last fiddled with by swellman on 20200516 at 19:28 

20200516, 22:59  #7 
"Ed Hall"
Dec 2009
Adirondack Mtns
6470_{8} Posts 
The description of P, says that larger values normally find better, but fewer, polynomials. If this is the case, would raising P and letting the number of found polynomials fall below nrkeep make sense?
Is a test of yield a possible strategy, e.g, test a small min/max region for yield and then raise P and adjust the search region to produce the optimum number of polynomials? (The description suggests at least 100.) Is the returned count of found polynomials consistent, or does it increase/decrease, across a range? If you ran two searches across the same region with vastly different P, would they find the same polynomials, with the larger P finding fewer of them? Or, would the found polynomials be different? 
20200516, 23:04  #8 
"Ed Hall"
Dec 2009
Adirondack Mtns
110100111000_{2} Posts 

20200517, 02:33  #9  
"Curtis"
Feb 2005
Riverside, CA
2·3·733 Posts 
Quote:
choose lc(g) with two prime factors in [P,2P] So, if you double P (call it Q=2P), it's highly unlikely a polynomial will have this property in both [P,Q] and [Q,2Q]. I've changed P by 25% and had the same best polynomial pop out of the two runs; I've changed P by 2x and never (in just a handful of trials) had the same poly come out of both runs. Each adrange is producing hundreds of candidate polys; over the course of a poly select run you're finding thousands and keeping the best 60 or 90 or whatever. That doesn't mean you should inflate P so high that you only find a few hundred polys; things don't work that way. I've been selecting P values such that the lastplace Exp value (the one published during sizeopt) is a minimum. I figure that's the P that finds the most "good" poly candidates, while trying to optimize for 5th place or 1st place is more a matter of luck. 

20200517, 13:26  #10 
"Ed Hall"
Dec 2009
Adirondack Mtns
2^{3}·3^{2}·47 Posts 
Some Levity for a Sunday Morning
Apparently, I have some parameters a bit off. Actually, I overran a bunch of timedout WUs and am restarting a snapshot, but the ETAs are scary!
Code:
Info:Polynomial Selection (size optimized): Parsed 378 polynomials, added 0 to priority queue (has 200) Info:Polynomial Selection (size optimized): Marking workunit c190_polyselect1_456960458880#2 as ok (92.9% => ETA Wed Apr 19 00:12:46 4237) Code:
Info:Polynomial Selection (size optimized): Parsed 370 polynomials, added 1 to priority queue (has 200) Info:Polynomial Selection (size optimized): Worst polynomial in queue now has exp_E 53.980000 Info:Polynomial Selection (size optimized): Marking workunit c190_polyselect1_443520445440#2 as ok (97.9% => ETA Mon Dec 13 04:11:14 2066) On a more serious note, I do see that even with a doubling of P, there are still more poly candidates found in one WU than the size of nrkeep at 200. 
20200517, 14:54  #11 
"Ed Hall"
Dec 2009
Adirondack Mtns
2^{3}·3^{2}·47 Posts 
Sorry to be so inquisitive, but I'm lost on something else:
Code:
Parsed 401 polynomials, added 327 to priority queue (has 200) Worst polynomial in queue now has exp_E 56.820000 . . . Parsed 367 polynomials, added 45 to priority queue (has 200) Worst polynomial in queue now has exp_E 55.920000 . . . Parsed 376 polynomials, added 9 to priority queue (has 200) Worst polynomial in queue now has exp_E 55.320000 Last fiddled with by EdH on 20200517 at 14:58 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Polynomial selection  Max0526  NFS@Home  9  20170520 08:57 
Improved NFS polynomial selection  jasonp  Operation Kibibit  5  20140907 11:02 
SNFS Polynomial selection help?  mhill12  Factoring  59  20130909 22:40 
2^8771 polynomial selection  fivemack  Factoring  47  20090616 00:24 
Polynomial selection  CRGreathouse  Factoring  2  20090525 07:55 