2022-04-01, 19:47 #1
"Ed Hall"
Dec 2009
Adirondack Mtns
11×443 Posts 
CADO-NFS Data Harvesting
This is mostly a question for VBCurtis:
If I'm only running CADONFS through sieving, are there any bits of timing data vs. composite sizes you may be interested in? I won't be able to flag whether I'm using a params file modified by you or me, or an original, but I can probably harvest certain details contained in the log file, or even from the snapshot. This is the CADONFS portion of a normal run for my scripts:  CADONFS is called by a script and given a few standard items. The rest are supplied by the params files.  CADONFS performs the Polyselect and Lattice Sieving via server/clients.  A side script is started (depending on the composite being >125 digits) that examines the relations and performs Msieve filtering until a matrix can be built.   Once successful, CADONFS is told to shut down.    The shutdown may occur anywhere from Lattice Sieving to LA.  If the composite is <125 digits, CADONFS completes the factorization. In light of the fact the process may or may not get to/through filtering, is there info that would be of value to gather? 
2022-04-01, 23:30 #2
"Curtis"
Feb 2005
Riverside, CA
2^{5}·3^{2}·19 Posts 
Yes, if you can connect the timing data to the params used for the job and to the composite size (including first digit).
Also, small jobs make it easy to collect data, and I think I'm done with params below 135-140 digits. So, maybe only for 150+ digit jobs? That data can be used to build a bit of a size-vs-sieve-time curve, and any outliers mean "find better params for that size, please." If you'd like to do that, I'll be happy to review the data to see where I likely have suboptimal params. 
2022-04-02, 01:40 #3
"Ed Hall"
Dec 2009
Adirondack Mtns
11×443 Posts 
It's only for a c140, but how would this look for data:
Code:
N = 336... <140 digits>
tasks.I = 14
tasks.lim0 = 8800000
tasks.lim1 = 14400000
tasks.lpb0 = 30
tasks.lpb1 = 31
tasks.qmin = 900000
tasks.filter.target_density = 130.0
tasks.filter.purge.keep = 180
tasks.polyselect.P = 182500
tasks.polyselect.admax = 146e3
tasks.polyselect.admin = 1800
tasks.polyselect.degree = 5
tasks.polyselect.incr = 120
tasks.polyselect.nq = 15625
tasks.polyselect.nrkeep = 66
tasks.polyselect.ropteffort = 16
tasks.sieve.lambda0 = 1.82
tasks.sieve.lambda1 = 1.81
tasks.sieve.mfb0 = 56
tasks.sieve.mfb1 = 58
tasks.sieve.ncurves0 = 18
tasks.sieve.ncurves1 = 24
tasks.sieve.qrange = 10000
Polynomial Selection (root optimized): Total time: 6113.74
Polynomial Selection (root optimized): Rootsieve time: 6112.81
Generate Factor Base: Total cpu/real time for makefb: 22.71/1.43919
Lattice Sieving: Total number of relations: 100246494
Lattice Sieving: Total time: 322777s
Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 245.5s 
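For tabulating dumps like the one above across many jobs, a small sketch that splits the `key = value` pairs into a dict (the input string here is a trimmed stand-in, not the full dump):

```python
import re

# Sketch: pull "name = value" pairs out of a flattened params dump so
# that lims, lpbs and sizes can be compared per job.
PAIR = re.compile(r"([\w.]+)\s*=\s*(\S+)")

def parse_params(dump):
    """Return a dict of the 'key = value' pairs found in a dump string."""
    return dict(PAIR.findall(dump))

dump = "tasks.I = 14 tasks.lim0 = 8800000 tasks.lpb1 = 31"
params = parse_params(dump)
print(params["tasks.lim0"])  # '8800000'
```

Values stay as strings here; a real harvester would probably coerce the numeric ones before comparing runs.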
2022-04-02, 02:07 #4
Apr 2020
3^{2}·5·19 Posts 
Unhelpfully, while the total time for a completed job is given both in wall-clock time and in thread-time, this value for the sieving step is neither of those: it's client-time, so it should be about thread-time/2 unless you've changed threads-per-client from the default. Just something that needs to be kept in mind when making comparisons.
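The scaling described above can be made explicit; a minimal sketch, assuming the reported sieving total is client-time and the default of 2 threads per client:

```python
def sieve_thread_time(client_time_s, threads_per_client=2):
    """Approximate thread-time from CADO-NFS's reported sieving client-time.

    The sieving total in the log is client-time; with the default of
    2 threads per client, thread-time is about twice that value.
    """
    return client_time_s * threads_per_client

# e.g. 322777 s of client-time on 4-thread clients:
print(sieve_thread_time(322777, threads_per_client=4))  # 1291108
```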

2022-04-02, 02:50 #5
"Ed Hall"
Dec 2009
Adirondack Mtns
11×443 Posts 
Quote:
Also, would it be helpful if I translate that value into 89:39:37? 
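That figure is just the 322777 s sieving total rendered as hours:minutes:seconds; the conversion is a pair of divmods:

```python
def hms(seconds):
    """Format a whole-second count as H:MM:SS."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"

print(hms(322777))  # 89:39:37
```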

2022-04-02, 06:00 #6
"Curtis"
Feb 2005
Riverside, CA
1560_{16} Posts 
A note of "4 threads per client" is enough; I can double the time if I compare to my own machines, or just leave it as-is when comparing to other runs of yours.
I do think you should have everything run 4-threaded so that a mix of 2-threaded and 4-threaded clients doesn't dirty the data. Polyselect params aren't really of interest, but poly score is. Alas, CADO's score report is only comparable to other polys that use the exact same siever & lims, which is annoying. Poly select time is useful, as I am still not convinced I'm doing the right amount of poly select relative to sieve time!
You may be using some older params files from before I learned that the ratio of last Q sieved to starting Q should be no more than 8. If you notice any final Q that's much higher than 8x the starting Q for that params file, boost the starting Q accordingly. I'm seeing the best performance with this ratio around 6 for C140+ jobs; a bit higher ratio works fine for smaller jobs. 
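The Q-ratio check above is easy to script; a sketch, where the 8x threshold comes from this post and the parameter names are just illustrative:

```python
def q_ratio_ok(qmin, q_final, max_ratio=8):
    """True if the last special-q sieved stays within max_ratio of qmin."""
    return q_final <= max_ratio * qmin

# The c140 params in this thread start at qmin = 900000, so sieving
# much past Q = 7.2M would suggest raising qmin in that params file.
print(q_ratio_ok(900_000, 7_100_000))  # True
print(q_ratio_ok(900_000, 9_000_000))  # False
```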
2022-04-02, 12:52 #7
"Ed Hall"
Dec 2009
Adirondack Mtns
4873_{10} Posts 
I was also thinking a note about clients would be better, since then you know. If I simply adjusted the value, we'd never be sure. And I can leave out the 2-thread client easily enough, although I wonder about the actual effect of one 2-thread client alongside 57-79 four-thread clients. Then again, what's its contribution among the set? I doubt it would be missed.
Should I add in the full polynomial? There are at least two different polynomials (of three) in my current sample. I would expect the final one to be the one used. I should be able to harvest that. (I'm pretty sure.) I could add in a cownoise score.
I'm currently running a c160 that is supposed to finish tonight. I can start using it as my sample instead of the current c140. If I understand correctly, I can drop all tasks.polyselect values. Is there interest in the separate Rootsieve time? What about the Factor Base value?
ETA: If I achieve my goal of having CADO-NFS continue sieving until I tell it to stop, there will be no filtering time. I will probably remove that item from my data list.
Last fiddled with by EdH on 2022-04-02 at 13:34 
2022-04-02, 15:12 #8
"Curtis"
Feb 2005
Riverside, CA
2^{5}×3^{2}×19 Posts 
If there's a cownoise score, the actual poly has no use in the data summary.
I agree that 1-2 two-threaded instances won't color the data from a 50+ client farm! 
2022-04-02, 15:34 #9
"Ed Hall"
Dec 2009
Adirondack Mtns
4873_{10} Posts 
Here's another run against the c140:
Code:
N = 336... <140 digits>
tasks.I = 14
tasks.lim0 = 8800000
tasks.lim1 = 14400000
tasks.lpb0 = 30
tasks.lpb1 = 31
tasks.qmin = 900000
tasks.filter.target_density = 130.0
tasks.filter.purge.keep = 180
tasks.sieve.lambda0 = 1.82
tasks.sieve.lambda1 = 1.81
tasks.sieve.mfb0 = 56
tasks.sieve.mfb1 = 58
tasks.sieve.ncurves0 = 18
tasks.sieve.ncurves1 = 24
tasks.sieve.qrange = 10000
Polynomial Selection (root optimized): Total time: 6113.74
Lattice Sieving: Total number of relations: 100246494
Lattice Sieving: Total time: 322777s
(all clients used 4 threads)
Found 55764577 unique, 23084551 duplicate, and 0 bad relations.
cownoise Best MurphyE for polynomial is 2.51691527e-11 
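One derived figure worth logging from the "Found ... unique, ... duplicate" line is the duplicate rate; a minimal sketch using this run's counts:

```python
def duplicate_rate(unique, duplicate):
    """Fraction of unique+duplicate relations that were duplicates."""
    return duplicate / (unique + duplicate)

# Counts from the c140 run above:
rate = duplicate_rate(55_764_577, 23_084_551)
print(f"{rate:.1%}")  # 29.3%
```

A rising duplicate rate as Q climbs is one of the symptoms of the too-wide Q-range issue mentioned earlier in the thread, so it's a cheap signal to keep alongside the raw totals.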
2022-04-02, 15:54 #10
Apr 2020
3^{2}·5·19 Posts 

2022-04-02, 16:20 #11
"Ed Hall"
Dec 2009
Adirondack Mtns
11·443 Posts 
Quote:
Code:
N = 336... <140 digits>
tasks.I = 14
tasks.lim0 = 8800000
tasks.lim1 = 14400000
tasks.lpb0 = 30
tasks.lpb1 = 31
tasks.qmin = 900000
tasks.filter.target_density = 130.0
tasks.filter.purge.keep = 180
tasks.sieve.lambda0 = 1.82
tasks.sieve.lambda1 = 1.81
tasks.sieve.mfb0 = 56
tasks.sieve.mfb1 = 58
tasks.sieve.ncurves0 = 18
tasks.sieve.ncurves1 = 24
tasks.sieve.qrange = 10000
Polynomial Selection (size optimized): Total time: 29411.9
Polynomial Selection (root optimized): Total time: 6113.74
Lattice Sieving: Total number of relations: 100246494
Lattice Sieving: Total time: 322777s
(all clients used 4 threads)
Found 55764577 unique, 23084551 duplicate, and 0 bad relations.
cownoise Best MurphyE for polynomial is 2.51691527e-11 
