mersenneforum.org  

Old 2022-04-01, 19:47   #1
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

11×443 Posts
CADO-NFS Data Harvesting

This is mostly a question for VBCurtis:

If I'm only running CADO-NFS through sieving, are there any bits of timing data vs. composite sizes you may be interested in?

I won't be able to flag whether I'm using a params file modified by you or me, or an original, but I can probably harvest certain details contained in the log file, or even from the snapshot.

This is the CADO-NFS portion of a normal run for my scripts:
- CADO-NFS is called by a script and given a few standard items. The rest are supplied by the params files.
- CADO-NFS performs the Polyselect and Lattice Sieving via server/clients.
- A side script is started (only if the composite is >125 digits) that examines the relations and performs Msieve filtering until a matrix can be built.
- - Once successful, CADO-NFS is told to shut down.
- - - The shutdown may occur anywhere from Lattice Sieving to LA.
- If the composite is <125 digits, CADO-NFS completes the factorization.

Given that the process may or may not get to, or through, filtering, is there info that would be of value to gather?
Old 2022-04-01, 23:30   #2
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

2⁵·3²·19 Posts

Yes, if you can connect the timing data to the params used for the job and to the composite size (including first digit).

Also, data for small jobs is easy to collect, and I think I'm done with params below 135-140 digits. So, maybe only for 150+ digit jobs? That data can be used to build a bit of a size-vs-sievetime curve, and any outliers mean "find better params for that size, please."

If you'd like to do that, I'll be happy to review the data to see where I likely have suboptimal params.
Old 2022-04-02, 01:40   #3
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

11×443 Posts

It's only for a c140, but how would this look for data:
Code:
N = 336... <140 digits>
tasks.I = 14
tasks.lim0 = 8800000
tasks.lim1 = 14400000
tasks.lpb0 = 30
tasks.lpb1 = 31
tasks.qmin = 900000
tasks.filter.target_density = 130.0
tasks.filter.purge.keep = 180
tasks.polyselect.P = 182500
tasks.polyselect.admax = 146e3
tasks.polyselect.admin = 1800
tasks.polyselect.degree = 5
tasks.polyselect.incr = 120
tasks.polyselect.nq = 15625
tasks.polyselect.nrkeep = 66
tasks.polyselect.ropteffort = 16
tasks.sieve.lambda0 = 1.82
tasks.sieve.lambda1 = 1.81
tasks.sieve.mfb0 = 56
tasks.sieve.mfb1 = 58
tasks.sieve.ncurves0 = 18
tasks.sieve.ncurves1 = 24
tasks.sieve.qrange = 10000
Polynomial Selection (root optimized): Total time: 6113.74
Polynomial Selection (root optimized): Rootsieve time: 6112.81
Generate Factor Base: Total cpu/real time for makefb: 22.71/1.43919
Lattice Sieving: Total number of relations: 100246494
Lattice Sieving: Total time: 322777s
Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 245.5s
Anything missing or not really of interest?
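
For anyone wanting to collect the same fields automatically, here is a minimal Python sketch that pulls the `tasks.*` values and the "Total ..." lines out of a summary laid out like the listing above. The line layout is assumed from this excerpt only; a real CADO-NFS log may carry extra prefixes, and the names `harvest`, `PARAM_RE` and `STAT_RE` are made up for illustration.

```python
import re

# Sample lines copied from the listing above. The layout
# "<stage>: <label>: <value>" is an assumption based on this excerpt only.
SAMPLE = """\
tasks.I = 14
tasks.qmin = 900000
Polynomial Selection (root optimized): Total time: 6113.74
Lattice Sieving: Total number of relations: 100246494
Lattice Sieving: Total time: 322777s
"""

# "tasks.some.name = value"
PARAM_RE = re.compile(r"^(tasks\.[\w.]+)\s*=\s*(\S+)$")
# "<stage>: Total <something>: <number>" with an optional trailing "s"
STAT_RE = re.compile(r"^([^:]+): (Total [^:]+): ([\d.]+)s?$")


def harvest(text):
    """Return (params, stats) dictionaries for the lines that match."""
    params, stats = {}, {}
    for line in text.splitlines():
        line = line.strip()
        m = PARAM_RE.match(line)
        if m:
            params[m.group(1)] = m.group(2)  # kept as text, e.g. "146e3"
            continue
        m = STAT_RE.match(line)
        if m:
            stats[(m.group(1), m.group(2))] = float(m.group(3))
    return params, stats


params, stats = harvest(SAMPLE)
print(params["tasks.qmin"])
print(stats[("Lattice Sieving", "Total time")])
```

Lines that do not fit either pattern (such as the makefb line with its cpu/real pair) are simply skipped in this sketch.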
Old 2022-04-02, 02:07   #4
charybdis
 
Apr 2020

3²·5·19 Posts

Quote:
Originally Posted by EdH
Code:
Lattice Sieving: Total time: 322777s
Unhelpfully, while the total time for a completed job is given in wall-clock time and in thread-time, this value for the sieving step is neither of these: it's client-time, so should be about thread-time/2 unless you've changed threads-per-client from the default. Just something that needs to be kept in mind when making comparisons.
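
A rough Python sketch of that conversion, assuming client-time scales linearly with threads per client (the relationship is only "about" right, per the above). The function names are hypothetical, and the thread counts in the example calls are illustrative inputs rather than values read from a log.

```python
def client_to_thread_seconds(client_seconds, threads_per_client):
    """Approximate thread-time from the reported client-time."""
    return client_seconds * threads_per_client


def rescale_client_seconds(client_seconds, from_threads, to_threads):
    """Approximate the client-time a farm with a different
    threads-per-client setting would have reported for the same work."""
    return client_seconds * from_threads / to_threads


# 322777 is the sieving figure quoted above.
print(client_to_thread_seconds(322777, 2))   # default of 2 threads per client
print(rescale_client_seconds(322777, 4, 2))  # 4-thread clients vs 2-thread clients
```

Either way, the threads-per-client setting has to be recorded alongside the sieving time for the number to be comparable between farms.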
Old 2022-04-02, 02:50   #5
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

11×443 Posts

Quote:
Originally Posted by charybdis
Unhelpfully, while the total time for a completed job is given in wall-clock time and in thread-time, this value for the sieving step is neither of these: it's client-time, so should be about thread-time/2 unless you've changed threads-per-client from the default. Just something that needs to be kept in mind when making comparisons.
I'm a tiny bit confused, but all my clients are based on four threads, except an occasional two thread machine that is very rarely engaged. I could make that rarely into never without issue, since it is also my GPU machine and I'm normally doing other stuff with it. Are you saying I should divide this value by 4? Or, should I just note on that line, 4 threads per client?

Also, would it be helpful if I translate that value into 89:39:37?
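
For reference, the translation is just two divisions. A small Python sketch (the helper name `hms` is made up here):

```python
def hms(seconds):
    """Format a number of seconds as H:MM:SS (hours are not wrapped at 24)."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


print(hms(322777))  # 89:39:37
```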
Old 2022-04-02, 06:00   #6
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

1560₁₆ Posts

A note of 4 threads per client is enough; I can double the time if I compare to my own machines, or just leave it as-is when comparing to other runs of yours.
I do think you should have everything run 4-threaded so that a mix of 2-threaded and 4-threaded clients doesn't dirty the data.
Polyselect params aren't really of interest, but poly score is. Alas, CADO's score report is only comparable to other polys that use the exact same siever and lims, which is annoying. Poly select time is useful, as I am still not convinced I'm doing the right amount of poly select relative to sieve time!

You may be using some older params files from before I learned that the ratio of last Q sieved to starting-Q should be no more than 8. If you notice any final-Q that's much higher than 8x the starting-Q for that params file, boost starting-Q accordingly. I'm seeing best performance with this ratio around 6 for C140+ jobs; a somewhat higher ratio works fine for smaller jobs.
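
A quick way to flag such params files, sketched in Python. The function name and the final-Q values in the example are hypothetical (no final Q is reported in this thread); only `tasks.qmin = 900000` and the limit of 8 come from the posts above. The suggested starting-Q is only a first guess, since raising qmin will also move the Q at which sieving finishes.

```python
import math


def check_q_ratio(qmin, final_q, max_ratio=8.0):
    """Return (ratio, suggested_qmin).

    suggested_qmin equals qmin when the ratio is acceptable; otherwise it
    is the smallest qmin that would bring final_q / qmin down to max_ratio,
    holding final_q fixed (a simplification).
    """
    ratio = final_q / qmin
    if ratio <= max_ratio:
        return ratio, qmin
    return ratio, math.ceil(final_q / max_ratio)


# qmin from the params listing; both final-Q values are made up for illustration.
print(check_q_ratio(900000, 5400000))  # ratio 6.0, qmin left alone
print(check_q_ratio(900000, 9000000))  # ratio 10.0, suggests a higher qmin
```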
Old 2022-04-02, 12:52   #7
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

4873₁₀ Posts

I was also thinking a note about clients would be better, since then you know. If I simply adjusted the value, we'd never be sure. And, I can leave out the 2-thread machine easily enough, although I wonder about the actual effect of one 2-thread client alongside 57-79 four-thread clients. Then again, what's its contribution among the set? I doubt it would be missed.

Should I add in the full polynomial? There are at least two different polynomials (of three) in my current sample. I would expect the final one to be the one used. I should be able to harvest that.

(I'm pretty sure) I could add in a cownoise score.

I'm currently running a c160 that is supposed to finish tonight. I can start using it as my sample, instead of the current c140.

If I understand, I can drop all tasks.polyselect values.

Is there interest in the separate Rootsieve time?

What about the Factor Base value?

ETA: If I achieve my goal of having CADO-NFS continue sieving until I tell it to stop, there will be no filtering time. I will probably remove that item from my data list.

Last fiddled with by EdH on 2022-04-02 at 13:34
Old 2022-04-02, 15:12   #8
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

2⁵×3²×19 Posts

If there's a cownoise score, the actual poly is of no use to the data summary.
I agree that 1-2 two-threaded instances won't color the data from a 50+ client farm!
Old 2022-04-02, 15:34   #9
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

4873₁₀ Posts

Here's another run against the c140:
Code:
N = 336... <140 digits>
tasks.I = 14
tasks.lim0 = 8800000
tasks.lim1 = 14400000
tasks.lpb0 = 30
tasks.lpb1 = 31
tasks.qmin = 900000
tasks.filter.target_density = 130.0
tasks.filter.purge.keep = 180
tasks.sieve.lambda0 = 1.82
tasks.sieve.lambda1 = 1.81
tasks.sieve.mfb0 = 56
tasks.sieve.mfb1 = 58
tasks.sieve.ncurves0 = 18
tasks.sieve.ncurves1 = 24
tasks.sieve.qrange = 10000
Polynomial Selection (root optimized): Total time: 6113.74
Lattice Sieving: Total number of relations: 100246494
Lattice Sieving: Total time: 322777s (all clients used 4 threads)
Found 55764577 unique, 23084551 duplicate, and 0 bad relations.
cownoise Best MurphyE for polynomial is 2.51691527e-11
The discrepancy in relation counts is due to the filtering tests for Msieve running while CADO-NFS is still sieving. Would a duplication ratio be a less confusing value? Or, would you rather be able to look at the actual numbers as shown?
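
If a ratio is preferred, it could be computed from the "Found ..." line alone, taking duplicates as a fraction of the relations Msieve actually examined rather than of the CADO-NFS total. A small Python sketch (the function name is hypothetical; the two counts are the ones shown above):

```python
def duplicate_ratio(unique, duplicate):
    """Fraction of examined relations that were duplicates."""
    return duplicate / (unique + duplicate)


ratio = duplicate_ratio(55764577, 23084551)
print(f"{ratio:.1%}")  # roughly 29% of the examined relations
```

Using the Msieve counts as the denominator sidesteps the mismatch with the larger CADO-NFS relation total.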
Old 2022-04-02, 15:54   #10
charybdis
 
Apr 2020

3²·5·19 Posts

Quote:
Originally Posted by EdH
Is there interest in the separate Rootsieve time?
Surely there's not much point in reporting it unless you report the time for the rest of polyselect too?
Old 2022-04-02, 16:20   #11
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

11·443 Posts

Quote:
Originally Posted by charybdis
Surely there's not much point in reporting it unless you report the time for the rest of polyselect too?
Found it! Is this better?:
Code:
N = 336... <140 digits>
tasks.I = 14
tasks.lim0 = 8800000
tasks.lim1 = 14400000
tasks.lpb0 = 30
tasks.lpb1 = 31
tasks.qmin = 900000
tasks.filter.target_density = 130.0
tasks.filter.purge.keep = 180
tasks.sieve.lambda0 = 1.82
tasks.sieve.lambda1 = 1.81
tasks.sieve.mfb0 = 56
tasks.sieve.mfb1 = 58
tasks.sieve.ncurves0 = 18
tasks.sieve.ncurves1 = 24
tasks.sieve.qrange = 10000
Polynomial Selection (size optimized): Total time: 29411.9
Polynomial Selection (root optimized): Total time: 6113.74
Lattice Sieving: Total number of relations: 100246494
Lattice Sieving: Total time: 322777s (all clients used 4 threads)
Found 55764577 unique, 23084551 duplicate, and 0 bad relations.
cownoise Best MurphyE for polynomial is 2.51691527e-11
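
With both polyselect totals present, the share of poly select relative to sieving can be read off directly. A Python sketch using the three times listed above; it assumes, without confirmation from the log, that all three are reported in the same unit, so the result is only indicative. The function name is made up for illustration.

```python
def polyselect_to_sieve_ratio(size_opt_time, root_opt_time, sieve_time):
    """Total polynomial selection time divided by sieving time."""
    return (size_opt_time + root_opt_time) / sieve_time


# Times taken from the listing above (units assumed to match).
share = polyselect_to_sieve_ratio(29411.9, 6113.74, 322777)
print(f"{share:.1%}")
```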
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
