mersenneforum.org > Factoring Projects > Msieve
2020-10-21, 19:07   #1
swishzzz

Estimating number of relations needed

I am trying to run a 200-digit SNFS job for the first time. After sieving Q from 7.5M to ~15M I've collected about 33.5M relations against the estimated minimum of about 28.6M, but the job isn't getting past the filtering step: singleton removal reduces everything to just 3499 relations and 0 ideals after 17 passes. The duplicate rate is about 10% of the relations, similar to some smaller jobs I've finished.

Anyway, I would have expected the number of relations and ideals remaining after filtering to be much greater than what the log shows. I'm not sure whether the issue can be resolved just with more sieving, but if it can, is there a way to reliably estimate how many more relations I might need?

Full log is also attached.

Code:
Wed Oct 21 14:44:04 2020  Msieve v. 1.53 (SVN unknown)
Wed Oct 21 14:44:04 2020  random seeds: bd75a574 74f633da
Wed Oct 21 14:44:04 2020  factoring 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001023 (200 digits)
Wed Oct 21 14:44:05 2020  searching for 15-digit factors
Wed Oct 21 14:44:05 2020  commencing number field sieve (200-digit input)
Wed Oct 21 14:44:05 2020  R0: 5000000000000000000000000000000000000000
Wed Oct 21 14:44:05 2020  R1: -1
Wed Oct 21 14:44:05 2020  A0: 5115
Wed Oct 21 14:44:05 2020  A1: 0
Wed Oct 21 14:44:05 2020  A2: 0
Wed Oct 21 14:44:05 2020  A3: 0
Wed Oct 21 14:44:05 2020  A4: 0
Wed Oct 21 14:44:05 2020  A5: 16
Wed Oct 21 14:44:05 2020  skew 3.17, size 3.898e-14, alpha 0.504, combined = 1.155e-11 rroots = 1
Wed Oct 21 14:44:05 2020  
Wed Oct 21 14:44:05 2020  commencing relation filtering
Wed Oct 21 14:44:05 2020  estimated available RAM is 7542.8 MB
Wed Oct 21 14:44:05 2020  commencing duplicate removal, pass 1
Wed Oct 21 14:47:13 2020  error -15 reading relation 30460330
Wed Oct 21 14:47:24 2020  error -15 reading relation 32331885
Wed Oct 21 14:47:25 2020  error -15 reading relation 32392947
Wed Oct 21 14:47:25 2020  error -11 reading relation 32423445
Wed Oct 21 14:47:30 2020  error -9 reading relation 33359915
Wed Oct 21 14:47:31 2020  skipped 5 relations with composite factors
Wed Oct 21 14:47:31 2020  found 3606403 hash collisions in 33483146 relations
Wed Oct 21 14:47:49 2020  added 75 free relations
Wed Oct 21 14:47:49 2020  commencing duplicate removal, pass 2
Wed Oct 21 14:48:00 2020  found 3078858 duplicates and 30404363 unique relations
Wed Oct 21 14:48:00 2020  memory use: 138.6 MB
Wed Oct 21 14:48:00 2020  reading ideals above 720000
Wed Oct 21 14:48:00 2020  commencing singleton removal, initial pass
Wed Oct 21 14:50:59 2020  memory use: 753.0 MB
Wed Oct 21 14:50:59 2020  reading all ideals from disk
Wed Oct 21 14:51:00 2020  memory use: 1037.9 MB
Wed Oct 21 14:51:02 2020  keeping 38289591 ideals with weight <= 200, target excess is 159943
Wed Oct 21 14:51:04 2020  commencing in-memory singleton removal
Wed Oct 21 14:51:06 2020  begin with 30404363 relations and 38289591 unique ideals
Wed Oct 21 14:51:12 2020  reduce to 3499 relations and 0 ideals in 17 passes
Wed Oct 21 14:51:12 2020  max relations containing the same ideal: 0
Wed Oct 21 14:51:12 2020  filtering wants 1000000 more relations
Attached Files
200_1023.log (108.0 KB)

Last fiddled with by LaurV on 2020-10-22 at 07:52 Reason: replaced quote tag with code tag due to aesthetics (very long number, difficult to read)
2020-10-21, 19:35   #2
Brownfox

I think you've just crossed a boundary into needing the 14e siever, which is why it seems to be taking longer.

It just needs more relations - probably 40-45 million, based on similar numbers that I've factored.
2020-10-21, 20:54   #3
VBCurtis

If the large-prime bound jumped up by 1 bit from your previous jobs, you'll need about 70% more relations than you did for the previous jobs, but not all that much more wall-clock time overall; more relations are found per unit time with a larger large-prime setting.

That's the lpbr and lpba setting in the .poly file.

SNFS has a bunch of moving parts; experience will show you the way. If you're using the factmsieve.py script, its target relation counts are not very accurate. As you gain experience you can edit the script so it doesn't bother trying to filter until it's in the ballpark of your previous jobs of that size. There is quite a bit of speed to be found in tweaking the settings in that script; YAFU has incorporated lots of such tweaks, and if you're on Linux then CADO is faster still, though it takes some experience to run for SNFS jobs.
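As a quick sketch of the rule of thumb above (the 1.7x-per-bit factor is just VBCurtis's ~70% figure; the function name and the example numbers are mine, not anything msieve or factmsieve.py computes):

```python
# Rule-of-thumb sketch: each +1 bit on the large-prime bound
# (lpbr/lpba in the .poly file) multiplies the raw-relation
# target by roughly 1.7.
def relation_target(base_rels, lpb_old, lpb_new, per_bit=1.7):
    """Scale a relation target that sufficed at lpb_old up to lpb_new."""
    return base_rels * per_bit ** (lpb_new - lpb_old)

# e.g. a hypothetical job that needed ~15M relations at lpb 27,
# redone with lpb 29:
print(f"{relation_target(15e6, 27, 29) / 1e6:.1f}M")
```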
2020-10-21, 21:43   #4
swishzzz

Thanks for the explanations everyone. This job did switch from the 13e to the 14e siever compared to the last one (156 digits), and the large-prime bounds actually jumped by two bits, from 2^27 to 2^29. I think I will just manually modify the relations estimate in the factmsieve.py script for now.
2020-10-22, 07:53   #5
LaurV

I edited the OP; see the edit note there for the reason. Please keep it in mind for the future.
2020-10-22, 16:17   #6
chris2be8

Quote:
Originally Posted by swishzzz View Post
Code:
Wed Oct 21 14:51:12 2020  reduce to 3499 relations and 0 ideals in 17 passes
That line implies you need at least 30% more relations. You might need a lot more but won't need significantly less than that.

If the number of relations, X, and the number of ideals, Y, are both reasonably large (at least 4 digits) and X < Y, then add 10% more relations plus (Y-X)*4; otherwise add 5%.

That's an empirical relationship (i.e. I don't know why it works), but it's useful.

Chris
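Chris's rule above, written out as a sketch (the function and variable names are mine, and reading the 10%/5% as percentages of the current unique-relation count is my interpretation of the rule):

```python
def extra_relations(unique_rels, X, Y):
    """Chris's empirical rule: after a filtering attempt that ends with
    X relations and Y ideals, estimate how many more relations to sieve.
    Assumes the 10%/5% applies to the unique-relation count."""
    if X >= 1000 and Y >= 1000 and X < Y:  # both "reasonably large"
        return 0.10 * unique_rels + 4 * (Y - X)
    return 0.05 * unique_rels

# With 35.9M uniques filtering down to 4.5M relations and 5.6M ideals:
print(f"sieve about {extra_relations(35.9e6, 4.5e6, 5.6e6) / 1e6:.1f}M more")
```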
2020-10-23, 01:30   #7
swishzzz

Quote:
Originally Posted by chris2be8 View Post
That line implies you need at least 30% more relations. You might need a lot more but won't need significantly less than that.

If the number of relations, X, and the number of ideals, Y, are both reasonably large (at least 4 digits) and X < Y then add 10%+(Y-X)*4 else add 5%.

That's an empirical relationship (ie I don't know why it works). But it's useful.

Chris

Interesting. I've just completed this factorization, and my numbers came in a little under what your formula would have suggested. I ended up needing 42.2M raw relations (about 26% more than in my first post). The first run that filtered down to a non-trivial data set was at 35.9M unique relations, leaving 4.5M relations and 5.6M ideals; your formula would have meant 8M more relations, or ~44M total.
2020-10-23, 16:10   #8
chris2be8

I worked out my guidelines on rather smaller SNFS jobs than yours, so I'm not too surprised you managed with fewer relations. A better guess would take the duplicate rate into account as well. It's only a rough guide, but a lot better than repeatedly trying to build a matrix until there are enough relations.

Chris
2021-03-31, 18:52   #9
swishzzz

I'm running two SNFS jobs similar to the one in the OP concurrently, and I'm noticing a rather massive discrepancy in yields between them:

Poly 1:

Code:
n: 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000040039
type: snfs
c5: 16
c0: 200195
Y1: -1
Y0: 5000000000000000000000000000000000000000
q0: 3300000
Poly 2:

Code:
n: 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000039669
type: snfs
c5: 16
c0: 198345
Y1: -1
Y0: 5000000000000000000000000000000000000000
q0: 3300000
So: same size and essentially the same skew, since the two polys differ only slightly in the constant term. Sieving from Q = 3300k to 3400k yielded about 430k relations for poly 1 but only 290k for poly 2. Poly 1 built a 20% smaller-than-usual matrix with ~37.4M unique relations (so it could perhaps have been built with 35-36M), while poly 2 barely built a matrix with 38.5M uniques. With slower sieving plus more relations needed, I'd estimate poly 2 to be at least 4 digits more difficult than poly 1, which is somewhat surprising.

Am I correct in understanding that this is because smooth relations are harder to find in the algebraic ring extension of Z that includes a root of poly 2 than in the one for poly 1? If so, is there a good way to estimate the "difficulty" of these polynomials without actually doing the sieving?

Last fiddled with by swishzzz on 2021-03-31 at 18:53 Reason: quote -> code
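One standard proxy for polynomial "difficulty" at fixed degree and size is the alpha score msieve printed for the OP's poly ("alpha 0.504" in the log): it measures how often the polynomial's values pick up small prime factors, with more negative alpha meaning better yield. Below is a crude approximation of my own (it ignores projective roots and ramified primes, and the prime bound of 2000 is arbitrary); the fuller answer to yield prediction is Murphy's E score, which also accounts for the size of the sieved values on both sides.

```python
import math

def small_primes(n):
    """Sieve of Eratosthenes up to n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def roots_mod_p(coeffs, p):
    """coeffs[i] is the coefficient of x**i; count roots of f mod p."""
    return sum(1 for x in range(p)
               if sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p == 0)

def alpha(coeffs, bound=2000):
    """Crude alpha approximation: expected log-contribution of primes
    below `bound`, ignoring projective roots and ramification."""
    return sum((1.0 / (p - 1) - roots_mod_p(coeffs, p) * p / (p * p - 1))
               * math.log(p) for p in small_primes(bound))

# The two polynomials from this post (coefficients c0..c5):
print(alpha([200195, 0, 0, 0, 0, 16]), alpha([198345, 0, 0, 0, 0, 16]))
```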
2021-03-31, 19:00   #10
VBCurtis

You didn't mention how much time the sieving takes; yield isn't really related to how fast relations are found, and comparing 290k relations from a Q-range to 430k relations doesn't tell you anything about difficulty.
Compare sec/rel in a variety of ranges, or compare total sieving time; yield is a red herring.
2021-03-31, 19:45   #11
swishzzz

I ran the two jobs on different boxes (8 cores for poly 1, 6 cores for poly 2), but here are some stats from various Q ranges, which show that poly 2 is quite a bit slower:

Quote:
Poly 1 sieved from Q = 3300k - 12400k
Poly 2 sieved from Q = 3300k - 18100k

Q = 3300k - 3400k
Poly 1: 430k rel, 46.1 rel/sec/core
Poly 2: 290k rel, 36.7 rel/sec/core

Q = 7000k - 7100k
Poly 1: 465k rel, 50.1 rel/sec/core
Poly 2: 312k rel, 38.3 rel/sec/core

Q = 11000k - 11100k
Poly 1: 456k rel, 47.2 rel/sec/core
Poly 2: 304k rel, 35.1 rel/sec/core

Q = 16000k - 16100k
Poly 2: 286k rel, 30.9 rel/sec/core
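Folding these speeds together with the unique-relation totals from the previous post gives a rough total-cost comparison (my own back-of-the-envelope: it uses unique-relation counts as a stand-in for raw sieving effort, one representative speed per poly, and ignores the differing duplicate rates):

```python
# Back-of-the-envelope core-time comparison from figures in this thread.
def core_hours(total_rels, rel_per_sec_per_core):
    return total_rels / rel_per_sec_per_core / 3600.0

poly1 = core_hours(37.4e6, 47.0)  # ~37.4M rels at ~47 rel/sec/core
poly2 = core_hours(38.5e6, 35.0)  # ~38.5M rels at ~35 rel/sec/core
print(f"poly1 ~{poly1:.0f} core-hours, poly2 ~{poly2:.0f} core-hours, "
      f"ratio {poly2 / poly1:.2f}")
```

By this estimate poly 2 costs roughly 40% more core-time, broadly consistent with swishzzz's impression that it is several digits harder.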