mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Msieve (https://www.mersenneforum.org/forumdisplay.php?f=83)
-   -   Estimating number of relations needed (https://www.mersenneforum.org/showthread.php?t=26112)

swishzzz 2020-10-21 19:07

Estimating number of relations needed
 
I am trying to run a 200-digit SNFS job for the first time. After sieving from Q=7.5M to ~15M I've collected about 33.5M relations, versus the estimated minimum of about 28.6M, but the job isn't getting past the filtering step: it reduces to just 3499 relations and 0 ideals after a number of passes, and the duplicate relations are about 10% of the total, which is similar to some smaller jobs I've finished.

I would have expected the number of relations and ideals after filtering to be much greater than what is indicated here. I'm not sure whether the issue can be resolved just with more sieving, but if it can, is there a way to reliably estimate how many more relations I might need?

Full log is also attached.

[code]
Wed Oct 21 14:44:04 2020 Msieve v. 1.53 (SVN unknown)
Wed Oct 21 14:44:04 2020 random seeds: bd75a574 74f633da
Wed Oct 21 14:44:04 2020 factoring 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001023 (200 digits)
Wed Oct 21 14:44:05 2020 searching for 15-digit factors
Wed Oct 21 14:44:05 2020 commencing number field sieve (200-digit input)
Wed Oct 21 14:44:05 2020 R0: 5000000000000000000000000000000000000000
Wed Oct 21 14:44:05 2020 R1: -1
Wed Oct 21 14:44:05 2020 A0: 5115
Wed Oct 21 14:44:05 2020 A1: 0
Wed Oct 21 14:44:05 2020 A2: 0
Wed Oct 21 14:44:05 2020 A3: 0
Wed Oct 21 14:44:05 2020 A4: 0
Wed Oct 21 14:44:05 2020 A5: 16
Wed Oct 21 14:44:05 2020 skew 3.17, size 3.898e-14, alpha 0.504, combined = 1.155e-11 rroots = 1
Wed Oct 21 14:44:05 2020
Wed Oct 21 14:44:05 2020 commencing relation filtering
Wed Oct 21 14:44:05 2020 estimated available RAM is 7542.8 MB
Wed Oct 21 14:44:05 2020 commencing duplicate removal, pass 1
Wed Oct 21 14:47:13 2020 error -15 reading relation 30460330
Wed Oct 21 14:47:24 2020 error -15 reading relation 32331885
Wed Oct 21 14:47:25 2020 error -15 reading relation 32392947
Wed Oct 21 14:47:25 2020 error -11 reading relation 32423445
Wed Oct 21 14:47:30 2020 error -9 reading relation 33359915
Wed Oct 21 14:47:31 2020 skipped 5 relations with composite factors
Wed Oct 21 14:47:31 2020 found 3606403 hash collisions in 33483146 relations
Wed Oct 21 14:47:49 2020 added 75 free relations
Wed Oct 21 14:47:49 2020 commencing duplicate removal, pass 2
Wed Oct 21 14:48:00 2020 found 3078858 duplicates and 30404363 unique relations
Wed Oct 21 14:48:00 2020 memory use: 138.6 MB
Wed Oct 21 14:48:00 2020 reading ideals above 720000
Wed Oct 21 14:48:00 2020 commencing singleton removal, initial pass
Wed Oct 21 14:50:59 2020 memory use: 753.0 MB
Wed Oct 21 14:50:59 2020 reading all ideals from disk
Wed Oct 21 14:51:00 2020 memory use: 1037.9 MB
Wed Oct 21 14:51:02 2020 keeping 38289591 ideals with weight <= 200, target excess is 159943
Wed Oct 21 14:51:04 2020 commencing in-memory singleton removal
Wed Oct 21 14:51:06 2020 begin with 30404363 relations and 38289591 unique ideals
Wed Oct 21 14:51:12 2020 reduce to 3499 relations and 0 ideals in 17 passes
Wed Oct 21 14:51:12 2020 max relations containing the same ideal: 0
Wed Oct 21 14:51:12 2020 filtering wants 1000000 more relations
[/code]

Brownfox 2020-10-21 19:35

I think you've just crossed the boundary into needing the 14e siever, which is why it seems to be taking longer.

It just needs more relations - probably 40-45 million, based on similar numbers that I've factored.

VBCurtis 2020-10-21 20:54

If the large-prime bound jumped up by one bit from your previous jobs, you'll need about 70% more relations than you did for those jobs, but not all that much more overall wall-clock time; more relations are found per unit time with a larger large-prime setting.

Those are the lpbr and lpba settings in the .poly file.

SNFS has a bunch of moving parts; experience will show you the way. If you're using the factmsieve.py script, the target relation counts are not very accurate. As you gain experience, you can edit the script so that it doesn't bother trying to filter until the count is in the ballpark of your previous jobs of that size. There is quite a bit of speed to be found in tweaking the settings in that script; YAFU has incorporated lots of such tweaks, and if you're on Linux then CADO is faster still, though it takes some experience to run for SNFS jobs.
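For a rough sense of where that ~70% per bit comes from: the supply of primes below the large-prime bound nearly doubles with each extra bit, and the filtering has to cover the large-prime ideals drawn from that pool. A back-of-envelope sketch (my own arithmetic, not factmsieve.py's tables):

[code]
# Back-of-envelope: pi(2^k) ~ 2^k / ln(2^k) nearly doubles per extra
# bit of lpbr/lpba, which is roughly why the relation target jumps so
# sharply.  Real targets grow somewhat more slowly, since not every
# prime below the bound actually appears in a relation.
from math import log

def approx_pi(bits):
    """Prime-counting estimate pi(2^bits) ~ 2^bits / ln(2^bits)."""
    return 2.0 ** bits / (bits * log(2))

for k in (27, 28, 29):
    print("lpb %d: ~%4.1fM primes per side, x%.2f vs %d bits"
          % (k, approx_pi(k) / 1e6, approx_pi(k) / approx_pi(k - 1), k - 1))
[/code]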

swishzzz 2020-10-21 21:43

Thanks for the explanations, everyone. This job did switch from the 13e to the 14e siever compared to the last one (156 digits), and the large prime bounds actually jumped by two bits, from 2^27 to 2^29. I think I will just manually modify the relations estimate in the factmsieve.py script for now.

LaurV 2020-10-22 07:53

I edited the OP; see there for the reason. Please keep it in mind for the future.

chris2be8 2020-10-22 16:17

[QUOTE=swishzzz;560589]
[code]
Wed Oct 21 14:51:12 2020 reduce to 3499 relations and 0 ideals in 17 passes
[/code][/QUOTE]

That line implies you need at least 30% more relations. You might need a lot more, but you won't need significantly fewer than that.

If the number of relations, X, and the number of ideals, Y, are both reasonably large (at least 4 digits) and X < Y, then add 10% plus (Y-X)*4 more relations; otherwise add 5%.

That's an empirical relationship (i.e. I don't know why it works), but it's useful. In code, it looks something like the sketch below.
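Just to illustrate (the function and the example numbers are mine, not from any script):

[code]
# Sketch of the rule of thumb above.  X and Y come from msieve's
# "reduce to X relations and Y ideals" line; raw_relations is the
# total sieved so far.
def extra_relations_needed(raw_relations, X, Y):
    if X >= 1000 and Y >= 1000 and X < Y:   # both "reasonably large"
        return 0.10 * raw_relations + 4 * (Y - X)
    return 0.05 * raw_relations             # otherwise add 5%

# Hypothetical example: 35M raw relations filtering down to 4.5M
# relations and 5.6M ideals -> about 7.9M more relations suggested.
print(extra_relations_needed(35.0e6, 4.5e6, 5.6e6) / 1e6)
[/code]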

Chris

swishzzz 2020-10-23 01:30

[QUOTE=chris2be8;560730]That line implies you need at least 30% more relations. You might need a lot more, but you won't need significantly fewer than that.

If the number of relations, X, and the number of ideals, Y, are both reasonably large (at least 4 digits) and X < Y, then add 10% plus (Y-X)*4 more relations; otherwise add 5%.

That's an empirical relationship (i.e. I don't know why it works), but it's useful.

Chris[/QUOTE]

Interesting. I've just completed this factorization, and my numbers came in a little lower than your formula would have suggested. I ended up needing 42.2M raw relations (about 26% more than in my first post); 35.9M was where filtering first reduced to a non-trivial result, 4.5M relations and 5.6M ideals, which by your formula would have meant 8M more relations, or ~44M total.

chris2be8 2020-10-23 16:10

I worked out my guidelines on rather smaller SNFS jobs than yours, so I'm not too surprised you managed with fewer relations. A better guess would take the duplicate rate into account as well, along the lines of the sketch below. It's only a rough guide, but it's a lot better than repeatedly trying to build a matrix until we have enough relations.
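For instance, folding the duplicate rate into a raw-relation target (a sketch; the numbers are pulled from earlier in this thread):

[code]
# Sketch: convert a unique-relation target into a raw-relation target
# by allowing for duplicates (illustrative, my own formula).
def raw_target(unique_target, dup_rate):
    return unique_target / (1.0 - dup_rate)

# e.g. at the ~10% duplicate rate seen in this thread, a target of
# 38M unique relations means sieving about 42.2M raw relations.
print(raw_target(38.0e6, 0.10) / 1e6)
[/code]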

Chris

swishzzz 2021-03-31 18:52

I'm concurrently running two SNFS jobs similar to the one in the OP, and I'm noticing a rather massive discrepancy in yields between them:

Poly 1:

[CODE]
n: 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000040039
type: snfs
c5: 16
c0: 200195
Y1: -1
Y0: 5000000000000000000000000000000000000000
q0: 3300000
[/CODE]

Poly 2:

[CODE]
n: 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000039669
type: snfs
c5: 16
c0: 198345
Y1: -1
Y0: 5000000000000000000000000000000000000000
q0: 3300000
[/CODE]

So: same size and essentially the same skew, since the two polys differ only slightly in the constant term. Sieving from q=3300k to 3400k yielded about 430k relations for poly 1 but only 290k for poly 2. Poly 1 built a matrix 20% smaller than usual from ~37.4M unique relations (so it could perhaps have been built with 35-36M), while poly 2 barely built a matrix with 38.5M uniques. With slower sieving plus more relations needed, I'd estimate poly 2 to be at least 4 digits more difficult than poly 1, which is somewhat surprising.

Am I correct in understanding that this is because smooth relations are harder to find in the algebraic ring extension of Z that includes a root of poly 2 than in the one for poly 1? And if so, is there a good way to estimate the "difficulty" of these polynomials without actually doing the sieving?

VBCurtis 2021-03-31 19:00

You didn't mention how much time the sieving takes; yield isn't really related to how fast relations are found, and comparing 290k relations from a Q-range to 430k relations doesn't tell you anything about difficulty.
Compare sec/rel over a variety of ranges, or compare total sieving time; yield is a red herring.
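Besides test sieving, one rough a-priori comparison is Murphy's alpha (the same alpha msieve prints in its poly score line): it estimates how much smoother the algebraic norms behave than random integers of the same size, based on how many roots the polynomial has modulo small primes; more negative is better. A minimal sketch (my own code, not msieve's scoring; it ignores projective roots and ramified primes):

[code]
# Minimal sketch of Murphy's alpha for comparing the two quintics:
#   alpha(f) ~ sum over small p of (1/(p-1) - n_p*p/(p^2-1)) * ln p,
# where n_p is the number of roots of f mod p.  Projective roots and
# ramified primes are ignored, so treat the output as a ballpark.
from math import log

def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(sieve[i * i::i]))
    return [p for p in range(2, n + 1) if sieve[p]]

def roots_mod_p(coeffs, p):
    """Count roots mod p of a poly given low-to-high coefficients."""
    return sum(1 for x in range(p)
               if sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p == 0)

def alpha(coeffs, bound=2000):
    return sum((1.0 / (p - 1) - roots_mod_p(coeffs, p) * p / (p * p - 1.0))
               * log(p) for p in primes_up_to(bound))

poly1 = [200195, 0, 0, 0, 0, 16]   # 16*x^5 + 200195
poly2 = [198345, 0, 0, 0, 0, 16]   # 16*x^5 + 198345
print("alpha(poly1) = %.3f" % alpha(poly1))
print("alpha(poly2) = %.3f" % alpha(poly2))
[/code]

Comparing the two outputs gives a quick read on whether poly 2's yield penalty already shows up at the small primes; test sieving, as above, remains the reliable check.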

swishzzz 2021-03-31 19:45

I ran the two jobs on different boxes (8 cores for poly 1 and 6 cores for poly 2), but here are some stats from various Q ranges, which show that poly 2 is quite a bit slower:

[QUOTE]
Poly 1 sieved from Q = 3300k - 12400k
Poly 2 sieved from Q = 3300k - 18100k

Q = 3300k - 3400k
Poly 1: 430k rel, 46.1 rel/sec/core
Poly 2: 290k rel, 36.7 rel/sec/core

Q = 7000k - 7100k
Poly 1: 465k rel, 50.1 rel/sec/core
Poly 2: 312k rel, 38.3 rel/sec/core

Q = 11000k - 11100k
Poly 1: 456k rel, 47.2 rel/sec/core
Poly 2: 304k rel, 35.1 rel/sec/core

Q = 16000k - 16100k
Poly 2: 286k rel, 30.9 rel/sec/core
[/QUOTE]

