mersenneforum.org Advice for large GNFS jobs?
 Register FAQ Search Today's Posts Mark Forums Read

2013-07-03, 17:34   #12
WraithX

Mar 2006

23×59 Posts

Quote:
 Originally Posted by VBCurtis t13 looks stronger than t29 or t32.
Oh? Can you explain why? I'm still very new to this and would like to understand.
Quote:
 Originally Posted by VBCurtis Do these parameter choices change for a sextic vs a quintic? Did you ever figure out if the stage 2 speed differences were 32 vs 64 bitness, or win vs linux?
I was testing all of those different parameters with a single quintic poly. This brings me back to my question 1) above. Should I find the best params for one poly, and then use those to find the best poly? Or should I use one set of params to find the best poly, and then find the best params for that winning poly? Also, should this procedure be done separately for deg-5 and deg-6 poly's?

I'll have to go back and look at my notes to see why I thought there was a speed difference. I think once I figured out how to correctly set up all the different -np1/nps/npr parameter choices, most of the speed differences disappeared. Although, as was explained by jasonp, there were still big speed differences when running -npr between deg-5 and deg-6, always.

@henryzz: Thanks for the info on the 96 bit choice. I thought it might be related to the software, but I wasn't sure. I used the lasieve4I16e siever for all of the above tests.

If anyone can help, I'd still very much like to know the answers to all of my questions in post #9.

 2013-07-03, 19:14 #13 VBCurtis     "Curtis" Feb 2005 Riverside, CA 104518 Posts When the large-prime bound goes up by 1 bit, the number of relations needed doubles. When one is increased by 1 bit but the other remains unchanged, the number of relations goes up by sqrt2. So, 33-33 has to find relations twice as fast as 32-32 to actually be equal, and 33-32 has to be 1.41x as fast. So, to compare, we double the sec/relation of all the 33/33 tests, and multiply all the 33/32 tests by 1.414. That's why I think #13 is best of these choices. From the GNFS parameters list, RSA200 used rlim180M/alim300M, and large-prime bounds of 35 bits (!!). This required 2700M relations. RSA-768 used rlim 200M/alim 1100M for 232 digits. So, it seems the group who factored these believes alim can be profitably increased compared to rlim. Note I have *no* experience, and am only parroting what I've read. The sheet also notes the 15/16e siever cutoff is just over 180 digits. Last fiddled with by VBCurtis on 2013-07-03 at 19:25 Reason: clarity, and RSA info
 2013-07-04, 16:21 #14 lorgix     Sep 2010 Scandinavia 11478 Posts On the topic of large GNFS jobs: How much non-ECC RAM makes sense if I intend to use it for LA? In my case it's a dual-CPU G34-board that can handle 128GB of non-ECC RAM. Would it be crazy to equip it with 64GB non-ECC RAM? Maybe non-ECC RAM is a bad idea altogether?
 2013-07-04, 16:25 #15 jasonp Tribal Bullet     Oct 2004 3·1,163 Posts Msieve's LA includes consistency checks every few minutes and writes checkpoint files every hour and on any interruption, so the worst that would happen is that an error (for whatever reason) is detected and you lose an hour's work. That being said, the last time we had this discussion ECC memory really cost only a tiny bit more than non-ECC, so to me it would be worth buying a little peace of mind. Last fiddled with by jasonp on 2013-07-04 at 16:26
2013-07-05, 14:49   #16
WraithX

Mar 2006

23·59 Posts

Batalov, jrk, frmky, fivemack, jasonp, and anyone else who may have worked on large gnfs factorizations, I'd really like to create some sort of reference/Q&A/FAQ to help guide newcomers who want to work on large gnfs factorizations [especially me, right now ]. If any of you could take even just one or two of the questions from post #9 from this thread (LINK) that would help out greatly.

Quote:
 Originally Posted by VBCurtis When the large-prime bound goes up by 1 bit, the number of relations needed doubles. When one is increased by 1 bit but the other remains unchanged, the number of relations goes up by sqrt2. So, 33-33 has to find relations twice as fast as 32-32 to actually be equal, and 33-32 has to be 1.41x as fast. So, to compare, we double the sec/relation of all the 33/33 tests, and multiply all the 33/32 tests by 1.414. That's why I think #13 is best of these choices. From the GNFS parameters list, RSA200 used rlim180M/alim300M, and large-prime bounds of 35 bits (!!). This required 2700M relations. RSA-768 used rlim 200M/alim 1100M for 232 digits. So, it seems the group who factored these believes alim can be profitably increased compared to rlim. Note I have *no* experience, and am only parroting what I've read. The sheet also notes the 15/16e siever cutoff is just over 180 digits.
Thank you VBCurtis for explaining this. So, to try to put this in equation form, the multiplier needed to compare relation yields from test jobs with different large prime bounds would be:
Multiplier = ${2}^{\frac{(LPBR-lpbr) + (LPBA-lpba)}{2}}$
Where:
LPBR is the larger rational large prime bound
lpbr is the smaller rational large prime bound
LPBA is the larger algebraic large prime bound
lpba is the smaller algebraic large prime bound
And then you multiply the yield from the smaller lpb job by the multiplier to compare that to the yield from the job with the larger lpb. Does this sound right?

Would this work for number of relations needed, also? ie the 35/35 job from RSA200 needed 2700M relations. Would a 32/33 job need:
${2}^{\frac{(35-32) + (35-33)}{2}} = {2}^{\frac{3+2}{2}} = {2}^{2.5} = {5.6568}$
2700/5.6568 = ~477M relations?
Also, is this total or unique relations?

So, for the lpb multiplier described above, it looks like I should probably use the results from my test18 since it has the best weighted yield. Here is a table that includes the weighted yield from all of my 36 tests:
Code:
*** Test *** rlim  alim  lpbr lpba mfbr mfba

*** t01 *** 200e6 200e6 32 32 64 96
total yield: 586, q=100001029 (2.49877 sec/rel) * 2.0000 = 1172
total yield: 434, q=500001001 (3.45742 sec/rel) * 2.0000 =  868

*** t02 *** 350e6 200e6 32 32 64 96
total yield: 640, q=100001029 (2.51675 sec/rel) * 2.0000 = 1280
total yield: 468, q=500001001 (3.50973 sec/rel) * 2.0000 =  936

*** t03 *** 500e6 200e6 32 32 64 96
total yield: 665, q=100001029 (2.61559 sec/rel) * 2.0000 = 1330
total yield: 480, q=500001001 (3.67786 sec/rel) * 2.0000 =  960

*** t04 *** 200e6 350e6 32 32 64 96
total yield: 586, q=100001029 (2.49873 sec/rel) * 2.0000 = 1172
total yield: 457, q=500001001 (3.63930 sec/rel) * 2.0000 =  914

*** t05 *** 350e6 350e6 32 32 64 96
total yield: 640, q=100001029 (2.51599 sec/rel) * 2.0000 = 1280
total yield: 492, q=500001001 (3.65579 sec/rel) * 2.0000 =  984

*** t06 *** 500e6 350e6 32 32 64 96
total yield: 665, q=100001029 (2.61557 sec/rel) * 2.0000 = 1330
total yield: 505, q=500001001 (3.81100 sec/rel) * 2.0000 = 1010

*** t07 *** 200e6 500e6 32 32 64 96
total yield: 586, q=100001029 (2.50044 sec/rel) * 2.0000 = 1172
total yield: 469, q=500001001 (3.81422 sec/rel) * 2.0000 =  938

*** t08 *** 350e6 500e6 32 32 64 96
total yield: 640, q=100001029 (2.51385 sec/rel) * 2.0000 = 1280
total yield: 505, q=500001001 (3.82250 sec/rel) * 2.0000 = 1010

*** t09 *** 500e6 500e6 32 32 64 96
total yield: 665, q=100001029 (2.61408 sec/rel) * 2.0000 = 1330
total yield: 519, q=500001001 (3.96526 sec/rel) * 2.0000 = 1038

*** t10 *** 200e6 200e6 32 33 64 96
total yield: 897, q=100001029 (1.63303 sec/rel) * 1.4142 = 1268
total yield: 703, q=500001001 (2.14675 sec/rel) * 1.4142 =  994

*** t11 *** 350e6 200e6 32 33 64 96
total yield: 993, q=100001029 (1.62162 sec/rel) * 1.4142 = 1404
total yield: 748, q=500001001 (2.19623 sec/rel) * 1.4142 = 1057

*** t12 *** 500e6 200e6 32 33 64 96
total yield: 1029, q=100001029 (1.69091 sec/rel) * 1.4142 = 1455
total yield: 771, q=500001001 (2.29173 sec/rel)  * 1.4142 = 1090

*** t13 *** 200e6 350e6 32 33 64 96
total yield: 897, q=100001029 (1.63289 sec/rel) * 1.4142 = 1268
total yield: 761, q=500001001 (2.18615 sec/rel) * 1.4142 = 1076

*** t14 *** 350e6 350e6 32 33 64 96
total yield: 993, q=100001029 (1.62174 sec/rel) * 1.4142 = 1404
total yield: 808, q=500001001 (2.22832 sec/rel) * 1.4142 = 1142

*** t15 *** 500e6 350e6 32 33 64 96
total yield: 1029, q=100001029 (1.69071 sec/rel) * 1.4142 = 1455
total yield: 833, q=500001001 (2.31128 sec/rel)  * 1.4142 = 1178

*** t16 *** 200e6 500e6 32 33 64 96
total yield: 897, q=100001029 (1.63459 sec/rel) * 1.4142 = 1268
total yield: 784, q=500001001 (2.28254 sec/rel) * 1.4142 = 1108

*** t17 *** 350e6 500e6 32 33 64 96
total yield: 993, q=100001029 (1.62053 sec/rel) * 1.4142 = 1404
total yield: 836, q=500001001 (2.31623 sec/rel) * 1.4142 = 1182

*** t18 *** 500e6 500e6 32 33 64 96
total yield: 1029, q=100001029 (1.68960 sec/rel) * 1.4142 = 1455
total yield: 862, q=500001001 (2.39455 sec/rel)  * 1.4142 = 1219

*** t19 *** 200e6 200e6 33 32 64 96
total yield: 783, q=100001029 (1.89119 sec/rel) * 1.4142 = 1107
total yield: 589, q=500001001 (2.57094 sec/rel) * 1.4142 =  832

*** t20 *** 350e6 200e6 33 32 64 96
total yield: 861, q=100001029 (1.89129 sec/rel) * 1.4142 = 1217
total yield: 644, q=500001001 (2.57473 sec/rel) * 1.4142 =  910

*** t21 *** 500e6 200e6 33 32 64 96
total yield: 900, q=100001029 (1.95379 sec/rel) * 1.4142 = 1272
total yield: 663, q=500001001 (2.68922 sec/rel) * 1.4142 =  937

*** t22 *** 200e6 350e6 33 32 64 96
total yield: 783, q=100001029 (1.89126 sec/rel) * 1.4142 = 1107
total yield: 621, q=500001001 (2.70439 sec/rel) * 1.4142 =  878

*** t23 *** 350e6 350e6 33 32 64 96
total yield: 861, q=100001029 (1.89104 sec/rel) * 1.4142 = 1217
total yield: 680, q=500001001 (2.67079 sec/rel) * 1.4142 =  961

*** t24 *** 500e6 350e6 33 32 64 96
total yield: 900, q=100001029 (1.95429 sec/rel) * 1.4142 = 1272
total yield: 702, q=500001001 (2.76947 sec/rel) * 1.4142 =  992

*** t25 *** 200e6 500e6 33 32 64 96
total yield: 783, q=100001029 (1.89111 sec/rel) * 1.4142 = 1107
total yield: 638, q=500001001 (2.83263 sec/rel) * 1.4142 =  902

*** t26 *** 350e6 500e6 33 32 64 96
total yield: 861, q=100001029 (1.89246 sec/rel) * 1.4142 = 1217
total yield: 698, q=500001001 (2.79481 sec/rel) * 1.4142 =  987

*** t27 *** 500e6 500e6 33 32 64 96
total yield: 900, q=100001029 (1.95448 sec/rel) * 1.4142 = 1272
total yield: 721, q=500001001 (2.88571 sec/rel) * 1.4142 = 1019

*** t28 *** 200e6 200e6 33 33 64 96
total yield: 1183, q=100001029 (1.25192 sec/rel) * 1.0000 = 1183
total yield: 945, q=500001001 (1.60321 sec/rel)  * 1.0000 =  945

*** t29 *** 350e6 200e6 33 33 64 96
total yield: 1310, q=100001029 (1.24225 sec/rel) * 1.0000 = 1310
total yield: 1023, q=500001001 (1.62118 sec/rel) * 1.0000 = 1023

*** t30 *** 500e6 200e6 33 33 64 96
total yield: 1372, q=100001029 (1.28140 sec/rel) * 1.0000 = 1372
total yield: 1056, q=500001001 (1.68797 sec/rel) * 1.0000 = 1056

*** t31 *** 200e6 350e6 33 33 64 96
total yield: 1183, q=100001029 (1.25142 sec/rel) * 1.0000 = 1183
total yield: 1017, q=500001001 (1.65128 sec/rel) * 1.0000 = 1017

*** t32 *** 350e6 350e6 33 33 64 96
total yield: 1310, q=100001029 (1.24304 sec/rel) * 1.0000 = 1310
total yield: 1102, q=500001001 (1.64791 sec/rel) * 1.0000 = 1102

*** t33 *** 500e6 350e6 33 33 64 96
total yield: 1372, q=100001029 (1.28214 sec/rel) * 1.0000 = 1372
total yield: 1141, q=500001001 (1.70360 sec/rel) * 1.0000 = 1141

*** t34 *** 200e6 500e6 33 33 64 96
total yield: 1183, q=100001029 (1.25128 sec/rel) * 1.0000 = 1183
total yield: 1048, q=500001001 (1.72428 sec/rel) * 1.0000 = 1048

*** t35 *** 350e6 500e6 33 33 64 96
total yield: 1310, q=100001029 (1.24330 sec/rel) * 1.0000 = 1310
total yield: 1138, q=500001001 (1.71471 sec/rel) * 1.0000 = 1138

*** t36 *** 500e6 500e6 33 33 64 96
total yield: 1372, q=100001029 (1.28211 sec/rel) * 1.0000 = 1372
total yield: 1178, q=500001001 (1.76656 sec/rel) * 1.0000 = 1178

2013-07-05, 16:41   #17
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

23×191 Posts

The equation for number of relations is provided in the attached file, which I picked up a year or so ago somewhere on the factoring forum.

If you add some lines in the 175+ range from your experiences, please post an updated file. I am also interested in creating a beginner's guide to large GNFS jobs, but I don't know much.
-Curtis

edit: I believe expected relation counts are for unique relations.
Attached Files
 _GGNFS_Parameters_a.zip (18.5 KB, 56 views)

Last fiddled with by VBCurtis on 2013-07-05 at 16:50

2013-07-05, 18:14   #18
lorgix

Sep 2010
Scandinavia

3×5×41 Posts

Quote:
 Originally Posted by VBCurtis The equation for number of relations is provided in the attached file, which I picked up a year or so ago somewhere on the factoring forum. If you add some lines in the 175+ range from your experiences, please post an updated file. I am also interested in creating a beginner's guide to large GNFS jobs, but I don't know much. -Curtis edit: I believe expected relation counts are for unique relations.
A guide would be very nice.
I haven't gotten to 175+ yet, but I recently finished a 158.
Code:
rlim: 30000000
alim: 30000000
lpbr: 29
lpba: 29
mfbr: 58
mfba: 58
rlambda: 2.5
alambda: 2.5
I gathered 46.2M rels, 41.6M unique.
The matrix used 1710.3MB, building it used 2048.2MB.
LA ran for ~25.7hrs using four threads on a i7 965X.
I didn't do any trial sieving for this, and I did not adjust the auto-selected parameters above. I probably should have upped all of them.

 2013-07-05, 18:58 #19 Batalov     "Serge" Mar 2008 Phi(4,2^7658614+1)/2 23×7×163 Posts With B.Dodson, we have worked on some GNFS factorizations up to size 180, then we ran out of then available resources. The only person here who has hands-on experience with gnfs-197, -202, -207 and gnfs-212 is frmky. The rule of thumb is that the time required for sieving will double with each 5 digits. The effort required for linear algebra is unique for very large sizes. What good will be 20Tb (totally from the top of my head) of relations if one will not be able to solve the resulting matrix? You can prepare yourself for solving a gnfs-180 by solving a gnfs-170 (the differences will only be needing more memory, needing more reliable memory, and more time), but you will have no transfer of experience to solving a gnfs-190, -200 or -210. Those will require different hardware. (Maybe gnfs-190 could be done the old way, maybe not.)
 2013-07-05, 20:11 #20 VBCurtis     "Curtis" Feb 2005 Riverside, CA 112916 Posts Wraith weighted his trial sieves for the number of relations needed. But won't the 33-33 job, with its greater number of (easier to find) relations, also create a significantly larger matrix? I grasp the matrix solve-time is much shorter than the sieving time, but as Batalov points out we must be *able* to solve the matrix on our hardware! So, what happens if we choose to give up a bit of sieve speed to stick with, say, 32-32 for Wraith's C210? Or, for the sake of discussion, what would happen if he tried 31-31? Would he run out of special-q to search before finding enough relations? Is there a similar reason (i.e. probability of completing the factorization, not sieve time) to choose 3 large primes instead of 2? I feel like these questions are a setup for RDS to appear and tell me to read a paper- it's summer, I'm off, I have time for that. I suppose I'll consult the msieve readme files for an appropriate reference paper and try to answer my own questions.
 2013-07-06, 02:05 #21 jasonp Tribal Bullet     Oct 2004 3×1,163 Posts More large primes, and larger large primes, make filtering harder but don't necessarily make the matrix bigger. The matrix for M1039 in 2007 (35-bit large primes, matrix size 67M) was only a little bigger than the matrix for M1031 (33-bit large primes, matrix size 63M). That's not intuitive, and may just be a result of having very few data points at that size. Filtering has gotten noticeably better in recent years. Your intuition is correct about what happens in the sieving when you use large primes that are too small. NFS@Home actually did run out of special-q when sieving for M1061, which was the major driving force behind adopting the laseive5 siever.
2013-07-07, 19:23   #22
lorgix

Sep 2010
Scandinavia

3×5×41 Posts

Quote:
 Originally Posted by lorgix Code: rlim: 30000000 alim: 30000000 lpbr: 29 lpba: 29 mfbr: 58 mfba: 58 rlambda: 2.5 alambda: 2.5 I gathered 46.2M rels, 41.6M unique. The matrix used 1710.3MB, building it used 2048.2MB. LA ran for ~25.7hrs using four threads on a i7 965X.
I've been told that the siever (14e in this case) and lpbr should be chosen such that a special-q yields two relations, on average.
Experimenting with the parameters above, without knowing much about what they actually do, I came up with:
Code:
rlim: 55000000
alim: 55000000
lpbr: 29 or 30
lpba: 29 or 30
mfbr: 59 or 60
mfba: 59 or 60
rlambda: 2.65
alambda: 2.65
I think those parameters would have been better. Again, this is a c158. Comments?

Now, questions:

What's the significance of rlim? Other than causing sieving to be done over rlim/2 through rlim.

What's "the difference" between the r-parameters and the a-parameters? How should I adjust them, and what should I expect from that?
I notice that for small jobs they are often the same.
In the experiment above I noticed that letting either of lpbr and lpba be 29 and the other one 30 brought me as close to two relations per q as I could get.
Assuming that is somewhat desirable to begin with; which should I chose, and what result will it have?

What effect (if any) can mfb and lambda have, other than affecting yield per q and speed of sieving? I'm assuming they have a cost, because they did not seem to slow down sieving appreciably.

About poly select: I've read that something like 3-5% of anticipated sieving time should be spent in poly select. Is this somewhat accurate? Is it dependent on input size to a significant degree? Can I help collect data points for larger inputs?

For the c158 above I used the -wide option in YAFU, I suspect that deep would have been better. 16448 ranges of size 250 in 300hrs(!). I still have that p-file, in case anyone wants to have a look. (Tell me where to upload it.)

The sieving experiments were done under Windows, using this.

Any helpful responses to my questions, or this thread in general, would be greatly appreciated.

 Similar Threads Thread Thread Starter Forum Replies Last Post ryanp Factoring 69 2013-04-30 00:28 ixfd64 Factoring 3 2012-06-06 08:27 WraithX Msieve 18 2012-05-20 22:19 Syd Aliquot Sequences 7 2011-03-14 18:35 bdodson Factoring 20 2008-11-26 20:45

All times are UTC. The time now is 16:46.

Mon Oct 26 16:46:41 UTC 2020 up 46 days, 13:57, 0 users, load averages: 2.05, 1.81, 1.69