mersenneforum.org Advice for large GNFS jobs?
 Register FAQ Search Today's Posts Mark Forums Read

 2013-04-24, 16:08 #1 WraithX     Mar 2006 1110110002 Posts Advice for large GNFS jobs? Hello everyone, I'm planning on working on a 210-digit GNFS number here pretty soon. I'm using msieve (1.51) to do gpu poly selection right now. I see from the msieve.log that it expects to find an E value from 1.16e-015 to > 1.34e-015. Is this about what others have seen from their large GNFS jobs? Should I look for a poly with larger E? Also, how long would you estimate the sieving to take? Can I also ask for advice on parameters and which siever to use? I saw that recently Dan Ee posted some optimized binaries for Win64. I plan to use those, but should I use the 15e, or maybe the 16e sievers for this size job? Should I use 2 or 3 large primes on the algebraic side? Should I use 2 or 3 large primes on the rational side? Over what range of values should I sieve for a job this size? What size matrix (dimensions and/or memory size) would we expect from a job this size? I have two large memory machines (48GB and 64GB), hopefully it will fit onto one of those. Will the answer to most of these questions depend on the best poly that I find? I'll ask more when I have a good poly in hand. Thanks for any advice and insights you can provide.
 2013-04-24, 16:19 #2 jasonp Tribal Bullet     Oct 2004 3×1,163 Posts The built-in parameters for 210-digit polynomial selection are not going to work without manual intervention. For example, it's not clear whether you should look for degree-5 or degree-6. Check Readme.nfs for all the knobs that are necessary to specify how the search should proceed, you will need to override several of them in ways that are not clear to me. You also should not start with a leading coefficient of 1; Paul Zimmermann has shown that you improve the chances of finding a good polynomial if you look for leading coefficients that balance the optimal skew against the size of the middle polynomial coefficient. That's definitely the case for degree-6, and probably true for degree 5. If this is not an RSA challenge number, I hope it has had enough ECM thrown at it...
2013-04-24, 16:28   #3
jyb

Aug 2005
Seattle, WA

5·11·29 Posts

Quote:
Originally Posted by jasonp
Quote:
 Originally Posted by WraithX Hello everyone, I'm planning on working on a 210-digit GNFS number here pretty soon.
If this is not an RSA challenge number, I hope it has had enough ECM thrown at it...
Just what would constitute "enough ECM" in this case? The standard rule of thumb would suggest it should be checked to t70, right? I may be wrong, but I doubt it's had that much!

 2013-04-24, 16:37 #4 bsquared     "Ben" Feb 2007 2×3×557 Posts Possibly helpful reference, from a number at least twice as easy as your target: https://listserv.nodak.edu/cgi-bin/w...;ac63e2ab.1208
2013-04-25, 13:33   #5
WraithX

Mar 2006

23·59 Posts

Quote:
 Originally Posted by jasonp The built-in parameters for 210-digit polynomial selection are not going to work without manual intervention. For example, it's not clear whether you should look for degree-5 or degree-6. Check Readme.nfs for all the knobs that are necessary to specify how the search should proceed, you will need to override several of them in ways that are not clear to me. You also should not start with a leading coefficient of 1; Paul Zimmermann has shown that you improve the chances of finding a good polynomial if you look for leading coefficients that balance the optimal skew against the size of the middle polynomial coefficient. That's definitely the case for degree-6, and probably true for degree 5. If this is not an RSA challenge number, I hope it has had enough ECM thrown at it...
Yeah, the first range I searched was 1e8-2e8, then I did go back and search 1-1e8. Altogether I've done (or am currently working on) 1-1e8, 1e8-2e8, 2e8-3e8, 3e8-4e8, 10e8-11e8, and 100e8-101e8. All of those were with -np1. I used a much smaller stage1_norm on all of those (5e29) to minimize how much output they produced. Each range gives between 400k and 500k hits. -nps on all of those hits runs pretty quickly (depending on computer, usually ~20 minutes). I had run into a self-made problem when trying to run -npr. ie, I was using "sort -g -k 11 | head -10000" to find the best -nps results. However, since these were all degree 5 polynomials, I was basically just getting the head of the -nps results, which definitely wasn't what I wanted. I have finally figured out that I should use "sort -g -k 10 | head -10000" for these degree 5 polynomials. So, I am rerunning -npr and am getting much better results so far.

I am seeing a couple of different reactions from msieve when I run various steps on different computers. (all of these are CUDA enabled and run on computers with Nvidia cards) On my Linux x64 machine, where I built msieve from svn (...maybe 839?), when I run -nps it is extremely fast. It can process 500k -np1 hits in about 20 minutes, and that's on one core. On my Windows x64 machine, when I run the 32-bit CUDA-enabled windows msieve (1.51 from sourceforge) it takes about 2.5 hours to process 500k -np1 hits on one core. Is this speed difference because of the 32/64 bit difference? Or maybe there is a Linux/Windows speed difference? Or maybe a combination of those two factors?

Also, when I run -npr on the top ~10000 -nps hits, I am seeing similar speed differences. On my Linux x64 machine it takes about 2 hours to process the top 12000 -nps hits on 12 cores (1000/core). On my Windows x64 machine with 32-bit msieve 1.51, it takes about 25 hours to process the top 16000 -nps hits on 16 cores (1000/core). Could this step also be showing a speed difference because of the 32/64 bit difference?

As for having enough ECM thrown at it, I don't have the numbers in front of me, but I think I've put in about 1.3*t65 into this. Still a bit off from the recommended t70, but I thought switching now would be a better use of my resources. I only have about 36 cores I can put into sieving, so I may eventually try to open this up as a project here on the forums. But, I'd like to find a good polynomial and good parameters before I get to that point. I'll start looking into tweaking the other poly search parameters like you've mentioned. Thanks for the hints and tips so far. If you or anyone else can think of other things I should consider or try out, I'd be very interested to hear it.

 2013-04-25, 14:08 #6 swellman     Jun 2012 24×181 Posts Have you checked Kamada's site? May be helpful on parameter selection (or not). http://homepage2.nifty.com/m_kamada/math/graphs.htm
 2013-06-05, 13:55 #7 WraithX     Mar 2006 23×59 Posts I have a few questions about command line options and parameters for msieve 1.51 with CUDA. 1) I know stage1_norm is used in -np1. Does it affect -nps or -npr? 2) I know stage2_norm is used in -nps and -npr. Does it affect -np1? 3) From the documentation, it looks like -t X is only used by -np1 and not -nps or -npr. Is this correct? 4) I've read that running -np1 multiple times, with a fixed set of arguments, will usually not produce the same results (because the search space is large and is split into pieces and random pieces are searched). I'm wondering if running -nps or -npr multiple times, with a fixed set of arguments, will always produce the same results? Or could those produce different polynomials because of different random internal states in msieve? 5) Are Murphy E scores directly comparable between degree 5 and degree 6 polynomials? I think not, but I want to be sure. Thanks everyone for the links to help guide me on parameter selection. I have some ideas of what values to use, and I'll be test sieving those probably by the end of this month. BTW, here are some of the best poly's I've found so far for my C210. I'm going to go back and try tweaking the norm's in -nps and -npr to see if I get anything better. Code: Degree-5 Polynomials: skew 291528643.68, size 1.051e-20, alpha -7.678, combined = 9.961e-16 rroots = 3 skew 104279094.33, size 1.124e-20, alpha -6.911, combined = 1.038e-15 rroots = 5 skew 609572156.15, size 9.604e-21, alpha -9.019, combined = 9.234e-16 rroots = 5 Degree-6 Polynomials: skew 1564743.15, size 2.403e-15, alpha -10.513, combined = 9.028e-016 rroots = 2
2013-06-05, 17:48   #8
jasonp
Tribal Bullet

Oct 2004

1101101000012 Posts

Quote:
 Originally Posted by WraithX 1) I know stage1_norm is used in -np1. Does it affect -nps or -npr? 2) I know stage2_norm is used in -nps and -npr. Does it affect -np1?
The stage 1 norm doesn't influence anything in stage 2.
The stage 2 norms don't influence anything in stage 1.
Quote:
 3) From the documentation, it looks like -t X is only used by -np1 and not -nps or -npr. Is this correct?
Correct, stage 2 is not multithreaded. Lionel has wanted multithreaded stage 2 for a long time, but the structure of the code would need some pretty fundamental changes to allow it.
Quote:
 I'm wondering if running -nps or -npr multiple times, with a fixed set of arguments, will always produce the same results? Or could those produce different polynomials because of different random internal states in msieve?
Stage 2 should be deterministic, there is no use of randomness
Quote:
 5) Are Murphy E scores directly comparable between degree 5 and degree 6 polynomials? I think not, but I want to be sure.
They're not comparable. The E-value is an integration times a degree-specific constant, and we don't know what the constant is; it would almost certainly depend on things like the size of the sieving region.

 2013-07-03, 12:23 #9 WraithX     Mar 2006 7308 Posts I'm starting to do some test sieving on the polynomials I have found for my gnfs C210. I was wondering several different things about this step: 1) Should I try to find the best set of parameters (rlim,alim,lpbr,lpba,mfbr,mfba) using one polynomial, and then use those parameters to compare all the different polynomials I have? Or should I pick one set of parameters, use those to compare my different polynomials, and then try to optimize the parameters based on the winning polynomial? For a gnfs C210: 2) Should I try sieving on the rational or algebraic side? Or does this depend on the winning polynomial? 3) Should I use 2LP or 3LP for the different sides? Or does this also depend on the winning polynomial? 4) When using 2LP that means: mfbr = 2*lpbr and/or mfba = 2*lpba and rlambda/alambda ~= 2.6 (or 2.7), right? 5) When using 3LP that means: mfbr = 3*lpbr and/or mfba = 3*lpba and rlambda/alambda ~= 3.6 (or 3.7), right? I ask 4 and 5 because I sometimes see people quote params of 33-bit lpbr/lpba with 64-bit mfbr/mfba or a 33-bit lpbr/lpba with 96-bit mfbr/mfba, instead of 33&66 or 33&99. Right now I'm testing these variables: rlim = {200e6, 350e6, 500e6} alim = {200e6, 350e6, 500e6} lpbr = {32, 33} lpba = {32, 33} mfbr = 64 mfba = 96 at Q=100e6 and Q=500e6. 6) Are these reasonable test points, or should I try some other variations too? 7) Is 16e the right siever to use for a gnfs C210? 8) Does (total yield)*(sec/rel) = total time to find those relations? 9) How many raw relations should I be looking to collect for this gnfs C210? 10) What is a good range of Q to test to find these relations? Q = {50e6 to 500e6}? Or Q = {100e6 to 1000e6}? Or...? Also, down below I am including the results of my test sieving. Do these look like reasonable outputs? I did "-f 100000000 -c 1000" and "-f 500000000 -c 1000" for each test to hopefully get a good approximation of how sieving will go over these big ranges. I'm sure there is probably more I should be asking, but these are the big questions I have for right now. I'd appreciate any insights or advice anyone can give on this big project I'm about to undertake. Of the 36 tests suggested above, with results down below, it looks like I probably want to go with t29 or t32. Does this sound right from these results? Code: *** Test *** rlim alim lpbr lpba mfbr mfba *** t01 *** 200e6 200e6 32 32 64 96 total yield: 586, q=100001029 (2.49877 sec/rel) total yield: 434, q=500001001 (3.45742 sec/rel) *** t02 *** 350e6 200e6 32 32 64 96 total yield: 640, q=100001029 (2.51675 sec/rel) total yield: 468, q=500001001 (3.50973 sec/rel) *** t03 *** 500e6 200e6 32 32 64 96 total yield: 665, q=100001029 (2.61559 sec/rel) total yield: 480, q=500001001 (3.67786 sec/rel) *** t04 *** 200e6 350e6 32 32 64 96 total yield: 586, q=100001029 (2.49873 sec/rel) total yield: 457, q=500001001 (3.63930 sec/rel) *** t05 *** 350e6 350e6 32 32 64 96 total yield: 640, q=100001029 (2.51599 sec/rel) total yield: 492, q=500001001 (3.65579 sec/rel) *** t06 *** 500e6 350e6 32 32 64 96 total yield: 665, q=100001029 (2.61557 sec/rel) total yield: 505, q=500001001 (3.81100 sec/rel) *** t07 *** 200e6 500e6 32 32 64 96 total yield: 586, q=100001029 (2.50044 sec/rel) total yield: 469, q=500001001 (3.81422 sec/rel) *** t08 *** 350e6 500e6 32 32 64 96 total yield: 640, q=100001029 (2.51385 sec/rel) total yield: 505, q=500001001 (3.82250 sec/rel) *** t09 *** 500e6 500e6 32 32 64 96 total yield: 665, q=100001029 (2.61408 sec/rel) total yield: 519, q=500001001 (3.96526 sec/rel) *** t10 *** 200e6 200e6 32 33 64 96 total yield: 897, q=100001029 (1.63303 sec/rel) total yield: 703, q=500001001 (2.14675 sec/rel) *** t11 *** 350e6 200e6 32 33 64 96 total yield: 993, q=100001029 (1.62162 sec/rel) total yield: 748, q=500001001 (2.19623 sec/rel) *** t12 *** 500e6 200e6 32 33 64 96 total yield: 1029, q=100001029 (1.69091 sec/rel) total yield: 771, q=500001001 (2.29173 sec/rel) *** t13 *** 200e6 350e6 32 33 64 96 total yield: 897, q=100001029 (1.63289 sec/rel) total yield: 761, q=500001001 (2.18615 sec/rel) *** t14 *** 350e6 350e6 32 33 64 96 total yield: 993, q=100001029 (1.62174 sec/rel) total yield: 808, q=500001001 (2.22832 sec/rel) *** t15 *** 500e6 350e6 32 33 64 96 total yield: 1029, q=100001029 (1.69071 sec/rel) total yield: 833, q=500001001 (2.31128 sec/rel) *** t16 *** 200e6 500e6 32 33 64 96 total yield: 897, q=100001029 (1.63459 sec/rel) total yield: 784, q=500001001 (2.28254 sec/rel) *** t17 *** 350e6 500e6 32 33 64 96 total yield: 993, q=100001029 (1.62053 sec/rel) total yield: 836, q=500001001 (2.31623 sec/rel) *** t18 *** 500e6 500e6 32 33 64 96 total yield: 1029, q=100001029 (1.68960 sec/rel) total yield: 862, q=500001001 (2.39455 sec/rel) *** t19 *** 200e6 200e6 33 32 64 96 total yield: 783, q=100001029 (1.89119 sec/rel) total yield: 589, q=500001001 (2.57094 sec/rel) *** t20 *** 350e6 200e6 33 32 64 96 total yield: 861, q=100001029 (1.89129 sec/rel) total yield: 644, q=500001001 (2.57473 sec/rel) *** t21 *** 500e6 200e6 33 32 64 96 total yield: 900, q=100001029 (1.95379 sec/rel) total yield: 663, q=500001001 (2.68922 sec/rel) *** t22 *** 200e6 350e6 33 32 64 96 total yield: 783, q=100001029 (1.89126 sec/rel) total yield: 621, q=500001001 (2.70439 sec/rel) *** t23 *** 350e6 350e6 33 32 64 96 total yield: 861, q=100001029 (1.89104 sec/rel) total yield: 680, q=500001001 (2.67079 sec/rel) *** t24 *** 500e6 350e6 33 32 64 96 total yield: 900, q=100001029 (1.95429 sec/rel) total yield: 702, q=500001001 (2.76947 sec/rel) *** t25 *** 200e6 500e6 33 32 64 96 total yield: 783, q=100001029 (1.89111 sec/rel) total yield: 638, q=500001001 (2.83263 sec/rel) *** t26 *** 350e6 500e6 33 32 64 96 total yield: 861, q=100001029 (1.89246 sec/rel) total yield: 698, q=500001001 (2.79481 sec/rel) *** t27 *** 500e6 500e6 33 32 64 96 total yield: 900, q=100001029 (1.95448 sec/rel) total yield: 721, q=500001001 (2.88571 sec/rel) *** t28 *** 200e6 200e6 33 33 64 96 total yield: 1183, q=100001029 (1.25192 sec/rel) total yield: 945, q=500001001 (1.60321 sec/rel) *** t29 *** 350e6 200e6 33 33 64 96 total yield: 1310, q=100001029 (1.24225 sec/rel) total yield: 1023, q=500001001 (1.62118 sec/rel) *** t30 *** 500e6 200e6 33 33 64 96 total yield: 1372, q=100001029 (1.28140 sec/rel) total yield: 1056, q=500001001 (1.68797 sec/rel) *** t31 *** 200e6 350e6 33 33 64 96 total yield: 1183, q=100001029 (1.25142 sec/rel) total yield: 1017, q=500001001 (1.65128 sec/rel) *** t32 *** 350e6 350e6 33 33 64 96 total yield: 1310, q=100001029 (1.24304 sec/rel) total yield: 1102, q=500001001 (1.64791 sec/rel) *** t33 *** 500e6 350e6 33 33 64 96 total yield: 1372, q=100001029 (1.28214 sec/rel) total yield: 1141, q=500001001 (1.70360 sec/rel) *** t34 *** 200e6 500e6 33 33 64 96 total yield: 1183, q=100001029 (1.25128 sec/rel) total yield: 1048, q=500001001 (1.72428 sec/rel) *** t35 *** 350e6 500e6 33 33 64 96 total yield: 1310, q=100001029 (1.24330 sec/rel) total yield: 1138, q=500001001 (1.71471 sec/rel) *** t36 *** 500e6 500e6 33 33 64 96 total yield: 1372, q=100001029 (1.28211 sec/rel) total yield: 1178, q=500001001 (1.76656 sec/rel)
 2013-07-03, 15:29 #10 VBCurtis     "Curtis" Feb 2005 Riverside, CA 106078 Posts t13 looks stronger than t29 or t32. Do these parameter choices change for a sextic vs a quintic? Did you ever figure out if the stage 2 speed differences were 32 vs 64 bitness, or win vs linux?
2013-07-03, 15:45   #11
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

34·71 Posts

Quote:
 Originally Posted by VBCurtis Do these parameter choices change for a sextic vs a quintic?
They should I would imagine as with a sextic the algebraic side is harder to factor than the quintic.

The 96 bit limit is a limitation of that version of the siever. If you use lasieve5(only available for linux and a bit buggy) then this limit can be passed.

Last fiddled with by henryzz on 2013-07-03 at 15:47

 Similar Threads Thread Thread Starter Forum Replies Last Post ryanp Factoring 69 2013-04-30 00:28 ixfd64 Factoring 3 2012-06-06 08:27 WraithX Msieve 18 2012-05-20 22:19 Syd Aliquot Sequences 7 2011-03-14 18:35 bdodson Factoring 20 2008-11-26 20:45

All times are UTC. The time now is 05:10.

Sun Nov 29 05:10:08 UTC 2020 up 80 days, 2:21, 3 users, load averages: 2.65, 2.12, 1.76