mersenneforum.org Advice for large GNFS jobs?

 2013-07-07, 21:10 #23 Batalov     "Serge" Mar 2008 Phi(4,2^7658614+1)/2 216468 Posts There is one group that approached a gnfs-193 on a college-size cluster. The reservation is still pending, >3 years later. Don't take even a gnfs-193 lightly. (...and there are hardly any much smaller unreserved GNFS numbers in the Cunningham project. There are the gnfs-192 numbers 3,664+ and 5,485+, and then there are some above.) You may or may not know that msieve has certain issues (these are under active research) every time the size limit is advanced. It is not just a matter of huge resources and "pushing a button". Donald Knuth's often partially quoted saying goes: "Science is what we understand well enough to explain to a computer. Art is everything else we do."
 2013-07-08, 06:01 #24 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 2·29·109 Posts I have a bit of experience (25 factorisations) in the high-150s-low-160s range; with my circumstances (proportionally better resources for linear algebra than for sieving) I have found that 31-bit large primes, 14e siever, and aiming for no more than 160 million relations works significantly more quickly than 30-bit at that level. Last fiddled with by fivemack on 2013-07-08 at 06:02
2013-07-08, 07:08   #25
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

2^3·3^2·61 Posts

Quote:
 Originally Posted by fivemack I have a bit of experience (25 factorisations) in the high-150s-low-160s range; with my circumstances (proportionally better resources for linear algebra than for sieving) I have found that 31-bit large primes, 14e siever, and aiming for no more than 160 million relations works significantly more quickly than 30-bit at that level.
Since you specified no other parameters, the rest are default? Specifically rlim and alim...

How big are the matrices your settings produce? How much do they vary for, say, a C160?

Thanks for the info. While I don't plan to do a C170+, I'm interested in learning where we should deviate from the defpar file for 145-165 digit work.

 2013-07-08, 08:46 #26 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 2×29×109 Posts I tend to run with rlim=alim, run over Q from lim/3 to lim, and do test sieving until I find a value of lim that gives enough relations; I'm not using defpar.txt at all. lim=48M was what I used for the last C160 I did. The matrices are maybe nine-million-ish, and they do vary a fair amount. Since I'm using the same computers for sieving and for the matrix, saving 10% of sieving time is a good trade-off even at the price of doubling the matrix time. Last fiddled with by fivemack on 2013-07-08 at 08:48
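That sizing loop can be sketched as a back-of-envelope calculation (a sketch only: the function names are mine, and projecting a test-sieve yield linearly over the whole q range is a rough approximation, since yield actually drifts with q):

```python
# Sketch of the sizing approach: choose lim, sieve special-q from
# lim/3 to lim, and use a short test sieve's rels/q figure to project
# whether the range will produce enough raw relations.

def projected_relations(lim, rels_per_q):
    """Project total raw relations for special-q in [lim/3, lim]."""
    q_lo, q_hi = lim // 3, lim
    return (q_hi - q_lo) * rels_per_q

def enough(lim, rels_per_q, target):
    return projected_relations(lim, rels_per_q) >= target

lim = 48_000_000         # the C160 value quoted above
target = 160_000_000     # the relation cap mentioned for 31-bit jobs

print(projected_relations(lim, 2.0) / 1e6)  # 64.0 (million): too few
print(enough(lim, 5.0, target))             # True: 5 rels/q suffices
```

If the projection falls short, either raise lim (widening the q range) or adjust the other parameters and test-sieve again.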
2013-07-08, 10:04   #27
lorgix

Sep 2010
Scandinavia

3·5·41 Posts

Quote:
 Originally Posted by fivemack I have a bit of experience (25 factorisations) in the high-150s-low-160s range; with my circumstances (proportionally better resources for linear algebra than for sieving) I have found that 31-bit large primes, 14e siever, and aiming for no more than 160 million relations works significantly more quickly than 30-bit at that level.
So you prefer faster sieving at the price of a tougher matrix, is that right?
Quote:
 Originally Posted by fivemack I tend to run with rlim=alim, run over Q from lim/3 to lim, and do test sieving until I find a value of lim that gives enough relations - I'm not using defpar.txt at all. lim=48M was what I used for the last C160 I did. The matrices are maybe nine million-ish, they do vary a fair amount; I'm using the same computers for sieving and for matrix, saving 10% of sieving time even at the price of doubling the matrix time is a good trade-off.
OK, so the lower rlim is offset by starting sieving from lim/3 instead of lim/2.

But when I tried sieving with
Code:
rlim: 48000000
alim: 48000000
lpbr: 31
lpba: 31
mfbr: 62
mfba: 62
rlambda: 2.65
alambda: 2.65
I got 4.6 rels/q, which seems to be significantly more than the two per q that I was recommended to aim for.
Is this because of your preference for faster sieving, or is two rels per q not a good rule of thumb?

This of course relates to my questions earlier in this thread; what are the consequences of changing these parameters, other than the change in yield and sieving speed that I can see?
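For scale, here is the arithmetic that yield implies (illustrative only, and it assumes the 4.6 rels/q figure would hold across fivemack's whole q range of lim/3 to lim with lim = 48M, which test-sieve yields rarely do exactly):

```python
# Projecting 4.6 rels/q over the full q range: the raw relation count
# before duplicate removal lands just under the ~160M target.
lim = 48_000_000
q_range = lim - lim // 3      # 32M special-q values
raw = q_range * 4.6
print(f"{raw / 1e6:.1f}M raw relations")  # 147.2M
```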

 2013-07-08, 11:07 #28 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 2·29·109 Posts I suppose I think of 'two relations per Q' more as a lower bound; if you're not getting even that many relations, probably the parameters could be better. That's in part because it tends to mean you're working at the edge of the siever's capability and so you'll risk unreasonable duplicate rates. The minima in this world are generally reasonably flat, but increasing the large-prime bound does sometimes more-than-double the rate of relation production while less-than-doubling the number of relations needed, which is an unusually good tradeoff.
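That last tradeoff can be made concrete with a toy ratio (my own framing, with made-up numbers; it assumes sieving time scales simply as relations-needed divided by production rate, ignoring duplicate rates and filtering cost):

```python
# Raising the large-prime bound multiplies the relation-production
# rate by rate_gain and the relations needed by need_gain. Under the
# simple model time ~ needed / rate, the change wins exactly when
# rate_gain > need_gain.

def sieve_time_ratio(rate_gain, need_gain):
    """Relative sieving time after the lpb increase (1.0 = unchanged)."""
    return need_gain / rate_gain

# "more-than-double the rate ... less-than-double the relations needed"
print(sieve_time_ratio(2.2, 1.8))  # below 1, so total sieving time drops
```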
 2013-07-08, 14:41 #29 henryzz Just call me Henry     "David" Sep 2007 Cambridge (GMT/BST) 2^2·1,433 Posts For some jobs it might be worth bearing in mind that increasing lpb(r/a) without changing mfb(r/a) doesn't change the speed of the sieving; it just makes for a harder filtering problem. Since filtering doesn't take long at all with smaller numbers, if a number is on the border you could hope to get a little help from an increased lpb(r/a) without increasing the complexity much (i.e. you expect 80% of the extra relations to be singletons). Working out how many relations you need when using this trick could be tricky.
2013-07-09, 12:20   #30
lorgix

Sep 2010
Scandinavia

615₁₀ Posts

Quote:
 Originally Posted by fivemack I suppose I think of 'two relations per Q' more as a lower bound; if you're not getting even that many relations, probably the parameters could be better. That's in part because it tends to mean you're working at the edge of the siever's capability and so you'll risk unreasonable duplicate rates. The minima in this world are generally reasonably flat, but increasing the large-prime bound does sometimes more-than-double the rate of relation production while less-than-doubling the number of relations needed, which is an unusually good tradeoff.
I'm assuming there is something like an upper bound too, then. Where would that be? In my test case, 31-bit yields 79% more relations than 30-bit.

By the way: Say I'm choosing two CPUs for my G34-board, looking to maximize Msieve performance. Should I go with 6140 or 6172? (8*2.6GHz or 12*2.1GHz) Would a cooler alternative like 6166 HE be crazy?
Quote:
 Originally Posted by henryzz For some jobs it might be worth baring in mind that increasing the lpb(r/a) without changing mfb(r/a) doesn't change the speed of the sieving. It will just be a harder filtering problem. Since filtering doesn't take at all long with smaller numbers if a number is on the border then you could hope to get a little help from an increased lpb(r/a) without increasing the complexity much(i.e. you expect 80% of the extra relations to be singletons). Working out how many relations you need could be tricky using this trick.
I unfortunately don't quite understand this post.
I have figured out that if mfb is too far behind 2*lpb then the yield will be bad or terrible.

2013-07-09, 13:29   #31
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2^2×1,433 Posts

Quote:
 Originally Posted by lorgix I unfortunately don't quite understand this post. I have figured out that if mfb is too far behind 2*lpb then the yield will be bad or terrible.
Basically, by increasing the lpb you get more relations for free. The only cost is slightly longer filtering, but the extra relations might push you over the edge into building the matrix with less sieving, or get you a better matrix.
You are not expecting to get the full 2x relations that you would need if you also increased mfb. For example, if increasing the lpb by 1 gets you 1.2x as much yield, then you can expect to need at most 1.2x as many relations. This trick relies on you needing less than 1.2x, or on it making it easier to oversieve for a better matrix. A high percentage of the extra relations will be singletons, but the few that aren't could be quite helpful.

If it weren't for filtering taking longer, you could run every factorization with the maximum lpb (33 in most binaries, though all that's needed for more is a recompile). The limit basically means you throw away relations while sieving.
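The lpb/mfb interaction here can be illustrated with a toy model (schematic only: this is not the lattice siever's actual cofactorisation logic, and the bounds are shrunk to toy sizes, lpb 16/17 and mfb 34, so trial division stays fast; real jobs use lpb around 30-33):

```python
def prime_factors(n):
    """Trial-division factorisation; fine for toy-sized cofactors."""
    fs, d = [], 2
    while d * d <= n:
        while n % d == 0:
            fs.append(d)
            n //= d
        d += 1
    if n > 1:
        fs.append(n)
    return fs

def cofactor_accepted(cofactor, lpb, mfb):
    """A relation survives if its unfactored cofactor is below 2^mfb
    and every prime in that cofactor is below 2^lpb."""
    if cofactor >= 2 ** mfb:   # this cut is what bounds sieving work
        return False
    return all(p < 2 ** lpb for p in prime_factors(cofactor))

# Product of two 17-bit primes (131071 = 2^17 - 1): with mfb fixed at
# 34 it clears the cofactor cut either way, but only the larger lpb
# accepts it. Raising lpb admits more relations without changing the
# sieving-side cut.
c = 131071 * 131071
print(cofactor_accepted(c, 17, 34), cofactor_accepted(c, 16, 34))
```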

2013-07-09, 13:46   #32
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

2·29·109 Posts

Quote:
 Originally Posted by lorgix I'm assuming there is something like an upper bound too then. Where would that be?
I've never really tried to find out; I've factored a C120 using 30-bit large primes to see what would happen, IIRC it needed well under 80 million relations to make a matrix, but took a lot longer than using more optimal parameters.

Quote:
 By the way: Say I'm choosing two CPUs for my G34-board, looking to maximize Msieve performance. Should I go with 6140 or 6172? (8*2.6GHz or 12*2.1GHz) Would a cooler alternative like 6166 HE be crazy?
You mean literally msieve performance (linear-algebra step) or are you doing sieving too? In the latter case I would go for more slower CPUs (I have 6168s); http://www.ebay.com/itm/AMD-OPTERON-...item3cd3448870 has a pair of 6166HE for a very reasonable price.

The motherboards and memory are quite power-intensive so the 15 watt difference in CPU power is immaterial.

If just linear algebra, that's an interesting question (three CPUs per memory controller seems to saturate, so there's less point in getting six rather than four), and I actually don't know what the answer is now that msieve threading has been so dramatically improved.

At present, the best parameters I have for MPI linear algebra on my quad-socket 6168 Opteron run only 10% faster than on my single-Haswell; I clearly ought to devote some effort to finding better parameters!

2013-07-09, 13:48   #33
lorgix

Sep 2010
Scandinavia

1147₈ Posts

Quote:
 Originally Posted by henryzz Basically by increasing the lpb you get more relations for free. The only cost is a slightly longer filtering but the extra relations might push you over the edge into the matrix building with less sieving/get a better matrix. You are not expecting to get the full 2x relations like you would need if you also increased mfb. For example if increasing the lpb by 1 gets you 1.2x as much yield then you can expect to need <=1.2x as many relations. This trick is based upon you needing <1.2x or making it easier to oversieve to get a better matrix. A high percentage of the extra relations will be singletons but the few that aren't could be quite helpful. If it wasn't for filtering taking longer you could run every factorization with the maximum lpb(33 in most binaries but all that is needed is a recompile). The limit basically means you throw away relations while sieving.
I think I have a rough grasp on how lpb works now. You seem to be making a case for higher lpb, but you agree that lpb can be set too high, right?
I don't quite get the bold part; I don't understand how mfb works. Are you saying it increases complexity, and that I can sometimes get the benefits of a higher lpb without paying the price of a higher mfb? (Which would be higher complexity, and I don't know what that means in this context. Harder filtering? Is that what your other post was saying?)

I feel that I'm missing a few pieces, but I'm still learning. Hopefully others will benefit from these discussions.

Thank you all for your patience.

Quote:
 Originally Posted by fivemack I've never really tried to find out; I've factored a C120 using 30-bit large primes to see what would happen, IIRC it needed well under 80 million relations to make a matrix, but took a lot longer than using more optimal parameters.
OK, maybe I'll do some experimenting to get an idea of what range is reasonable.

Quote:
 You mean literally msieve performance (linear-algebra step) or are you doing sieving too? In the latter case I would go for more slower CPUs (I have 6168s); http://www.ebay.com/itm/AMD-OPTERON-...item3cd3448870 has a pair of 6166HE for a very reasonable price. The motherboards and memory are quite power-intensive so the 15 watt difference in CPU power is immaterial. If just linear algebra, that's an interesting question (three CPUs per memory controller seems to saturate, so there's less point in getting six rather than four), and I actually don't know what the answer is now that msieve threading has been so dramatically improved. At present, the best parameters I have for MPI linear algebra on my quad-socket 6168 Opteron run only 10% faster than on my single-Haswell; I clearly ought to devote some effort to finding better parameters!
Your reasoning is that eight faster cores would be enough to keep four memory channels busy in LA, while the near perfect scaling of the sieving step would make the 12-core a winner. Right?
Yes, I'm considering those 6166 HE. The price is hard to beat. They should sieve slightly faster than a 6140. The question is how much worse they would be in LA. Does it make any difference that the total L2-cache will be larger on the 12-core ones?
MPI seems tricky... I guess I'll have to look into that subject.

Last fiddled with by lorgix on 2013-07-09 at 14:26 Reason: adding response to fivemack
