mersenneforum.org A couple of questions about polynomial selection
 Register FAQ Search Today's Posts Mark Forums Read

 2010-11-20, 01:54 #1 aifbowman   Nov 2010 2·7 Posts A couple of questions about polynomial selection Hi all, Sorry if the answers to these questions are very simple, I am new to msieve and still trying to get to grips with it. I am currently trying to factorize a 160 digit key. I ran msieve's polynomial selection for a couple of days and then started sieving with the best key from the limited number generated. My first question is whether if I find a better polynomial (I am still searching as that part runs on my GPU so doesnt slow down the sieving at all), can I continue the sieving with this better one or would I have to disgard all the relations found with the old one and start from scratch again? My second questions is, with msieve's polynomials, what refers to the Murphy score? Is it the 'e' field e.g. 'e 1.075e-012', and is the best polynomial the one with the highest value for this, Any help would be much appreciated, Anthony
 2010-11-20, 04:26 #2 Batalov     "Serge" Mar 2008 San Diego, Calif. 1039810 Posts No. Yes.
 2010-11-20, 11:56 #3 jasonp Tribal Bullet     Oct 2004 67728 Posts This extra credit assignment from Imperial College London really has the class scrambling :) The Murphy score is the probability that a random value of the NFS polynomials, drawn from somewhere in the sieving region, will have all of its factors smaller than some bound. Larger 'e' score is supposed to be better, since a larger score implies that sieving will find more relations. It's tempting to assume that sieving will run X% faster if using a polynomial whose e-value is X% better, but that doesn't appear to happen when there are large differences in e-value. Once you decide on a polynomial pair, everything else is specific to that pair, so the amount of effort to spend on polynomial selection is a judgment call that depends on how long you anticipate the sieving to take, whether poly selection has gotten lucky yet, and how many machines you have that will be idle until poly selection is done. The cutoff point is particularly tricky when doing poly selection using a graphics card, because most of these individual efforts have several machines but only one graphics card, and I have no idea whether I've chosen the best parameters for that one card to use.
 2010-11-20, 14:34 #4 aifbowman   Nov 2010 2×7 Posts Haha I thought I was the only one scrambling! Have others come to this forum looking for help too? Thanks for the help Jason, so you are saying that if I did find a better polynomial then I would have to start from scratch again building relations? This is going to be a tough one to call because I started sieving with a pretty awful poly (e value 9.45e-13) and I am starting to find some now in the 1/2e-12 region. I am 10% in to the sieving. What I am most worried about is whether such a bad poly could cause the factorization to fail- I have access to enough computing power for the fact that it will take longer not to matter so much, but could it cause the factorization to fail entirely? Again, thanks for your help, I've put far too much time into this to screw it all up now...
2010-11-20, 14:36   #5
TimSorbet
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

10000101101112 Posts

Quote:
 Originally Posted by aifbowman Haha I thought I was the only one scrambling! Have others come to this forum looking for help too?
Just that one, as far as I'm aware. It'd be worth it for you to read that thread as well. I still think the only realistic chance is for the students to collaborate, if they're allowed to. Unless the students could make use of many school computers, either alone or working together.

Last fiddled with by TimSorbet on 2010-11-20 at 14:40

2010-11-20, 14:39   #6
axn

Jun 2003

156216 Posts

Quote:
 Originally Posted by aifbowman I am 10% in to the sieving. What I am most worried about is whether such a bad poly could cause the factorization to fail-
No. Just will take longer. Do you know how to estimate how many relations to collect? If you do, even now, you can test if the newer better poly can end up finishing the factorization quicker.

 2010-11-20, 14:55 #7 aifbowman   Nov 2010 168 Posts Thanks mini-geek, I've read through the other thread and I'm not too worried about the competition I have to say :p. I am making use of multiple school computers (i7-920s with 8GB RAM) for the sieving, whilst using the Quadro 290 GFX cards on the same machines for the polynomial selection. @axn: I don't know how to myself, but when I run the factmsieve.py script it provides me with a miniumum number of relations. I guess what I could do would be to start sieving with the new poly, compare the rate of relation collection with that from the old poly and then just do some brief calculations to see whether the old poly with the headstart or the new one would finish first. Can anyone tell me how much time I need to allow for the matrix, square root etc steps (post-processing?) on one of the machines I mentioned above with this 160 digit key? As that will be what tells me whether I have a fighting chance of finishing in time.
2010-11-20, 15:08   #8
TimSorbet
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

11×389 Posts

Quote:
 Originally Posted by aifbowman I am making use of multiple school computers (i7-920s with 8GB RAM) for the sieving, whilst using the Quadro 290 GFX cards on the same machines for the polynomial selection. ... Can anyone tell me how much time I need to allow for the matrix, square root etc steps (post-processing?) on one of the machines I mentioned above with this 160 digit key? As that will be what tells me whether I have a fighting chance of finishing in time.
This c161 factorization I mentioned had the post-processing done on four cores. Looks like it took roughly 2 hours for the preliminary work, then about 100 hours for the matrix solving, then about 2 hours for each square root. Each square root has, independently, about a 50/50 chance of producing the factors. So you're looking at about 4 days. Of course, not all quad cores are created equally, but it's a good enough estimate. Using 8 threads might make it a little faster, but since it's still just four physical cores, it'll be close.

 2010-11-20, 15:22 #9 aifbowman   Nov 2010 2·7 Posts Thanks again mini-geek. You don't happen to know what quad core exactly was used for the c161? I looked through the thread but couldn't find any mention of it. The quads in the PCs I am using have hyper-threading for 8 virtual cores so as you say that could speed things up a little bit. Is msieve's code optimized for multiple cores by default or do I need to change anything? All research I've done into this has said so but can I just confirm that there is no way of splitting the post processing between multiple computers?
2010-11-20, 15:29   #10
TimSorbet
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

11×389 Posts

Quote:
 Originally Posted by aifbowman Thanks again mini-geek. You don't happen to know what quad core exactly was used for the c161? I looked through the thread but couldn't find any mention of it.
No, I don't. If it was important, you could ask fivemack, who did it, but it really doesn't make too much of a difference. Probably plus or minus a day at the most.
Quote:
 Originally Posted by aifbowman The quads in the PCs I am using have hyper-threading for 8 virtual cores so as you say that could speed things up a little bit. Is msieve's code optimized for multiple cores by default or do I need to change anything?
Use the -t argument to set the number of threads it should use. From msieve -h:
Code:
-t <num>  use at most <num> threads
Quote:
 Originally Posted by aifbowman All research I've done into this has said so but can I just confirm that there is no way of splitting the post processing between multiple computers?
Right, at least not with normal programs and methods.
Relatedly, the square root step could actually be run on multiple cores and/or computers, but msieve normally just runs it on one core (regardless of the -t setting). If you're short on time, you could watch for when the linear algebra completes, then start three more square root instances so you'd be almost certain to have the factors within two hours. (it's like flipping a coin: flip one and you have a 1/2 chance of heads, flip four at once and you have a 15/16 chance of at least one heads) You'd want to use this option to specify that each instance should try different dependency numbers:
Code:
-nc3 [X,Y] perform only NFS square root (compute
dependency numbers X through Y, 1<=X,Y<=64)

Last fiddled with by TimSorbet on 2010-11-20 at 15:35

 2010-11-20, 15:43 #11 jasonp Tribal Bullet     Oct 2004 2·1,789 Posts If you're 10% into the sieving with a bad poly and found one whose e-value is 10x better, I'd expect the sieving with the latter poly to finish ~20-30% faster. The correct way to test this out is to do about 1/1000 of the sieving using the new poly with the same parameters, then estimate the relation rate and see if the total time is larger than 90% of your current estimated total time. The postprocessing can be split across multiple computers but not easily. Latter-day versions of msieve can distribute the matrix across a tightly coupled parallel computer using MPI, but you have to have MPI installed and the machines need a fast interconnect for it to be worthwhile (gigabit ethernet at a minimum). Even then the runtime for a matrix of this size is near the lower end of the range where a speedup could be expected from using, say, 4 machines in parallel. It won't be 4x faster, maybe 30% faster if you're lucky. Plus the code is a little experimental even though it has been used for huge problems. I mentioned the scrambling because other forums are also fielding questions about 160-digit factorizations. I've also fielded one request through sourceforge with the same question :) Last fiddled with by jasonp on 2010-11-20 at 15:53

 Similar Threads Thread Thread Starter Forum Replies Last Post evanmiyakawa Information & Answers 4 2017-11-07 01:37 theshark Information & Answers 21 2014-08-30 17:36 fivemack Factoring 47 2009-06-16 00:24 CRGreathouse Factoring 2 2009-05-25 07:55 Optics Information & Answers 8 2009-04-25 18:23

All times are UTC. The time now is 10:21.

Sat Sep 23 10:21:33 UTC 2023 up 10 days, 8:03, 0 users, load averages: 1.06, 1.42, 1.42