2013-10-12, 14:24  #1 
"Ed Hall"
Dec 2009
Adirondack Mtns
3,617 Posts 
Poly Search vs Sieving times
Sorry for the neophyte question, but for smaller composites ~150 digits across multiple machines, is there a real gain in overall sieving if a poly with a greater than expected E score is found early in the search?
Specifics:
Code:
Msieve v. 1.52 (SVN 886)
random seeds: 55df12b7 e5d3c816
factoring 218876969782929216599190531580538390484044829345681891460187089862281796240529496476113446316632558113449224677358919720113328517497603198421529 (144 digits)
searching for 15-digit factors
commencing number field sieve (144-digit input)
commencing number field sieve polynomial selection
polynomial degree: 5
max stage 1 norm: 8.34e+21
max stage 2 norm: 1.39e+20
min E-value: 1.03e-11
poly select deadline: 311905
time limit set to 86.64 CPU-hours
expecting poly E from 1.12e-11 to > 1.29e-11
The following poly was found within the first few minutes:
Code:
# norm 8.158791e-14 alpha -8.748871 e 1.309e-11 rroots 5
skew: 52981938.39
c0: 5925697648913135366967245004294844179177
c1: 468536815244189902014756820336233
c2: 26076232373209850831435621
c3: 542019174557361769
c4: 10239243052
c5: 48
Y0: 21467961164039091417408999106
Y1: 1118175346469539
Would a better poly outweigh the 11 hours of search time left for a 48-60 hour project? Thanks...
Last fiddled with by EdH on 2013-10-12 at 14:26 Reason: added info 
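As a rough sanity check on the question, the break-even point can be computed directly. This is a sketch using the figures quoted in the post; the linear "time saved = speedup × sieve time" model is my own simplification, not anything msieve reports:

```python
# Back-of-envelope: how much faster would a new poly have to make the
# sieving in order to repay the remaining poly-search time?
# Figures from the post; the linear model is a simplification.
search_hours_left = 11.0   # poly-search time remaining
sieve_hours = 55.0         # middle of the 48-60 hour sieve estimate

# A new poly must speed up sieving by at least this fraction to pay
# for the remaining search time:
break_even = search_hours_left / sieve_hours
print(f"break-even speedup: {break_even:.1%}")  # 20.0%
```

On these numbers a new poly would need to sieve about 20% faster just to break even, which argues for stopping the search early.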
2013-10-12, 17:38  #2 
"Curtis"
Feb 2005
Riverside, CA
3^{3}·173 Posts 
If the "expected score" is accurate at this size, you should not search further. You may well find a poly that's 5% faster by completing the search, but it's pretty clear that 5% of sieve time is greater than the poly-search time.
However, the Expected Score code is simply guesswork; without previous experience about its accuracy, you don't know if you might improve by 10% or more. I'd compromise and search a couple more hours; if nothing comes close in that time, I'd assume I got lucky at the outset and proceed to sieving.

A5 coefficients below 10000 should not be searched; msieve takes longer at very low A5 values to do the same work. Also note your skew is 50M, very high for this size of number; this occasionally causes hiccups in later stages (I have not run into such a hiccup yet, but I follow Frmky's advice to use large A5, say 5-50M for this size of number). Large A5 coefficients have lower skews than very small values. 
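The first point above can be checked against the log in the opening post: the poly found in the first few minutes already beats the top of msieve's expected range. A small sketch using the quoted E-values:

```python
# Compare the found poly's score to msieve's expected range
# (E-values taken from the log in the first post).
e_found = 1.309e-11
e_min_expected = 1.12e-11
e_max_expected = 1.29e-11

over_min = e_found / e_min_expected - 1   # roughly +17%
over_max = e_found / e_max_expected - 1   # roughly +1.5%
print(over_min, over_max)
```

Since the found score already exceeds even the "expecting up to" bound, further search is only worthwhile if the expected-score estimate itself is unreliable.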
2013-10-12, 18:32  #3 
"Ed Hall"
Dec 2009
Adirondack Mtns
3,617 Posts 
Thanks!
My impatience got me to break in at about 2.5 hours of searching, and that was the poly chosen; no others were close. My first set of relations came in at 2.2% of the estimated minimum (32,523,338) in about 1.2 hours. This calculates to (very) roughly 55 hours for sieving, although a couple of the machines will be off at night. I guess I'll see how it turns out... 
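The extrapolation here is simple to reproduce. A sketch assuming a constant relation-finding rate, which real sieving only approximates as special-q grows:

```python
# Project total sieving time from early progress (figures from the post).
frac_done = 0.022      # 2.2% of the estimated minimum relations
hours_elapsed = 1.2

total_hours = hours_elapsed / frac_done
print(round(total_hours))  # 55, matching the rough estimate in the post
```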
2013-10-12, 19:44  #4 
"Ed Hall"
Dec 2009
Adirondack Mtns
3,617 Posts 
Wow! I have a new estimate for sieving time of ~26 hours. I guess the first estimate hadn't received relations from several of the machines. I'll have to see how it works out in reality.
The "Wow!" part is that I remember taking over two weeks to factor c100s... 
2013-10-13, 14:14  #5 
"Ed Hall"
Dec 2009
Adirondack Mtns
3,617 Posts 
There I was!
closing in on a matrix, when the power company lost control, and my scripts weren't robust enough to carry through a restart, so I had to resort to semi-manually* completing the job. So much for an accurate timing for this composite factorization...

Oh well, if anything good has come of it, it got me off my a** to place the main factoring machine on a UPS that was already sitting there, waiting to be put into use...

*semi-manually meaning a manual restart of factmsieve.py and manually concatenating all the relations from those machines that are still running... 
2013-10-13, 18:05  #6 
"Ed Hall"
Dec 2009
Adirondack Mtns
3,617 Posts 
Well, that didn't work.
Now I'm lost! factmsieve.py wouldn't restart; it kept giving an error 255. Renaming number.dat.cyc cleared the factmsieve.py error, but number.dat was trashed. I rebuilt that, but factmsieve.py wouldn't go anywhere, so I have moved to direct msieve entry. Now all I get are -11 errors for the entire 4G number.dat file. Currently, I'm retrieving the rels from the spairs.save.gz file to see if I can get anywhere from there.
Last fiddled with by EdH on 2013-10-13 at 18:08 Reason: removed erroneous info 
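One way to salvage usable relations from the compressed spool file is to copy over only the lines that look structurally sound. A minimal sketch: the filename matches the post, but the "a,b:...:..." shape check is a crude stand-in for msieve's real relation parser, and `salvage` is a hypothetical helper name:

```python
import gzip

# Salvage relation lines from a compressed spool file and drop obviously
# truncated or corrupt ones. GGNFS/msieve relations look roughly like
# "a,b:primes:primes"; this check is a simplification, not msieve's parser.
def salvage(src="spairs.save.gz", dst="number.dat"):
    kept = dropped = 0
    with gzip.open(src, "rt", errors="replace") as fin, open(dst, "w") as fout:
        for line in fin:
            parts = line.strip().split(":")
            if len(parts) == 3 and "," in parts[0]:
                fout.write(line)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Anything this filter drops would otherwise show up as relation read errors when msieve rescans the .dat file.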
2013-10-13, 19:45  #7 
I moo ablest echo power!
May 2013
6CD_{16} Posts 
One thing to try is restoring the relations found using the spairs.gz file like you're doing, deleting all the .cyc files and such, and then rerunning using the "-nc" flag. That should take care of it, if I understand your error correctly. I've had to do that before when my .dat file disappeared.
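That recipe (delete the intermediate files, rerun filtering) could be scripted roughly as below. This is a sketch only: the wildcard standing in for "the .cyc and such", the exact msieve invocation, and the helper name `restart_filtering` are assumptions to check against your local setup:

```python
import glob
import os
import subprocess

def restart_filtering(name="number", run_msieve=True):
    # Remove msieve's intermediate files (number.dat.cyc and friends)
    # while leaving the relation file number.dat itself alone.
    for path in glob.glob(f"{name}.dat.*"):
        os.remove(path)
    if run_msieve:
        # -s selects the savefile, -nc reruns the NFS postprocessing
        # (filtering/linalg/sqrt), -v is verbose; exact flag behavior
        # may differ between msieve builds.
        subprocess.run(["msieve", "-s", f"{name}.dat", "-nc", "-v"],
                       check=True)
```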

2013-10-13, 20:07  #8  
"Ed Hall"
Dec 2009
Adirondack Mtns
3,617 Posts 
Quote:
Code:
matrix needs more columns than rows; try adding 23% more relations


2013-10-14, 15:56  #9 
"Ed Hall"
Dec 2009
Adirondack Mtns
3,617 Posts 
The matrix finally built properly for a solve...

2013-10-14, 16:00  #10 
I moo ablest echo power!
May 2013
1,741 Posts 
I have that issue sometimes as well; I've gotten to 120% of the estimated relations before the matrix builds, even when using a large interval step. Maybe somebody better versed in this can suggest a cause?

2013-10-14, 20:00  #11 
"Curtis"
Feb 2005
Riverside, CA
1001000111111_{2} Posts 
The estimates for # of relations in the factmsieve script are not very accurate; it uses an exponential with the size of the number as input, while the actual situation is more like a step function, with the steps located where lpbr jumps bits. Serge (Batalov) posted a patch to fix this in some thread; I'll see if I can find and link it.
Actual rels needed is roughly 21M for 28-bit projects, 40M for 29-bit projects, and (I am told) the low 80s for 30-bit projects. Some polys produce more or fewer duplicate relations, and sometimes a matrix builds with fewer-than-typical relations, so there is a fair amount of variety from project to project.

My vague grasp of the "hiccup" mentioned above is that skew alters the area sieved for each special-Q, with higher skews associated with smaller areas. Very high skews may require more special-Q to be searched, and that requirement can lead to using special-Q higher than a poly's efficient sieve range. So it's not that reaching 120% of the script's expected rels is a hiccup; it's that the last few rels might be found at half the rate (or worse?), in sec/rel, that you achieved during most of the sieving. If you read through threads of large forum team factorizations, you'll see discussions amounting to "we've run out of good Q, now what?". High-skew polys run out of good Q more often than low-skew polys.

At single-user-size projects, we can preempt this problem by choosing the next-higher siever version for a project we're even slightly nervous about. We might choose to use 14e instead of 13e for projects a few digits lower than the script's cutoff; in fact, I edited my factmsieve code to shift the cutoff a bit lower. As Mr Womack points out, this makes factorizations in the 135- to 150-digit range more fire-and-forget, at the expense of a few hours of sieving.

Curtis

Last fiddled with by VBCurtis on 2013-10-14 at 20:02 Reason: added info in first paragraph 
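The step-function point can be illustrated concretely. In this sketch, the 21M/40M/~82M targets are the rough figures from the post above (taking "low 80s" as 82M for illustration), while the digit cutoffs at which lpbr jumps are invented for the example, not factmsieve's actual table:

```python
# Relations needed as a step function of input size: flat within an
# lpbr band, jumping where the large-prime bound gains a bit.
# Targets are rough forum figures; digit cutoffs are illustrative guesses.
REL_TARGETS = {28: 21_000_000, 29: 40_000_000, 30: 82_000_000}

def lpbr_bits(digits):
    # Hypothetical band boundaries -- real parameter choices differ.
    if digits <= 140:
        return 28
    elif digits <= 155:
        return 29
    return 30

def rels_needed(digits):
    return REL_TARGETS[lpbr_bits(digits)]
```

Within a band the requirement is flat (under these guesses, a 144-digit and a 150-digit job both want ~40M relations), while a smooth exponential fit over- or under-shoots near the jumps, which is one reason the script's progress percentage can sail past 100%.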