mersenneforum.org call for volunteers: RSA768 polynomial selection

 2011-05-01, 02:21 #45 jasonp Tribal Bullet     Oct 2004 3×1,181 Posts
After spending 8 hours boiling down the first 223k stage 1 hits, the root sieve ran 37 times and the best polynomial found was
Code:
# norm 3.907218e-017 alpha -12.627485 e 4.061e-017 rroots 4
skew: 325308648.10
c0: 4541756129545864384960730407369094599176464560226285939040
c1: -5948272091293540014654958118926031775084977330112
c2: -401087617345026325354287460803504393810426
c3: 1769424544559462821421052544837
c4: 4150609116863184653468972
c5: -1225564956623636
c6: 4037328
Y0: -25940569391568647937086335710354295189
Y1: 7693840331322664974655
This polynomial was apparently pretty lucky; the stage 1 hit that caused it was not all that great, and the alpha is good but many other polynomials in its batch of 200 had better alpha scores. There are other stage 1 hits whose average size was 10x better than this one, yet they only led to polynomials about 75% as good as this one.
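For anyone who wants to sanity-check a posted polynomial pair, here is a minimal sketch. It assumes the usual convention that the rational polynomial is Y1*x + Y0, so the shared root is m = -Y0/Y1 mod N, and that the algebraic polynomial should vanish at m modulo N. The function name `common_root_residue` is made up for this illustration and is not part of msieve. N is the 232-digit input copied from the msieve.dat log quoted later in this thread. A residue of 0 is what one would expect if every coefficient survived copying intact; no output is claimed here.

```python
# Coefficients c0..c6 and Y0, Y1 are copied from the post above.
# N is copied from the msieve.dat log quoted later in the thread.
N = int(
    "1230186684530117755130494958384962720772853569595334792197"
    "3224521517264005072636575187452021997864693899564749427740"
    "6384592519255732630345373154826850791702612214291346167042"
    "9214311602221240479274737794080665351419597459856902143413"
)
C = [
    4541756129545864384960730407369094599176464560226285939040,
    -5948272091293540014654958118926031775084977330112,
    -401087617345026325354287460803504393810426,
    1769424544559462821421052544837,
    4150609116863184653468972,
    -1225564956623636,
    4037328,
]
Y0 = -25940569391568647937086335710354295189
Y1 = 7693840331322664974655


def common_root_residue(coeffs, y0, y1, n):
    """Return Y1^d * f(-Y0/Y1) mod n without doing any modular division.

    coeffs is [c0, c1, ..., cd]. Multiplying f(-Y0/Y1) through by Y1^d
    gives the integer sum of c_i * (-Y0)^i * Y1^(d-i).
    """
    d = len(coeffs) - 1
    total = sum(c * (-y0) ** i * y1 ** (d - i) for i, c in enumerate(coeffs))
    return total % n


if __name__ == "__main__":
    print("digits in N:", len(str(N)))
    print("residue mod N:", common_root_residue(C, Y0, Y1, N))
```

As a toy check of the idea: f(x) = x^2 - 2 and g(x) = x - 3 share the root 3 modulo 7, since f(3) = 7, so `common_root_residue([-2, 0, 1], -3, 1, 7)` gives 0.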
 2011-05-01, 09:01 #46 debrouxl     Sep 2009 2·3·163 Posts Result of two msieve -np1 runs over the A6=30000-40000 range (split in four pieces, one per core): http://www.sendspace.com/file/frmhq3 Last fiddled with by jasonp on 2011-05-01 at 15:36 Reason: got it, thanks!
2011-05-01, 15:57   #47
bdodson

Jun 2005
lehigh.edu

210 Posts

Quote:
 Originally Posted by Jeff Gilchrist ... I'm now running svn 564 and will let you know if I see any more crap. Jeff.
Code:
@Jason:  Since all of the knowledgeable users refer to msieve by the svn,
and those of us that consider ourselves lucky when we manage to
fiddle a Makefile into working have only the msieve version (as
reported in the logfiles;  cf. "1.48" for what turns out to be 1.47!),
your suggestion that a svn might be gotten into the logs would
seem to be more than welcome.  Guess nonstandard locations for
files/libraries in local cuda installations can't be helped;  I'm not
sure that I would have found which files/libraries to look for without
Greg's pointers.  This from a user not even knowing to run "make
clean" between compiles, sigh.
On the topic of the improvements in the cpu version of polynomial
selection, I've at last gotten semi-plausible data for a comparison
of cpu -vs- gpu performance in the current (well, as of
Code:
 375056 Apr 23 09:05 ../bdcuda/apr2011/msieve-trunk.tar.gz
) version of 1.49. Here's the end of two runs starting at 10M on
Tesla C2050 and last summer's 6-core xeon:
Code:
tail se1c.err  [GPU]

poly 10074540 7171023307356927459403 22273628206374002664383812138745346023
poly 10074540 8862086865295791577771 22273628240550335956422948239244521734

polynomial selection complete
...
elapsed time 140:21:28
---

tail sc1b.err  [CPU]

poly 10062780 8561705142057132921367 22277964484927135130221272565080923617
poly 10062780 6140093274358072507183 22277964432815171540404493590438608677

...
polynomial selection complete
...
elapsed time 118:37:12
So the GPU got most of an extra day, and got to search
12K further. Even so, there's no comparison as to the relative
number of the stage1 hits:
Code:
size:

52569 May  1 09:13 msieve.dat1c.m.gz  [GPU]
1138446 Apr 28 07:53 msieve.dat1b.m.gz  [CPU]
---

hit count (wc -l):

1728 msieve.dat1c.m  [GPU]
40339 msieve.dat1b.m  [CPU]
The info I'm missing is whether the search parameters are incomparable
also. That is, should we expect the GPU hits to be of higher quality,
with more productive large flares in stage 2? I'm not sure that I've
understood the tunings, either GPU or CPU, but I had/have the impression
we're not entirely sure what makes a productive stage 2 hit before
running stage 2; and the current stage 1 objective is primarily
quantitative --- more is better --- but is it the case that the CUDA
and CPU code is sufficiently divergent (like, entirely different searches!)
that we're not sure?

Thanks, Bruce

[PS -- There's a similarly lopsided comparison from 2M, but the GPU
appears to be working substantially harder to get past its 50K range
Code:
poly 2036040 6025784662266701164261 29075740624793558936492499123394378516
randomizing rational coefficient: using piece #1072 of 2000
p = 72.40 bits, sieve = 103.18 bits
coeff 2036328 specialq 531625787 - 532363288 p1 2283162 - 3424744 p2 3424745 - 5137117 ...
another 30% left to go. ]
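To put the two runs above on a common footing, here is a rough sketch that turns the reported hit counts (the `wc -l` figures) and elapsed times into stage 1 hits per hour. It says nothing about hit quality, which is exactly the open question in the post. The helper names are invented for this illustration.

```python
def hms_to_hours(text):
    """Convert an msieve-style 'HHH:MM:SS' elapsed time to hours."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours + minutes / 60 + seconds / 3600


def hits_per_hour(hits, elapsed):
    """Average number of stage 1 hits found per hour of wall-clock time."""
    return hits / hms_to_hours(elapsed)


# (hit count, elapsed time) as reported in the post above.
RUNS = {
    "GPU": (1728, "140:21:28"),
    "CPU": (40339, "118:37:12"),
}

if __name__ == "__main__":
    rates = {name: hits_per_hour(*run) for name, run in RUNS.items()}
    for name, rate in rates.items():
        print(f"{name}: {rate:.1f} hits/hour")
    print(f"CPU/GPU ratio: {rates['CPU'] / rates['GPU']:.1f}")
```

The ratio only compares counts; if the GPU and CPU searches use different size bounds, a per-hit quality measure would be needed before drawing conclusions.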

 2011-05-02, 01:20 #48 jasonp Tribal Bullet     Oct 2004 3·1,181 Posts
Bruce: sorry about the build confusion. 'make' only determines that files need rebuilding when their timestamps are older than those of their dependencies, but when you switch between GPU and CPU builds no files actually change. Yet each build configuration needs new functions compiled, which aren't available because make won't build the files containing them. There's no easy way around this, except to leave a note about which configuration was built last and then redo everything if the current build doesn't match.

I've committed changes in the repository that now make the library report the SVN rev at build time (it was embarrassingly easy).

Regarding direct comparison between CPU and GPU results, I expect the quality of the stage 1 hits found to be similar but the hits themselves will pretty much always be different. Beyond a certain point the CPU and GPU codebases are completely distinct and incompatible.

(Thanks for all the emailed results!)
2011-05-02, 07:51   #49
xilman
Bamboozled!

"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

2·5·1,103 Posts

Quote:
 Originally Posted by jasonp Bruce: sorry about the build confusion. 'make' only determines that files need rebuilding when their timestamps are older than those of their dependencies, but when you switch between GPU and CPU builds no files actually change. Yet each build configuration needs new functions compiled, which aren't available because make won't build the files containing them. There's no easy way around this, except to leave a note about which configuration was built last and then redo everything if the current build doesn't match. I've committed changes in the repository that now make the library report the SVN rev at build time (it was embarrassingly easy). Regarding direct comparison between CPU and GPU results, I expect the quality of the stage 1 hits found to be similar but the hits themselves will pretty much always be different. Beyond a certain point the CPU and GPU codebases are completely distinct and incompatible. (Thanks for all the emailed results!)
A standard trick is to have a make rule touch(1) an empty file and then use that file as a dependency for things which wouldn't otherwise be rebuilt.

Paul
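The stamp-file idea can be modelled outside of make. The following is only a sketch of the logic, not msieve's build system: `update_config_stamp` and `is_stale` are hypothetical helpers, and the file names are placeholders. The stamp is rewritten only when the requested configuration differs from the recorded one, and any object that lists the stamp as a dependency then looks out of date under make's usual "older than a dependency" rule.

```python
import os
import tempfile
from pathlib import Path


def update_config_stamp(stamp, config):
    """Rewrite the stamp only when the configuration changed.

    Returns True if the stamp was rewritten (its mtime moves forward).
    """
    previous = stamp.read_text() if stamp.exists() else None
    if previous == config:
        return False
    stamp.write_text(config)
    return True


def is_stale(target, deps):
    """make's rule: rebuild if the target is missing or older than a dependency."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime_ns
    return any(dep.stat().st_mtime_ns > target_mtime for dep in deps)


def demo():
    """Return staleness before, after a same-config build, and after a switch."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "stage1.c"        # placeholder source file
        obj = root / "stage1.o"        # placeholder object file
        stamp = root / "config.stamp"  # records the last configuration
        src.write_text("/* source */")
        update_config_stamp(stamp, "cpu")
        obj.write_text("object built for cpu")
        # Pin timestamps so the object is clearly newer than its inputs.
        os.utime(src, (1000, 1000))
        os.utime(stamp, (1000, 1000))
        os.utime(obj, (2000, 2000))

        before = is_stale(obj, [src, stamp])
        update_config_stamp(stamp, "cpu")   # same config: stamp untouched
        after_same = is_stale(obj, [src, stamp])
        update_config_stamp(stamp, "cuda")  # switch: stamp gets a fresh mtime
        after_switch = is_stale(obj, [src, stamp])
        return before, after_same, after_switch


if __name__ == "__main__":
    print(demo())
```

In a real Makefile the same effect would come from a rule that touches the stamp when the configuration changes, with the affected object files listing the stamp as a prerequisite.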

 2011-05-02, 16:30 #50 Jeff Gilchrist     Jun 2003 Ottawa, Canada 3×17×23 Posts
I tried running a CPU vs GPU comparison as well with the latest SVN version for a 50k range starting at 90M. The clock time is almost identical:

GPU: using GPU 1 (Tesla T10 Processor)
elapsed time 25:47:14
500 candidates found

CPU: using a Core i5 760
elapsed time 25:47:11
11456 candidates found

but the CPU version found many more candidates than the GPU, as Bruce noted as well.

Jeff.
2011-05-03, 00:21   #51
Christenson

Dec 2010
Monticello

703₁₆ Posts

Quote:
 Originally Posted by xilman A standard trick is to have a make rule touch(1) an empty file and then use that file as a dependency for things which wouldn't otherwise be rebuilt. Paul
Another standard trick is to put the different targets (CPU vs GPU) in different directories, or to avoid duplicating target filenames when the result is CPU- or GPU-specific.

Any of these saves lots of headaches.
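A sketch of the separate-directory variant, assuming a hypothetical layout in which every configuration gets its own object tree under `build/`. The `object_path` helper and the source path are invented for illustration and do not reflect msieve's actual tree.

```python
from pathlib import PurePosixPath


def object_path(source, config, build_root="build"):
    """Map a source file to a configuration-specific object file path."""
    relative_obj = PurePosixPath(source).with_suffix(".o")
    return str(PurePosixPath(build_root) / config / relative_obj)


if __name__ == "__main__":
    for config in ("cpu", "cuda"):
        print(config, "->", object_path("poly/stage1.c", config))
```

Because the CPU and CUDA objects never share a filename, switching configurations cannot leave a stale object from the other build in place; the cost is a full compile the first time each configuration is built.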

 2011-05-03, 01:45 #52 EdH     "Ed Hall" Dec 2009 Adirondack Mtns 2⁴×257 Posts
My WinXP machine finally finished. Here is the last portion of the console output:
Code:
...
poly 19980 2968638303049930050577 62838998718516738575151081938325245091
hashtable: 69701 entries, 2.94 MB
randomizing rational coefficient: using piece #10 of 28
p = 71.29 bits, sieve = 99.29 bits
coeff 19992 specialq 40599436 - 43001461 other 6692872 - 10039308
aprogs: 73391 entries, 250128 roots
poly 19992 3238439034368067505753 62832710731168887395228500309210239036
poly 19992 3082975092645704142463 62832710731248822203078702831587306354
poly 19992 2651816995017612624199 62832710731270189884422676383418308007
poly 19992 1985319223958537117593 62832710731222108494678183887586969840
poly 19992 2516076988091677444769 62832710731222082495429148535850154568
poly 19992 2513596494179902216577 62832710731214931879037865805099137850
poly 19992 2457110827088192334361 62832710731273445066184943118839111160
poly 19992 2872106930170073406641 62832710731234550111795499072857298856
poly 19992 3944238636256922892331 62832710731250080712625078424697415183
poly 19992 2261212715929960130711 62832710731197625224070064017978900785
poly 19992 2284738031862373963549 62832710731227270123936293146376584561
poly 19992 2345977517038137821491 62832710731187318437111697969703805125
poly 19992 2549237453966821479997 62832710731246437169267437828327302778
poly 19992 2114353419000919534807 62832710731224345779261331410979374822
poly 19992 3123765159946289681117 62832710731214853097636578554113429552
poly 19992 2231586572969713763471 62832710731228683166839543430880504652
poly 19992 3533273137430754384241 62832710731159785542872252562852947326
poly 19992 2206324822935024438781 62832710731201999478899054797320926138
poly 19992 2708624939081239032733 62832710731211868320849932996560789821
poly 19992 3358130330888872412971 62832710731219982068270676389897303619
hashtable: 161178 entries, 5.38 MB
polynomial selection complete
error generating or reading NFS polynomials
elapsed time 263:13:36
The msieve.dat file says:
Code:
Thu Apr 21 17:14:12 2011
Thu Apr 21 17:14:12 2011
Thu Apr 21 17:14:12 2011 Msieve v. 1.48
Thu Apr 21 17:14:12 2011 random seeds: b9cd5ef0 56d2f1c8
Thu Apr 21 17:14:12 2011 factoring 1230186684530117755130494958384962720772853569595334792197322452151726400507263657518745202199786469389956474942774063845925192557326303453731548268507917026122142913461670429214311602221240479274737794080665351419597459856902143413 (232 digits)
Thu Apr 21 17:14:21 2011 searching for 15-digit factors
Thu Apr 21 17:14:24 2011 commencing number field sieve (232-digit input)
Thu Apr 21 17:14:24 2011 commencing number field sieve polynomial selection
Thu Apr 21 17:14:24 2011 searching leading coefficients from 10000 to 20000
Mon May 02 16:27:48 2011 polynomial selection complete
Mon May 02 16:27:48 2011 elapsed time 263:13:36
And, the msieve.dat.m is at sendspace file link.
 2011-05-03, 01:59 #53 jasonp Tribal Bullet     Oct 2004 3·1,181 Posts Thanks to everyone who is pelting me with hits, they're really pouring in now. Christenson, the problem is not that make builds a bad binary, but that object files that need the CPU functions are not rendered stale when 'make' is invoked with 'CUDA=1'. I instead have to build a new object file that needs the GPU functions, which are then made available at link time.
 2011-05-03, 02:11 #54 Batalov     "Serge" Mar 2008 Phi(4,2^7658614+1)/2 2·3·7·229 Posts I submit a heretic notion that sixth-degree GNFS polys can outcompete fifth-degree ones at sizes lower than usually thought. Possibly even for the c197 input. So this thread is relevant to more than just RSA768...
 2011-05-03, 02:15 #55 jasonp Tribal Bullet     Oct 2004 DD7₁₆ Posts Well, I did try to run an early version of the degree-6 code on RSA200, but the 5th-degree poly used for the factorization absolutely destroyed it in throughput.
