#45 |
Tribal Bullet
Oct 2004
3²·5·79 Posts |
After spending 8 hours boiling down the first 223k stage 1 hits, the root sieve ran 37 times and the best polynomial found was
Code:
# norm 3.907218e-017 alpha -12.627485 e 4.061e-017 rroots 4
skew: 325308648.10
c0: 4541756129545864384960730407369094599176464560226285939040
c1: -5948272091293540014654958118926031775084977330112
c2: -401087617345026325354287460803504393810426
c3: 1769424544559462821421052544837
c4: 4150609116863184653468972
c5: -1225564956623636
c6: 4037328
Y0: -25940569391568647937086335710354295189
Y1: 7693840331322664974655
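A polynomial pair like the one above can be sanity-checked by confirming that the degree-6 polynomial and the linear polynomial Y1*x + Y0 share a root modulo the number being factored. The sketch below is only an illustration in Python, not part of msieve. It assumes that the input N is the 232-digit number quoted in a later post of this thread, that gcd(Y1, N) = 1, and the helper name `shares_root_mod_n` is made up for this example.

```python
def shares_root_mod_n(coeffs, y0, y1, n):
    """Return True if f and g(x) = y1*x + y0 share a root modulo n.

    coeffs[i] is the coefficient of x**i in f. The common root would be
    m = -y0/y1 (mod n); clearing denominators gives the homogeneous test
    sum(c_i * (-y0)**i * y1**(d - i)) == 0 (mod n).
    """
    d = len(coeffs) - 1
    total = sum(c * (-y0) ** i * y1 ** (d - i) for i, c in enumerate(coeffs))
    return total % n == 0


# Toy case: 347 = 3*10**2 + 4*10 + 7, so f = 3x^2 + 4x + 7 and g = x - 10
# share the root m = 10 modulo 347.
print(shares_root_mod_n([7, 4, 3], -10, 1, 347))

# Coefficients c0..c6, Y0 and Y1 copied from the post above.
COEFFS = [
    4541756129545864384960730407369094599176464560226285939040,
    -5948272091293540014654958118926031775084977330112,
    -401087617345026325354287460803504393810426,
    1769424544559462821421052544837,
    4150609116863184653468972,
    -1225564956623636,
    4037328,
]
Y0 = -25940569391568647937086335710354295189
Y1 = 7693840331322664974655

# Assumption: N is the 232-digit input from the msieve log later in the thread.
N = 1230186684530117755130494958384962720772853569595334792197322452151726400507263657518745202199786469389956474942774063845925192557326303453731548268507917026122142913461670429214311602221240479274737794080665351419597459856902143413

print(shares_root_mod_n(COEFFS, Y0, Y1, N))
```

If the second check prints False, the likeliest cause is a digit lost in copying rather than a bad polynomial, so it is worth re-checking the pasted values first.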
#46 |
Sep 2009
11×89 Posts |
Result of two msieve -np1 runs over the A6=30000-40000 range (split in four pieces, one per core): http://www.sendspace.com/file/frmhq3
Last fiddled with by jasonp on 2011-05-01 at 15:36 Reason: got it, thanks! |
#47 | |
Jun 2005
lehigh.edu
1024₁₀ Posts |
Quote:
Code:
@Jason: Since all of the knowledgeable users refer to msieve by the svn, and those of us that consider ourselves lucky when we manage to fiddle a Makefile into working have only the msieve version (as reported in the logfiles; cf. "1.48" for what turns out to be 1.47!), your suggestion that a svn might be gotten into the logs would seem to be more than welcome. Guess nonstandard locations for files/libraries in local cuda installations can't be helped; I'm not sure that I would have found which files/libraries to look for without Greg's pointers. This from a user not even knowing to run "make clean" between compiles, sigh.
selection, I've at last gotten semi-plausible data for a comparison of cpu -vs- gpu performance in the current (well, as of
Code:
375056 Apr 23 09:05 ../bdcuda/apr2011/msieve-trunk.tar.gz
Tesla C2050 and last summer's 6-core xeon:
Code:
tail se1c.err [GPU]
poly 10074540 7171023307356927459403 22273628206374002664383812138745346023
poly 10074540 8862086865295791577771 22273628240550335956422948239244521734
received signal 15; shutting down
polynomial selection complete
...
elapsed time 140:21:28
---
tail sc1b.err [CPU]
poly 10062780 8561705142057132921367 22277964484927135130221272565080923617
poly 10062780 6140093274358072507183 22277964432815171540404493590438608677
received signal 15; shutting down
...
polynomial selection complete
...
elapsed time 118:37:12
12K further. Even so, there's no comparison as to the relative number of the stage1 hits:
Code:
size:
52569 May 1 09:13 msieve.dat1c.m.gz [GPU]
1138446 Apr 28 07:53 msieve.dat1b.m.gz [CPU]
---
hit count (wc -l):
1728 msieve.dat1c.m [GPU]
40339 msieve.dat1b.m [CPU]
also. That is, should we expect the GPU hits to be of higher quality, with more productive large flares in stage 2? I'm not sure that I've understood the tunings, either GPU or CPU, but I had/have the impression we're not entirely sure what makes a productive stage 2 hit before running stage 2; and the current stage 1 objective is primarily quantitative --- more is better --- but is it the case that the CUDA and CPU code is sufficiently divergent (like, entirely different searches!) that we're not sure?
Thanks,
Bruce
[PS -- There's a similarly lopsided comparison from 2M, but the GPU appears to be working substantially harder to get past its 50K range
Code:
poly 2036040 6025784662266701164261 29075740624793558936492499123394378516
randomizing rational coefficient: using piece #1072 of 2000
p = 72.40 bits, sieve = 103.18 bits
coeff 2036328 specialq 531625787 - 532363288 p1 2283162 - 3424744 p2 3424745 - 5137117
...
#48 |
Tribal Bullet
Oct 2004
3²·5·79 Posts |
Bruce: sorry about the build confusion. 'make' only determines that files need rebuilding when their timestamps are older than those of their dependencies, but when you switch between GPU and CPU builds no files actually change. Yet each build configuration needs new functions compiled, which aren't available because make won't build the files containing them. There's no easy way around this, except to leave a note about which configuration was built last and then redo everything if the current build doesn't match.
I've committed changes in the repository that now make the library report the SVN rev at build time (it was embarrassingly easy).
Regarding direct comparison between CPU and GPU results, I expect the quality of the stage 1 hits found to be similar but the hits themselves will pretty much always be different. Beyond a certain point the CPU and GPU codebases are completely distinct and incompatible.
(Thanks for all the emailed results!)
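The workaround described above, leaving a note about which configuration was built last and redoing everything on a mismatch, could be sketched roughly as follows. This is a hypothetical Python helper, not something shipped with msieve: the stamp file name `.last_build_config` and the configuration labels are invented for illustration, and the caller is assumed to run `make clean` whenever the function returns True.

```python
import tempfile
from pathlib import Path


def needs_clean(stamp: Path, config: str) -> bool:
    """Record `config` in the stamp file and report whether it changed.

    Returns True when no stamp exists yet or when the previously recorded
    configuration differs, i.e. when stale object files may be lying around.
    """
    previous = stamp.read_text().strip() if stamp.exists() else None
    stamp.write_text(config + "\n")
    return previous != config


# Demonstration in a throwaway directory so no real build tree is touched.
with tempfile.TemporaryDirectory() as tmp:
    stamp = Path(tmp) / ".last_build_config"  # hypothetical file name
    first = needs_clean(stamp, "CUDA=1")   # no stamp yet
    second = needs_clean(stamp, "CUDA=1")  # same configuration again
    third = needs_clean(stamp, "cpu")      # switched configuration
    print(first, second, third)
```

Treating a missing stamp as a mismatch is a deliberately cautious choice: an unknown build state costs one extra full rebuild, which is cheaper than chasing link errors from stale objects.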
#49 | |
Bamboozled!
"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across
2·3·29·67 Posts |
Quote:
Paul |
#50 |
Jun 2003
Ottawa, Canada
3·17·23 Posts |
I tried running a CPU vs GPU comparison as well with the latest SVN version for a 50k range starting at 90M. The clock time is almost identical:
GPU:
using GPU 1 (Tesla T10 Processor)
elapsed time 25:47:14
500 candidates found
CPU:
using a Core i5 760
elapsed time 25:47:11
11456 candidates found
but the CPU version found many more candidates than the GPU, as Bruce also noted.
Jeff.
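To put the two runs above on a common footing, the candidate counts can be turned into a rate per hour. The following is a small Python sketch using only the figures from this post; the function name `elapsed_hours` and the dictionary labels are made up here.

```python
def elapsed_hours(hms: str) -> float:
    """Convert an msieve-style 'HH:MM:SS' elapsed time into hours."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours + minutes / 60 + seconds / 3600


# (elapsed time, candidates found) as reported in the post above.
runs = {
    "GPU": ("25:47:14", 500),
    "CPU": ("25:47:11", 11456),
}

rates = {name: count / elapsed_hours(t) for name, (t, count) in runs.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} candidates/hour")

ratio = rates["CPU"] / rates["GPU"]
print(f"CPU/GPU rate ratio: {ratio:.1f}")
```

With near-identical clock times the ratio is essentially the ratio of the raw counts, so the CPU run comes out roughly 23 times ahead in candidates per hour. That says nothing about the quality of the candidates, which is the open question in this thread.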
#51 | |
Dec 2010
Monticello
1795₁₀ Posts |
Quote:
Any of these saves lots of headaches. |
#52 |
"Ed Hall"
Dec 2009
Adirondack Mtns
1010001111000₂ Posts |
My WinXP machine finally finished. Here is the last portion of the console output:
Code:
...
poly 19980 2968638303049930050577 62838998718516738575151081938325245091
hashtable: 69701 entries, 2.94 MB
randomizing rational coefficient: using piece #10 of 28
p = 71.29 bits, sieve = 99.29 bits
coeff 19992 specialq 40599436 - 43001461 other 6692872 - 10039308
aprogs: 73391 entries, 250128 roots
poly 19992 3238439034368067505753 62832710731168887395228500309210239036
poly 19992 3082975092645704142463 62832710731248822203078702831587306354
poly 19992 2651816995017612624199 62832710731270189884422676383418308007
poly 19992 1985319223958537117593 62832710731222108494678183887586969840
poly 19992 2516076988091677444769 62832710731222082495429148535850154568
poly 19992 2513596494179902216577 62832710731214931879037865805099137850
poly 19992 2457110827088192334361 62832710731273445066184943118839111160
poly 19992 2872106930170073406641 62832710731234550111795499072857298856
poly 19992 3944238636256922892331 62832710731250080712625078424697415183
poly 19992 2261212715929960130711 62832710731197625224070064017978900785
poly 19992 2284738031862373963549 62832710731227270123936293146376584561
poly 19992 2345977517038137821491 62832710731187318437111697969703805125
poly 19992 2549237453966821479997 62832710731246437169267437828327302778
poly 19992 2114353419000919534807 62832710731224345779261331410979374822
poly 19992 3123765159946289681117 62832710731214853097636578554113429552
poly 19992 2231586572969713763471 62832710731228683166839543430880504652
poly 19992 3533273137430754384241 62832710731159785542872252562852947326
poly 19992 2206324822935024438781 62832710731201999478899054797320926138
poly 19992 2708624939081239032733 62832710731211868320849932996560789821
poly 19992 3358130330888872412971 62832710731219982068270676389897303619
hashtable: 161178 entries, 5.38 MB
polynomial selection complete
error generating or reading NFS polynomials
elapsed time 263:13:36
Code:
Thu Apr 21 17:14:12 2011
Thu Apr 21 17:14:12 2011
Thu Apr 21 17:14:12 2011 Msieve v. 1.48
Thu Apr 21 17:14:12 2011 random seeds: b9cd5ef0 56d2f1c8
Thu Apr 21 17:14:12 2011 factoring 1230186684530117755130494958384962720772853569595334792197322452151726400507263657518745202199786469389956474942774063845925192557326303453731548268507917026122142913461670429214311602221240479274737794080665351419597459856902143413 (232 digits)
Thu Apr 21 17:14:21 2011 searching for 15-digit factors
Thu Apr 21 17:14:24 2011 commencing number field sieve (232-digit input)
Thu Apr 21 17:14:24 2011 commencing number field sieve polynomial selection
Thu Apr 21 17:14:24 2011 searching leading coefficients from 10000 to 20000
Mon May 02 16:27:48 2011 polynomial selection complete
Mon May 02 16:27:48 2011 elapsed time 263:13:36
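The reported elapsed time can be cross-checked against the first and last timestamps in the log. Here is a minimal Python sketch, assuming the log timestamps are local times with no daylight-saving change in between and that the interpreter runs with English day and month names:

```python
from datetime import datetime

# Timestamp layout used by the log lines above, e.g. "Thu Apr 21 17:14:12 2011".
FMT = "%a %b %d %H:%M:%S %Y"

start = datetime.strptime("Thu Apr 21 17:14:12 2011", FMT)
end = datetime.strptime("Mon May 02 16:27:48 2011", FMT)

# Express the difference as hours:minutes:seconds, with hours allowed to exceed 24.
total_seconds = int((end - start).total_seconds())
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)
elapsed = f"{hours}:{minutes:02d}:{seconds:02d}"

print(elapsed)  # prints 263:13:36, matching the log
```

The match suggests the run was continuous wall-clock time from launch to completion, i.e. just under eleven days for the 10000 to 20000 coefficient range on this machine.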
#53 |
Tribal Bullet
Oct 2004
DE3₁₆ Posts |
Thanks to everyone who is pelting me with hits, they're really pouring in now.
Christenson, the problem is not that make builds a bad binary, but that object files that need the CPU functions are not rendered stale when 'make' is invoked with 'CUDA=1'. I instead have to build a new object file that needs the GPU functions, which are then made available at link time. |
#54 |
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
89·113 Posts |
I submit the heretical notion that sixth-degree GNFS polys can outcompete fifth-degree ones at sizes lower than usually thought.
Possibly even for the c197 input. So this thread is relevant to more than just RSA768...
#55 |
Tribal Bullet
Oct 2004
3²×5×79 Posts |
Well, I did try to run an early version of the degree-6 code on RSA200, but the 5th-degree poly used for the factorization absolutely destroyed it in throughput.