20140302, 04:17  #1 
Aug 2006
3^{2}×5×7×19 Posts 
Estimating time needed for GNFS
I've written a page for my website which estimates, for a given number of digits and processor, how long it would take to run GNFS on that number. This is intended as a resource for people entirely unfamiliar with practical factorization, not for the experts here who will have wellhoned intuition.
Still, even with that relatively low goal, I'd like to tweak my tool before posting it before the world. Here is my simpleminded methodology:
For the first point, I took the 6 data points from GNFS Factoring Statistics of RSA100, 110, ..., 150. This means a Pentium 4 taking 15 hours for a C100, 53 hours for a C110, 219 hours for a C120, 643 hours for a C130, 1868 hours for a C140, and 5721 hours for a C150. Any ideas on how I could do this better? 
20140302, 07:46  #2 
Sep 2009
1111010001_{2} Posts 
The two first points of your methodology are probably reasonable, but I wonder whether generic benchmarks are a reliable base for scaling ? IOW, I wonder whether they test the exact combination of operations used for NFS, and how well they account for microarchitecture and memory bandwidth variations which affect NFS run time.
Having started integer factoring in 2009, I'm unfit for commenting on the 15 hours figure for a C100 on a P4. On 20112013 computers, such numbers take much less time, that's for sure Other data points: * in 2009, factoring a 512bit RSA key took an estimated 73 wall clock days on a thenold dualcore Athlon 64 (Benjamin Moody, triggering the "TI signing key controversy"); * in 2013, sieving another 512bit RSA key took 34 wall clock days on up to 42 cores of varying power, followed by less than 24 wall clock hours of postprocessing on 4 threads. 512bit RSA keys (GNFS 154155) are about twice harder than GNFS 150 tasks. 
20140302, 15:29  #3  
Aug 2006
3^{2}·5·7·19 Posts 
Quote:
Quote:
Quote:
It's hard to say much about this without knowing the types of machines used (or exact time). Call it 151 coredays, making the average core just as powerful as Moody's (but faster by having more). 

20140302, 16:03  #4  
Tribal Bullet
Oct 2004
3538_{10} Posts 
Quote:
Also, Kleinjung's announcement of improvements to GNFS polynomial selection in late 2008 made the sieving phase meaningfully faster, though it would be less than a factor of two. Edit: I believe the Aoki et. al. paper reported results using their own lattice siever, whereas all the results we see here use variants of the Franke/Kleinjung siever. That could explain the difference too; the siever we use incorporates cache management techniques that had not been reported in 2004. Last fiddled with by jasonp on 20140302 at 16:05 

20140302, 17:01  #5 
Aug 2006
3^{2}×5×7×19 Posts 
Great. So if I understand correctly, it sounds like the processor scaling seems decent enough but that all times should be reduced by a factor of ~2 because of sieving and postprocessing improvements. (Thanks for your part in that, by the way!)
To test the extrapolation I looked at RSA768. They report 1500 coreyears of Opterons at 2.2 GHz, which should be 375 processoryears on quadcore Opterons. My calculator reports 632.5 years, which is also close to the ~2 scaling you suggest for modern factorizations. (One wrinkle: they oversieved by a factor of 2, so perhaps this means they are actually 4 times faster rather than 2. But I'll leave this asis since such oversieving might still be needed.) Does anyone have some recent (last few years) timings on GNFS attempts I can use for calibration? Last fiddled with by CRGreathouse on 20140302 at 17:02 
20140302, 22:16  #6 
"Curtis"
Feb 2005
Riverside, CA
4789_{10} Posts 
Laptop i7quad (2600QM) @ 1.7Ghz (underclocked for heat management):
GNFS 108 took 0.6 threadhours to poly select, 10 threadhours to sieve, 1 threadhour for matrix. GNFS 116 took 2 GPUhours to poly select, 42 threadhours sieving, 2 threadhours matrix. GNFS 133 took 18 GPUhours poly select, 230 threadhours to sieve, 17 threadhours for matrix. GNFS 147 took 2 GPUdays poly select, 52 threaddays to sieve, 4 threaddays for matrix. The CPU had 8 threads running during all sieving phases; 4 on GGNFS, 4 on ECM or other tasks. Linear Algebra had 2 threads on msieve, 2 on ECM or other light tasks. Hyperthreaded sieving is about 2/3rds the speed per thread of one thread per core work, so 30% faster overall vs HT disabled. In other words, each thread gets about 1.1 Ghz worth of speed, but 8 * 1.1 > 4 * 1.7. Core2Quad @ 3.2Ghz, no HT: GNFS 139 took 24 GPUhrs poly select, 224 corehours sieving, 38 corehours matrix. 3 threads for sieve & matrix. 3 threads is too many for a Core2generation CPU for linalg see next entry. GNFS 143 took 24? GPUhrs poly select, 356 corehours sieving, 20 hours matrix. Sieve was 2 threads, matrix singlethreaded. Sievers are windows 64bit. Msieve versions varied over time, but most of these were before the recent 25% matrix speedup. Last fiddled with by VBCurtis on 20140302 at 22:24 Reason: i7 model number, sieve version 
20140302, 22:58  #7 
"Mr. Meeseeks"
Jan 2012
California, USA
3^{2}×241 Posts 
If you need any benchmarks, I have a Haswell 4670k available.

20140303, 23:13  #8 
Jun 2012
2^{3}·3·5^{3} Posts 
Have you looked here?
http://homepage2.nifty.com/m_kamada/math/graphs.htm I've got some personal data I will PM you tomorrow as well. 
20140303, 23:23  #9 
Jul 2003
So Cal
3^{2}·233 Posts 
It isn't too hard to extract meaningful data from NFS@Home. The time required for workunits doesn't vary too much for a given processor, so time based on the sieving range alone should be reasonably accurate.
Edit: See if this makes sense. These are the numbers with the e value in the poly file, and the sieve range is (q_end  q_start)/10^6. Code:
+++++++  name  difficulty  q_start  q_end  sieve_range  e_value  +++++++  2_1584P  168  45000000  305000000  260.0000  4.936e13   C162_126_83  162  20000000  140000000  120.0000  1.127e12   F2451  166  20000000  120000000  100.0000  6.331e13   F1953  166  20000000  130000000  110.0000  5.640e13   F1595  166  20000000  120000000  100.0000  5.919e13   L1354  163  20000000  115376000  95.3760  9.339e13   L1694  163  20000000  96000000  76.0000  9.137e13   L1911  167  20000000  120000000  100.0000  5.513e13   L2910  168  20000000  150000000  130.0000  3.942e13   L3335B  164  20000000  100000000  80.0000  7.898e13   L3945A  166  20000000  120000000  100.0000  5.890e13   L2975B  167  20000000  140000000  120.0000  5.068e13   C155_3408_1311  155  26000000  100000000  74.0000  3.230e012   C162_3408_1361  162  20000000  140000000  120.0000  1.164e012   C166_4788_5159  166  40000000  264000000  224.0000  5.314e013   C168_127_110  168  40000000  250000000  210.0000  4.765e013   C168_130_119  168  40000000  274000000  234.0000  5.320e013   C161_4788_5193  161  32000000  126000000  94.0000  1.406e012   C158_3366_2103  158  20000000  120000000  100.0000  1.950e012   C160_3408_1385  160  20000000  158000000  138.0000  1.333e012  +++++++ +++++++  name  difficulty  q_start  q_end  sieve_range  e_value  +++++++  G3p725  179.4  20000000  150000000  130.0000  1.047e13   C176_3270_677  176  15000000  160000000  145.0000  1.321e013   C178_3270_687  178  20000000  300000000  280.0000  9.700e14  +++++++ Last fiddled with by frmky on 20140303 at 23:59 
20140305, 00:02  #10 
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
4940_{10} Posts 
NFS@Home postprocessing logs. All SNFS.
Last fiddled with by pinhodecarlos on 20140305 at 00:02 
20140305, 00:06  #11 
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
2^{2}·5·13·19 Posts 
Excel sheet.

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Estimating time needed for GNFS  CRGreathouse  Factoring  0  20140302 04:18 
Predicting the needed time for high nvalues?  Rincewind  Sierpinski/Riesel Base 5  4  20090611 12:24 
GNFS poly search time limit oddity  Andi47  Msieve  13  20090225 12:33 
Discussion about CPU time needed and k's remaining  Siemelink  Conjectures 'R Us  41  20080711 23:05 
Running time for GNFS  koders333  Factoring  4  20060227 14:33 