2019-05-04, 14:55  #1 
Apr 2019
Bay Area, CA
11 Posts 
Where in the RSA-180 woods am I?
I'm new to factoring but am enjoying this forum a lot.
I bought a new Ryzen 2700X (16-thread) system and have been attacking RSA-180 for a couple of days now. As a benchmark, my system demolished RSA-100 in 9 minutes. Below is my factor log. Is there any way to tell how far along in the process I might be? At one point it hit a max threshold and switched strategies:

"error generating or reading NFS polynomials
elapsed time of 186454.8044 seconds exceeds 67500 second deadline; poly select done
nfs: commencing algebraic side lattice sieving over range: 41300000 - 41320000
nfs: commencing algebraic side lattice sieving over range: 41260000 - 41280000"

Code:
********************************** Factor.log *******************************
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, ****************************
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, Starting factorization of 191147927718986609689229466631454649812986246276667354864188503638807260703436799058776201365135161278134258296128109200046702912984568752800330221777752773957404540495707851421041
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, using pretesting plan: normal
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, no tune info: using qs/gnfs crossover of 95 digits
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, ****************************
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, rho: x^2 + 3, starting 1000 iterations on C180
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, rho: x^2 + 2, starting 1000 iterations on C180
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, rho: x^2 + 1, starting 1000 iterations on C180
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, pm1: starting B1 = 150K, B2 = gmpecm default on C180
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, current ECM pretesting depth: 0.00
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, scheduled 30 curves at B1=2000 toward target pretesting depth of 55.38
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, Finished 30 curves using Lenstra ECM method on C180 input, B1=2K, B2=gmpecm default
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, current ECM pretesting depth: 15.18
04/30/19 08:35:38 v1.34.5 @ ROCINANTE, scheduled 74 curves at B1=11000 toward target pretesting depth of 55.38
04/30/19 08:35:41 v1.34.5 @ ROCINANTE, Finished 74 curves using Lenstra ECM method on C180 input, B1=11K, B2=gmpecm default
04/30/19 08:35:42 v1.34.5 @ ROCINANTE, current ECM pretesting depth: 20.24
04/30/19 08:35:42 v1.34.5 @ ROCINANTE, scheduled 214 curves at B1=50000 toward target pretesting depth of 55.38
04/30/19 08:35:48 v1.34.5 @ ROCINANTE, Finished 224 curves using Lenstra ECM method on C180 input, B1=50K, B2=gmpecm default
04/30/19 08:35:48 v1.34.5 @ ROCINANTE, pm1: starting B1 = 3750K, B2 = gmpecm default on C180
04/30/19 08:35:49 v1.34.5 @ ROCINANTE, current ECM pretesting depth: 25.35
04/30/19 08:35:49 v1.34.5 @ ROCINANTE, scheduled 430 curves at B1=250000 toward target pretesting depth of 55.38
04/30/19 08:36:38 v1.34.5 @ ROCINANTE, Finished 432 curves using Lenstra ECM method on C180 input, B1=250K, B2=gmpecm default
04/30/19 08:36:38 v1.34.5 @ ROCINANTE, pm1: starting B1 = 15M, B2 = gmpecm default on C180
04/30/19 08:36:44 v1.34.5 @ ROCINANTE, current ECM pretesting depth: 30.46
04/30/19 08:36:44 v1.34.5 @ ROCINANTE, scheduled 904 curves at B1=1000000 toward target pretesting depth of 55.38
04/30/19 08:43:16 v1.34.5 @ ROCINANTE, Finished 912 curves using Lenstra ECM method on C180 input, B1=1M, B2=gmpecm default
04/30/19 08:43:16 v1.34.5 @ ROCINANTE, current ECM pretesting depth: 35.56
04/30/19 08:43:16 v1.34.5 @ ROCINANTE, scheduled 2350 curves at B1=3000000 toward target pretesting depth of 55.38
04/30/19 09:29:45 v1.34.5 @ ROCINANTE, Finished 2352 curves using Lenstra ECM method on C180 input, B1=3M, B2=gmpecm default
04/30/19 09:29:45 v1.34.5 @ ROCINANTE, current ECM pretesting depth: 40.63
04/30/19 09:29:45 v1.34.5 @ ROCINANTE, scheduled 4480 curves at B1=11000000 toward target pretesting depth of 55.38
04/30/19 14:41:53 v1.34.5 @ ROCINANTE, Finished 4480 curves using Lenstra ECM method on C180 input, B1=11M, B2=gmpecm default
04/30/19 14:41:53 v1.34.5 @ ROCINANTE, current ECM pretesting depth: 45.73
04/30/19 14:41:53 v1.34.5 @ ROCINANTE, scheduled 7553 curves at B1=43000000 toward target pretesting depth of 55.38
05/01/19 22:53:28 v1.34.5 @ ROCINANTE, Finished 7568 curves using Lenstra ECM method on C180 input, B1=43M, B2=gmpecm default
05/01/19 22:53:28 v1.34.5 @ ROCINANTE, final ECM pretested depth: 50.86
05/01/19 22:53:28 v1.34.5 @ ROCINANTE, scheduler: switching to sieve method
05/01/19 22:53:28 v1.34.5 @ ROCINANTE, nfs: commencing nfs on c180: 191147927718986609689229466631454649812986246276667354864188503638807260703436799058776201365135161278134258296128109200046702912984568752800330221777752773957404540495707851421041
05/01/19 22:53:28 v1.34.5 @ ROCINANTE, nfs: commencing poly selection with 16 threads
05/01/19 22:53:28 v1.34.5 @ ROCINANTE, nfs: setting deadline of 67500 seconds
05/04/19 02:41:03 v1.34.5 @ ROCINANTE, nfs: completed 16 ranges of size 250 in 186454.8044 seconds
05/04/19 02:41:03 v1.34.5 @ ROCINANTE, nfs: best poly = # norm 1.442139e-017 alpha -7.879891 e 7.667e-014 rroots 5
05/04/19 02:41:03 v1.34.5 @ ROCINANTE, nfs: commencing lattice sieving with 16 threads
05/04/19 05:19:54 v1.34.5 @ ROCINANTE, nfs: commencing lattice sieving with 16 threads

Alien 
2019-05-04, 15:15  #2  
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
10011100000111_{2} Posts 
Quote:
A common rule of thumb is that GNFS runtime roughly doubles for every 5 additional digits, and RSA-180 has 80 more digits than your RSA-100 benchmark. So a very rough estimate of your remaining time is t = 9 * 2^(80/5) = 589824 minutes ~= 9800 hours ~= 410 days.

Scaling from 9 minutes to 410 days is arguably imperfect. You may want to run a 140-digit challenge to get a better time estimate, then a 150-digit one. Both of these will take a tiny fraction of the time of the GNFS-180. The analogy will also be imperfect in terms of the memory needed to solve the resulting linear algebra. Search the forum for a past GNFS-180 result; something like 32 GB of memory may be needed, maybe more.

In short, where in the woods are you? You are at the edge tree line; you haven't even fully entered the forest. 
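That back-of-the-envelope arithmetic is easy to script. A minimal sketch, assuming only the 9-minute RSA-100 benchmark and the "doubles every 5 digits" rule of thumb from above:

```shell
# Rough GNFS scaling estimate: runtime doubles roughly every 5 digits.
# 9 minutes for RSA-100, extrapolated to 180 digits (80 digits more).
awk 'BEGIN {
  minutes = 9 * 2^((180 - 100) / 5)
  printf "%d minutes = %.0f hours = %.0f days\n", minutes, minutes / 60, minutes / 60 / 24
}'
# prints: 589824 minutes = 9830 hours = 410 days
```

Swap in other digit counts (140, 150) to see why the smaller test jobs are such a tiny fraction of the C180's cost.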

2019-05-04, 17:04  #3  
Apr 2019
Bay Area, CA
11 Posts 
Quote:


2019-05-04, 17:27  #4 
"Curtis"
Feb 2005
Riverside, CA
2^{3}·5·139 Posts 
The journey of learning to factor is not best taken by jumping into a 1-year job (which has already been done, besides).
There are many settings to learn about. The standard publicly available tools have good settings for the well-worn job sizes under 150 or 160 digits, but beyond that you can save 25% or more of the job time by changing some settings yourself. You don't learn more from a year on a C180 than you do from a week on a C150, but you get to learn a little bit each time you complete a job of whatever size. When you did your C100, you got to watch text go by indicating the various phases of the job; on a C120, those go by ~15 times slower.

1. Does each phase scale by 15x? Which scale worse, or better?
2. Should they all scale similarly, or is this a setting you can adjust for yourself?
3. Do more cores affect each phase similarly? Which parts scale perfectly with the number of cores, and which are limited to a small number of threads?
4. Do you have an Nvidia GPU, and is it set up to use CUDA software for the first phase?

If the first phase is highly effective (read: lucky), the second phase can be as much as 30% faster. In fact, the best-scoring polynomial ever found for this size of job is just that: 30% faster. Your software spent just over 2 days on poly select, while a typical rule of thumb is to spend 5% of the estimated job length on that first phase. Like I said at the start, the tools don't have good settings for jobs this size, and you should do smaller jobs so as not to waste 30-40% of your CPU time.

Given the not-great score of your poly, I'd add 25% to Batalov's estimate: 520 days. You could spend 8 weeks doing a dozen C140-C160 jobs and perhaps learn enough to take 12 weeks off the time taken on this C180. I'm conservative, but I spent 4 *years* learning the tools and settings before I tackled a C180; it took about two months on a workstation of roughly the caliber of yours. 
2019-05-05, 06:04  #5  
Apr 2019
Bay Area, CA
11 Posts 
Quote:
Thanks for the feedback and course correction. Do you have any recommended tuning links, or can I find more information in this forum? Thanks again, Alien 

2019-05-05, 13:35  #6 
"Curtis"
Feb 2005
Riverside, CA
2^{3}·5·139 Posts 
Do you run Linux or Windows? Do you have a CUDA-supported GPU?
On Linux, CADO-NFS has easier-to-control settings, and a (lengthy) dedicated thread within this subforum for our collective discoveries about tuning the package. It's not available (well, not "easily" compiled) for Windows, alas.

Are you presently using YAFU, or factmsieve.py? If you have the .poly and .fb files, you could compare the settings your software chose with the settings for inputs near C180 on the NFS@Home subforum's "queue management" thread; say, C177 to C183 should be relevant. But to do that, you'd need to know what the abbreviations mean. I lack time this morning to explain that, but I'd start with the wiki entries on Integer Factorization and the General Number Field Sieve; the latter has some useful links. Someone in the forum has likely posted a guide at some point in the past, but I don't personally know where (in fact, maybe I'll write one!). 
2019-05-05, 13:53  #7 
"Ben"
Feb 2007
111010001011_{2} Posts 
From the log in the first post, it looks like YAFU is being used.
Another thing to learn: for RSA numbers, where the approximate size of the factors is known, it is pointless to run ECM curves on them. ECM's effectiveness decreases with the size of the smallest *factor* of the input, and by design RSA numbers have very large smallest factors; thus, ECM will not be effective. YAFU's factor() function is intended for the case where one has no knowledge of the input, so rho, p-1, and ECM have a chance to pull out small factors. When you know beforehand that there are no small factors, you can skip directly to NFS (e.g., yafu "nfs(yourinput)" instead of yafu "factor(yourinput)"). This would have saved you 38-some hours of needless ECM in this case.

As VBCurtis has mentioned, YAFU hasn't been tuned well (at all) for inputs this large. It may work, but a little manual intervention would likely save a lot of time. 
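The "38-some hours" figure comes straight from the Factor.log timestamps: pretesting began 04/30/19 08:35:38 and the switch to sieving happened 05/01/19 22:53:28. A quick sanity check (assumes GNU date, as on Linux):

```shell
# Elapsed wall-clock time between the start of pretesting and the
# switch to NFS, using the two timestamps from the Factor.log above.
start=$(date -d "04/30/2019 08:35:38" +%s)
end=$(date -d "05/01/2019 22:53:28" +%s)
echo "$(( (end - start) / 3600 )) hours of rho/p-1/ECM before sieving started"
# prints: 38 hours of rho/p-1/ECM before sieving started
```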
2019-05-05, 21:46  #8  
Apr 2019
Bay Area, CA
11 Posts 
Thanks VBCurtis and bsquared for the help.
Quote:
USE_CUDA = False
GPU_NUM = 0
MSIEVE_POLY_TIME_LIMIT = 0
MIN_NFS_BITS = 264

Quote:
Quote:


2019-05-06, 00:44  #9 
"Curtis"
Feb 2005
Riverside, CA
15B8_{16} Posts 
GPU_NUM refers to the graphics card ID, for systems with multiple GPUs. You can safely leave it unaltered.
tl;dr version of poly select, step by step: I've been using RSA-130 as input for some timing tests for parameters I've been working on. Perhaps we'd both benefit from you factoring this same number once with YAFU and once with factmsieve.py, the better to see how things work. factmsieve.py is less automatic than YAFU; specifically, it requires you to find a polynomial manually using msieve.

Msieve's poly select has 3 stages:
-np1, the part that is GPU-enhanced, finds raw polynomial candidates. The GPU version is something like 50x faster than the CPU version (depending on the GPU, of course; mine is an old 750 Ti but does just fine for this).
-nps is the "size optimization" step; this is where msieve tests each raw candidate for potential, as measured by "stage2_norm". This takes a fraction of a second per test, which is good since the -np1 step can produce dozens of candidates per second.
-npr is the final "root optimization" step, where a slower process is used to 'polish' the best-scoring polynomials from the -nps step. This is very time-consuming, and best done on only the top, say, 100 scores from the -nps step. Most folks use the "sort" and "head" commands to filter the text-file output from -nps and create an input file for the -npr step, which is commonly run on its own.

msieve expects the input number to reside on its own in worktodo.ini. Create that file, then at the command line:

msieve -np1 -nps "1000,1000000" -s testnumber1

1000,1000000 is the range of leading coefficients to search; this can be set to whatever range you like. One experiments to see how long a particular range takes, or just sets something much too large and ctrl-C's when enough time has passed. If you're using a GPU, the msieve default stage2_norm is too loose, and you'll generate excessive output to the testnumber1.ms file (like dozens of polynomials per minute, when you only want ~100 from the entire run). 
The workaround is to invoke the command above and kill it as soon as output fills the screen (there is a momentary pause while the program starts up, and then silly amounts of output). Open msieve.log and see what msieve chose as stage2_norm. Divide this number by 30-ish (if you have a fast GPU, divide by 35 or 40; if it's slow like a laptop GPU, or the job is small like C130 or lower, divide by 25 or 20). Then erase the testnumber1.ms file that was created during the few seconds of the first run, and invoke:

msieve -np1 -nps "1000,1000000 stage2_norm={number-you-calculated}" -s testnumber1

Unfortunately, both of my CUDA installs are broken right now, so I cannot follow these directions myself to see what the output is. I would expect RSA-130 to take about 10 hours on a quad-core machine; I equate a GPU to 4 CPU cores, and poly select should run for about 5% of the expected job time. 5% of 10 hours is not very long! About 30 minutes.

After 25 or 30 minutes, ctrl-C to stop msieve. Open the testnumber1.ms text file and see how many lines are in there. If you have more than 200, the divisor you chose was too small, and you should consider using sort to arrange the candidates in order from best to worst:

sort -g -k 10 testnumber1.ms > testnumber1-sorted.ms

You can then open this new file, scroll down 100 lines or so, and delete the rest of the file. If your testnumber1.ms file has fewer than 50 lines, the divisor you chose was too large, and we didn't generate very many polynomials. That's not a horrible thing, as the best scores are the ones that are saved (that's what stage2_norm measures: the quality of the candidates); it does tell you to use a smaller divisor in the future. I try to pick the magic divisor that gives me the right amount of output without having to muck about with "sort", but it's hardly important. Some folks don't change the divisor at all; they just sort the thousands of polys in the output file and keep the top 100. 
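The "divide by 30-ish" step is plain arithmetic on the norm msieve logged. A minimal sketch; the 9.34e17 here is a placeholder (substitute whatever stage2_norm your own msieve.log shows), and the printed command just mirrors the recipe above:

```shell
# Tighten msieve's default stage2_norm for a GPU-enabled search.
# default_norm is a stand-in value; read the real one from msieve.log.
default_norm=9.34e17
tight=$(awk -v n="$default_norm" 'BEGIN { printf "%.1e", n / 30 }')
echo "msieve -np1 -nps \"1000,1000000 stage2_norm=$tight\" -s testnumber1"
# prints: msieve -np1 -nps "1000,1000000 stage2_norm=3.1e+16" -s testnumber1
```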
You'll notice that some lines in the .ms file are duplicates of each other. The "uniq" command removes duplicates:

uniq test1output-sorted.ms > test1unique.ms

sort and uniq are unix commands. I forget if they're available in Windows by default, or if I downloaded a free "unix commands for Windows" package that provided them. Root-opt on a 130-digit number doesn't take very long, so omitting this step won't add much time to the -npr step; it's just not that big a deal to root-opt a couple of polys multiple times.

Anyway, you're now ready to do the final root-optimization step:

msieve -npr -s {filename after any sort or uniq steps, without the .ms suffix}

In this example, that would be:

msieve -npr -s test1unique

Lots of lines will go by, each one representing the statistics of a polished polynomial. msieve will save the best-scoring one to the file msieve.fb, and it will also print the best poly with its score to msieve.log.

YAFU automates all of this. I've no idea what settings YAFU uses, how long it allows for the -np1/-nps steps, how many candidates it root-opts, etc. For a C130, it won't matter very much! That's why YAFU is so useful for small inputs; no manual labor, at the possible cost of 5% or 10% of total elapsed time. I suggest you let YAFU poly-select RSA-130, then kill the job and note the poly's score. In your pasted output for RSA-180, the score is the number after "e" in the line:

norm 1.442139e-017 alpha -7.879891 e 7.667e-014 rroots 5

Then follow my recipe above, but only let the -np1/-nps step run for 10 minutes. Do root-opt and note the score of the poly that is output. Then do it again, but allow 30 or 40 minutes for -np1/-nps, and note the poly score. Sieving speed (the longest step of factorization by far) is roughly proportional to that poly score. So if all the manual work finds a poly scoring 10% better than the one YAFU found, your manual efforts have saved 10% off the expected sieve time. 10% of a ~40-CPU-hour job for RSA-130 isn't all that meaningful. 
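The sort/uniq/head filtering described above is a three-command pipeline. A toy demonstration — the data lines are fabricated stand-ins for real .ms candidates (real ones hold polynomial coefficients), with field 10 as the sort key per the recipe and the filenames as placeholders:

```shell
# Fake a tiny .ms file: 4 candidate lines, one of them duplicated,
# with the stage2_norm-style score in field 10.
printf '%s\n' \
  "a b c d e f g h i 5e16" \
  "a b c d e f g h i 5e16" \
  "x y z d e f g h i 1e16" \
  "p q r d e f g h i 3e16" > test1.ms

# Sort numerically on field 10 (best first), drop the now-adjacent
# duplicates, and keep at most the top 100 for the -npr step.
sort -g -k 10 test1.ms | uniq | head -n 100 > test1unique.ms
wc -l < test1unique.ms   # duplicate removed: 3 candidates survive
```

Note that uniq only removes *adjacent* duplicates, which is why it runs after sort.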
But if you learn to use the tools as above and can find a 10%-better-scoring poly for RSA-180, you've just saved a month or more on your rig! I don't post tutorials because I'm not a very clear writer; please ask questions about any part where I was confusing.

Last fiddled with by VBCurtis on 2019-05-06 at 00:48. Reason: edited some numbers that were in disagreement 
2019-05-06, 01:26  #10 
"Curtis"
Feb 2005
Riverside, CA
2^{3}·5·139 Posts 
On RSA-130:
msieve 1.53, GPU-enabled binary RichD sent my way:
stage1_norm 3.88e+19
stage2_norm 9.34e+17

msieve SVN 1028, self-compiled with no GPU support:
stage1_norm 2.05e+20
stage2_norm 6.44e+16

The latter was allowed to run on CPU for a few minutes; it seems to save 3-5 hits per minute to the .ms file. This is a bit too much output, as I'd want to give 2 CPU-hours to the search (per the 5%-of-guesstimated-total-job-length rule of thumb); after 2 hours I'd have 500-800 hits. 9.34e+17 divided by 30 is about 3e+16. That sounds about right for a GPU-enabled search:

msieve -np1 -nps "1000,1000000 stage2_norm=3e16" -s testnumber1

If your msieve is GPU-enabled and your CUDA install works, output you will have. 
2019-05-06, 01:58  #11  
Apr 2019
Bay Area, CA
11 Posts 
Quote:
Thanks for the super-detailed post! I'm trying the above but don't see much GPU usage yet. FYI, I don't have an 'msieve.exe'; I only have 'msieve-1.53-SVN998-win64-core2.exe'.

1) Do I have the wrong binary? If so, do you know where I can get a different one?
2) Any idea why my GPUs aren't engaged?
3) I'm seeing a ton of values fly by at the command prompt, but no hits yet in the .ms file:

1584 12738790608913 16272135079752196210474712
1584 34337744182451 16272135081116971690678241
1584 7678710829783 16272135080281840469834043
1584 21358728303851 16272135080775711427898844
1584 39084391013831 16272135080079524779089986
1584 1384940045431 16272135080226697437126266
1584 27803826879721 16272135080642182299793035
1584 32828762326649 16272135080641014750871832
1584 1438736427581 16272135080225917153575710
1584 13460387837819 16272135080476752277635070

thanks again, Alien 
