20190403, 00:57  #1 
"Curtis"
Feb 2005
Riverside, CA
4678_{10} Posts 
Poly select and testsieving for RSA232
The path for mere mortals to contribute to a factorization of RSA896 or RSA1024 may include getting an empirical sense for how our current tools scale for projects of 230+ digits.
As a start on this path, I spent about two coreweeks on CADO poly select for RSA232: Code:
n: 1009881397871923546909564894309468582818233821955573955141120516205831021338528545374366109757154363664913380084917065169921701524733294389270280234380960909804976440540711201965410747553824948672771374075011577182305398340606162079 skew: 2344396.759 type: gnfs c0: 4444911653278229819370022568140110402793683027936 c1: 1601077621210143696221661846379560448537934 c2: 10503357361068234616583254671616966819 c3: 106292952091953896748493562554 c4: 2012141349873927583036795 c5: 21775742789901120 c6: 43338240 Y0: 48749394190388072676687368853974603055 Y1: 1395619071010682498582953 # MurphyE (Bf=5.498e+11,Bg=6.872e+10,area=8.590e+16) = 2.92e08 # found by revision 78c3a74 On our bestpolys table, an increase of 16 digits corresponds to an orderofmagnitude decrease in MurphyE. ~4.7e14 is the record for 184 digits, ~5.3e15 is the record for 200 digits, and ~4.6e16 (deg 6) is the record for 216 digits. So, this score at 5.4e17 is a decent baseline polynomial to do some parameterchoice work. I tried to feed the polynomial from the RSA768 factorization paper into msieve to compare scores, but it coredumped and I couldn't figure out my mistake. I'll report some testsieve results using 16f next. 
20190403, 02:59  #2 
Jun 2003
4861_{10} Posts 
Code:
searching for 15digit factors commencing number field sieve (232digit input) R0: 1291187456580021223163547791574810881 R1: 34661003550492501851445829 A0: 277565266791543881995216199713801103343120 A1: 18185779352088594356726018862434803054 A2: 6525437261935989397109667371894785 A3: 46477854471727854271772677450 A4: 5006815697800138351796828 A5: 1276509360768321888 A6: 265482057982680 skew 44000.00, size 7.072e017, alpha 7.298, combined = 7.024e017 rroots = 6 
20190403, 04:23  #3 
"Curtis"
Feb 2005
Riverside, CA
2·2,339 Posts 
Thank you!
It should be noted that axn posted the poly for RSA768; its score is 30% better after 20 coreyears of search (~eleven years ago) than my RSA232 after 2 coreweeks. Last fiddled with by VBCurtis on 20190403 at 04:24 
20190404, 18:42  #4 
"Curtis"
Feb 2005
Riverside, CA
2·2,339 Posts 
RSA768 was factored with the equivalent of I=16 on CADO (same sieve region as NFS@home's 16fV5 siever). I=17 may be faster, but has a memory footprint too large for most users (I failed to get a single instance to start on a 24GB system a few months ago). Alas, I haven't yet figured out how to testsieve within CADO; so instead I tried a little test with the 16f siever. "top" reported 3.5GB memory use for rlim=980M, alim=1640M. I also tested rlim=alim=1070M, but memory use only dropped to 2.9GB while yield was more than 10% worse and sec/rel was little changed.
My testsieve used mfbr=71, lpbr=37, mfba=110, lpba=40. These settings are a bit tighter than those used by the group that factored RSA768: They used smaller lim's to save memory but 40LP on both sides, mfbr=110, mfba=140 (4 large primes!). I initially tried 107/39 for aside, but 110/40 had approximately twice the yield. Code:
#3740/71110, lambda 2.7/3.7: dQ=500 #Q=100M 7736 rels, 0.296 sec/rel #Q=600M 5041 rels, 0.441 sec/rel #Q=1100M 5154 rels, 0.483 sec/rel #Q=1600M 3088 rels, 0.571 sec/rel #Q=2100M 1741 rels, 0.634 sec/rel #Q=2600M 2818 rels, 0.648 sec/rel #Q=3100M 2683 rels, 0.674 sec/rel #Q=3600M 3516 rels, 0.696 sec/rel #Q=4100M 1574 rels, 0.775 sec/rel Yield is over 10 from 100M to 1200M, good for ~12G relations. Yield is over 5 for the rest of the Qrange, say 1200M to 3800M for ~15G relations. So, as a proofofconcept, the 16f siever appears sufficient to crack RSA232. If I had hundreds of cores at my disposal, I'd aim the largememory cores to CADO with I = 17 to run Q=50M to 200M, and for mediummemory cores I'd use 16f with 3.5GB/core. Once I solve CADO testsieving, I'll report I=16 memory use and yield. Sec/rel is measured on a Haswell i75820, a 6core 3.3ghz CPU with 6 other tasks running. At half a second per relation, we're looking at 14 gigacoreseconds, or 450 coreyears. This is massively less than the 2000 CPUyears that RSA768 took, but those were measured on c. 2008 2.2ghz AMD processors. However, my CPU is not 4 times faster than theirs were, which suggests my testsieving is flawed in some way. I don't see any problem with yields with just 2RLP/3ALP, while the RSA768 team sieved Q=450M11G using 3RLP/4ALP; again, this suggests my testsieving is flawed. Then again, they used rlim=200M & alim =1100M to fit into 2GB/core, which may have impacted yield and efficiency more than I realize. Their matrix was ~190M by 190M, and better param choice + fewer large primes should lead to a yetsmaller matrix for RSA232. If anyone is interested in taking another baby step toward such a factorization, they are invited to find a better polynomial. Poly select has improved quite a lot in 10 years, so we should be able to beat the RSA768 score by at least 5%. If we do improve on my poly by 20% or more, a more detailed testsieve could demonstrate that we need fewer than 400 coreyears for sieving, a level possible within this forum. 
20190404, 18:51  #5 
"Curtis"
Feb 2005
Riverside, CA
2×2,339 Posts 
In case anyone is silly enough to try some polyselect, here is what I ran in CADO for my small search:
Code:
tasks.polyselect.degree = 6 tasks.polyselect.P = 40000000 tasks.polyselect.admax = 1e5 tasks.polyselect.adrange = 120 tasks.polyselect.incr = 60 tasks.polyselect.nq = 7776 # this is 6^5 tasks.polyselect.nrkeep = 36 tasks.wutimeout = 36000 # required for rootsieve in degree 6 Last fiddled with by VBCurtis on 20190404 at 18:51 
20190404, 20:39  #6 
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
1303_{16} Posts 
I’ll help sieving and I can use up to 4GB/thread.

20190406, 19:13  #7 
"Max"
Jun 2016
Toronto
2·3·7·17 Posts 
RSA768
Usually because msieve doesn't properly process minuses copied from TEX/PDF documents ( − ). They should be manually changed into regular ones (  ).

20190406, 21:10  #8 
Sep 2008
Kansas
2·31·53 Posts 

20190406, 22:36  #9 
"Max"
Jun 2016
Toronto
2·3·7·17 Posts 

20190406, 23:15  #10  
"Curtis"
Feb 2005
Riverside, CA
2×2,339 Posts 
Quote:
I don't mind buying an extra disk for my workstation and running an internetfacing CADO server on that disk, which would allow us to collect relations at our leisure. Do 15G relations fit into 1GB? Do I need double the disk space (e.g. 2TB dedicated disk) of the expected relation set to handle things like uniqueing, compression, filtering? Also, thanks to Max for pointing out the minussign problem with copypaste! I did exactly as he suspected. Last fiddled with by VBCurtis on 20190406 at 23:16 

20190406, 23:22  #11 
"Curtis"
Feb 2005
Riverside, CA
1001001000110_{2} Posts 
I think this particular candidate falls into a noman'sland, where the lack of possible record factorization acts against a possible XSEDE proposal yet it's too big for any single machine we have access to. I'm willing to spend the $200 to upgrade my 20core xeon to 128GB, which is likely enough to solve a matrix from a GNFS215 or GNFS220 project, but insufficient for RSA232. Perhaps a pair of infiniband cards would allow me to solve RSA232 personally in under a year, but that sounds more like a hope than a plan!
However, if we went after RSA240, we might be able to get XSEDE to grant cluster time for the matrix. That should take on the order of 1000 coreyears of sieving, way too much for forumites alone! In the interests of gaining experience with big jobs, I've volunteered to postprocess a C206 for swellman that Greg is sieving on 16f; fivemack/Greg are doing the same for the C205 from Aliq 276. If we're serious as a group about doing some quixotic teamsieve, I suggest we find a GNFS in 210 to 218 range to teamfactor, so that we can collectively iron out details of data management etc. If my time estimates are correct, we can knock one out with 50 to 100 coreyears of sieve, and less than a machineyear of postprocessing; if they're not, I'd rather learn I'm wrong by 50 years than 500! There are cloudservice machines with big memory that could be rented for this matrix, which would well and truly fit a publication of this factorization using "publicly available tools!" If that goes well, we can decide whether to attack RSA232 or RSA240? In the meantime, there's no harm in polyselecting for either one. I have msieve aimed at RSA232 right now, and I'll start another thread for RSA240 poly select sometime this month. 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
YAFU Poly Select Deadline  amphoria  YAFU  22  20160917 09:47 
msieve poly select: choosing Stage1norm  VBCurtis  Msieve  0  20160411 21:33 
Starting NFS skipping poly select  jux  YAFU  5  20160102 01:01 
Poly Search vs Sieving times  EdH  Factoring  10  20131014 20:00 
Test Sieving Questions  nstaab1  Lounge  15  20130306 13:48 