mersenneforum.org Using aliqueit logs to determine siqs/gnfs crossover point.

 2020-01-12, 23:27 #1 PFPoitras "Patrick Poitras" Oct 2019 2×3³ Posts
Hello all,
I ran the calibration test in Yafu to find out when it becomes advantageous to use gnfs rather than siqs, and I wanted to check that the test was accurate. My idea was to take the log file that aliqueit generates automatically and write a simple script that parses it to see where the crossover point actually is.
The script is located on github here, and produces one of two graphs. Both are scatter plots with log10(n) on the x axis and time on the y axis: a non-zoomed version and a version that zooms in on the crossover point.
The script uses Python 3 with numpy/scipy and matplotlib. I got all three through Anaconda, which bundles a bunch of useful Python modules. It's free, and can be downloaded here.
I have attached example graphs. In practice the crossover point is closer to 97 digits than the 95 that Yafu suggests, although the difference does not seem very important to me, and I have not quantified it.
Either way, I thought this might prove useful to some, and if not useful, perhaps entertaining. I'm also looking to improve the tool, so if anyone has ideas, I'd be happy to help.
Edit: For the generated files to be attachable here, the DPI must be reduced to 600.
Last fiddled with by PFPoitras on 2020-01-12 at 23:29
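For readers wondering how a crossover can be estimated from timings like these, here is a minimal sketch. It is not the linked script and does not parse aliqueit logs: it assumes the (digits, seconds) pairs have already been extracted, and the sample numbers are synthetic, generated so that the curves cross at 97 digits. It also assumes that, over a narrow range, log(time) is roughly linear in the digit count for each method. It fits one line per method and solves for the intersection.

```python
import numpy as np

def fit_log_time(digits, seconds):
    """Least-squares fit of ln(seconds) = slope * digits + intercept."""
    x = np.asarray(digits, dtype=float)
    y = np.log(np.asarray(seconds, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def crossover_digits(siqs_fit, gnfs_fit):
    """Digit count at which the two fitted lines intersect."""
    (s1, i1), (s2, i2) = siqs_fit, gnfs_fit
    if np.isclose(s1, s2):
        raise ValueError("fitted slopes are equal; no crossover")
    return (i2 - i1) / (s1 - s2)

# Synthetic data (NOT measurements): exact exponentials crossing at 97 digits.
siqs_digits = [85, 90, 95, 100]
gnfs_digits = [92, 96, 100, 104]
siqs_seconds = [np.exp(0.20 * (d - 97) + 8.0) for d in siqs_digits]
gnfs_seconds = [np.exp(0.12 * (d - 97) + 8.0) for d in gnfs_digits]

crossover = crossover_digits(fit_log_time(siqs_digits, siqs_seconds),
                             fit_log_time(gnfs_digits, gnfs_seconds))
print(f"estimated crossover: {crossover:.1f} digits")
```

With real, noisy timings the fitted intersection will move around, so it is probably worth plotting the residuals before trusting a difference of one or two digits.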
 2020-01-13, 01:59 #2 VBCurtis "Curtis" Feb 2005 Riverside, CA 2²×7×13² Posts
CADO is sufficiently faster than YAFU/ggnfs to perhaps justify someone adapting aliqueit to call CADO and read results. I'll see about running some tests on the current CADO git, but I think the crossover from YAFU-siqs to CADO is down around 92 digits.
I've previously claimed I would do a detailed comparison, but I can't find my own notes from a year ago. Your post is a nice reminder to (1) get this comparison done, and (2) maybe encourage someone to tackle an aliqueit-CADO adaptation.
 2020-01-13, 03:02 #3 PFPoitras "Patrick Poitras" Oct 2019 2×3³ Posts
CADO seems really promising, but I've not been able to get it to compile. If we can get some binaries, or otherwise get it working, I'm willing to help.
I think it should be rather straightforward to implement. Aliqueit or yafu calls the ggnfs executable, so if we can substitute CADO at that point, there shouldn't even be any need for recompilation. If CADO needs different inputs, we could either modify aliqueit, or have a bootstrapper program that converts the input and then calls CADO.
 2020-01-13, 06:29 #4 VBCurtis "Curtis" Feb 2005 Riverside, CA 2²×7×13² Posts
CADO accepts the input number on the command line, and the last line of its prodigious output contains the factors. Aliqueit/YAFU would not be expected to manage poly select or the matrix, and msieve would never be called.
I don't have any advice on getting it compiled on Windows. On Linux, the only recent compile failure I've had was with a December '19 git on a Core 2-era CPU; it failed with an error saying a vector instruction was missing. No modern setup has failed to compile for me.
Some of the notes that come with CADO suggest that getting the siever compiled on Windows should be fairly straightforward, but the postprocessing steps are nigh hopeless.
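The interface described in the post above (number in on the command line, factors on the last line of output) could be wrapped roughly as follows. This is a sketch under stated assumptions, not a tested aliqueit integration: the script path `./cado-nfs.py` is a placeholder for the local install, and the parser assumes the factors are whitespace-separated integers on the last non-empty line of standard output.

```python
import subprocess

def parse_factors(output: str) -> list:
    """Return the integers found on the last non-empty line of `output`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to parse")
    return [int(token) for token in lines[-1].split()]

def factor_with_cado(n: int, cado_script: str = "./cado-nfs.py") -> list:
    """Run CADO-NFS on n and return its factors.

    Assumption: `cado_script` is a placeholder path to the CADO-NFS
    driver script; adjust it for the local install.
    """
    result = subprocess.run([cado_script, str(n)],
                            capture_output=True, text=True, check=True)
    factors = parse_factors(result.stdout)
    # Sanity check: the reported factors must multiply back to n.
    product = 1
    for f in factors:
        product *= f
    if product != n:
        raise RuntimeError(f"factors {factors} do not multiply to {n}")
    return factors

# Offline check of the parser with made-up output (no CADO install needed).
sample = "Info: some log line\nInfo: another log line\n7 13 19\n"
print(parse_factors(sample))
```

A wrapper meant for aliqueit would still have to re-emit the factors in whatever format aliqueit expects from its NFS script, which is not specified in this thread.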
 2020-01-14, 15:41 #5 EdH "Ed Hall" Dec 2009 Adirondack Mtns 2×7×263 Posts
I haven't looked lately, but Aliqueit should run whatever script you like in place of factmsieve.py in the aliqueit.ini file, so a wrapper for CADO-NFS should be fairly easy to write, as long as it returns whatever format factmsieve.py returned. I did something similar for the ecm.py entry to run ECM across several machines, although that was quite some time ago.
BTW, the 95-digit crossover shown by YAFU in the trunk version is possibly not the crossover from "tune." Check the second-to-last value in the tune_info line in the yafu.ini file for the calculated value, e.g.:
Code:
tune_info= Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz,LINUX64,1.26308e-05,0.203376,0.337341,0.100415,98.9965,3392.63
One additional note: yafu.ini has to be in the calling directory for it to be found.
@VBCurtis: Thanks for verifying my Core 2 troubles with the latest git version. I was going to ask here after a few more trials. My older machines are running an earlier version, so they're still productive.
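Pulling that value out of the tune_info line can be scripted. The sketch below uses the example line from the post above and takes the second-to-last comma-separated field, as described there; the function name is made up for illustration. Counting from the end keeps it working even if the CPU name itself contains a comma.

```python
def crossover_from_tune_info(line: str) -> float:
    """Return the second-to-last comma-separated value of a tune_info line."""
    key, sep, value = line.partition("=")
    if not sep or key.strip() != "tune_info":
        raise ValueError("not a tune_info line")
    fields = [field.strip() for field in value.split(",")]
    if len(fields) < 2:
        raise ValueError("tune_info line has too few fields")
    # Index from the end so commas inside the CPU name cannot shift the field.
    return float(fields[-2])

example = ("tune_info= Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz,LINUX64,"
           "1.26308e-05,0.203376,0.337341,0.100415,98.9965,3392.63")
print(crossover_from_tune_info(example))  # 98.9965
```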
 2020-01-14, 19:32 #6 VBCurtis "Curtis" Feb 2005 Riverside, CA 127C₁₆ Posts
I located my notes from 11 months ago, where I ran some single-threaded factorizations on YAFU-siqs versus CADO. At 92 digits, the timings were identical: 31 minutes for each on a 5820K @ 3.3 GHz. I didn't finish finding the multi-threaded crossover size, but I believe it's even lower. I'll see if I can finish those tests by this weekend and report back here.
As for the Core 2 compilation problem, I suppose I should post to their support mailing list and ask if there is a known point at which Core 2 support failed. I tried reverting to an older CADO on those old machines, and then found that my clients running the current CADO wouldn't fetch work from the old version. It may have been a networking (or some other) issue, but the 3 Core 2 nodes all connected easily while the Z600 didn't.
 2020-01-14, 21:16 #7 henryzz Just call me Henry "David" Sep 2007 Cambridge (GMT/BST) 16DE₁₆ Posts
Is CADO getting to the point where NFS@home should possibly consider switching to it?
 2020-01-15, 02:38 #8 VBCurtis "Curtis" Feb 2005 Riverside, CA 2²·7·13² Posts
What do you think is the best way to go about comparing speed?
Comparing personal projects on the same machine (i7-5820 Haswell-E, 6 cores at stock speed), my fastest C156 with a modified factmsieve script and ggnfs took 7 days, while on CADO I've done a C155 in a tick over 4 days (100 hours). If we allow 15% for 1 digit of difficulty, we're talking ~115 hours for CADO vs 170 for ggnfs.
Such a comparison doesn't consider the variety of hardware used for NFS@home, though. Roughly, CADO makes GNFS jobs about 3 digits easier in the range of the 14e queue, and CADO's 14e is preferred all the way up to ~180 digits. (I've heard of someone using I=14 to factor GNFS190!)
It would be nice to know how much faster CADO is now, but that doesn't address the substantial effort it would take to BOINCify the sieving client.
Thoughts?
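The arithmetic in the post above can be checked in a few lines. The inputs are the figures quoted there (100 hours for the C155 on CADO, 7 days for the C156 on ggnfs, 15% per digit); 7 days is taken as exactly 168 hours, which the post rounds to 170.

```python
import math

PER_DIGIT_FACTOR = 1.15        # 15% more time per extra digit, as quoted
cado_c155_hours = 100.0        # "a tick over 4 days"
ggnfs_c156_hours = 7 * 24.0    # 7 days = 168 hours

# Scale the CADO timing up by one digit so both refer to a C156.
cado_c156_hours = cado_c155_hours * PER_DIGIT_FACTOR
speedup = ggnfs_c156_hours / cado_c156_hours

# Express the speedup as an equivalent number of "digits easier".
digits_easier = math.log(speedup) / math.log(PER_DIGIT_FACTOR)

print(f"CADO at C156: ~{cado_c156_hours:.0f} hours")
print(f"speedup: ~{speedup:.2f}x")
print(f"equivalent to ~{digits_easier:.1f} digits")
```

Under these figures the speedup comes out a little under 1.5x, or a little under 3 digits, which is in line with the "about 3 digits easier" estimate.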
 2020-01-15, 10:09 #9 henryzz Just call me Henry "David" Sep 2007 Cambridge (GMT/BST) 1011011011110₂ Posts
If I were processing on a BOINC project that wasn't upgrading to a client that was ~1.5x faster, I would be irritated. It is a waste of resources.
There are also memory considerations. While I believe that CADO uses more memory, I believe a 4-core client would use less. This could help switch effort towards I=16. It is also worth bearing in mind that by adjusting A it is possible to adjust I in half steps.
As far as I can see, ggnfs is no longer developed and CADO is continuing to get faster. We are probably going to want to do it at some point, so why not now?
Does the binary itself have to be modified, or can it be called by a script?
 2020-01-15, 16:08 #10 VBCurtis "Curtis" Feb 2005 Riverside, CA 2²·7·13² Posts
Everything in CADO is called / managed by Python scripts, so a BOINC effort should be no different if desired. There is only one sieving binary, named las; siever area is a parameter passed to the siever.
Memory use is higher, but as you note, only one copy needs to be run per socket, so overall memory use may remain similar to ggnfs.
I=15 at ~205 digits -> 2.5GB per multithreaded process.
I=15 at ~186 digits -> under 2GB.
I=16 is 4x as much memory, but since CADO handles large Q so much better than ggnfs, I=16 can easily be extended to huge projects, say GNFS-230.
If las were used today on NFS@home, I'd use I=14 up to 183 digits, and I=15 up to 207 digits. I haven't gotten the A parameter to work yet; perhaps in a coming-soon commit.
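The digit thresholds suggested in the post above amount to a small lookup. The sketch below encodes them as stated; returning I=16 above 207 digits is an assumption drawn from the remark that I=16 extends to larger jobs, and the function name is made up for illustration.

```python
def suggested_sieve_I(digits: int) -> int:
    """Map a GNFS input size in digits to the I value suggested in the thread."""
    if digits <= 183:
        return 14
    if digits <= 207:
        return 15
    # Assumption: anything larger goes to I=16.
    return 16

for d in (160, 183, 190, 207, 230):
    print(d, suggested_sieve_I(d))
```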
 2020-01-15, 22:51 #11 PFPoitras "Patrick Poitras" Oct 2019 2×3³ Posts
I have pushed a new update to the repo to fix a bug where the program would crash if either siqs or nfs had not been run in the directory.
Please let me know if you encounter any other problems. I suspect that the grep-like section of the code will fail the first time it encounters any unexpected output in the logs.


All times are UTC.
