mersenneforum.org  

Old 2020-01-12, 23:27   #1
PFPoitras
 
"Patrick Poitras"
Oct 2019

2·3³ Posts
Using aliqueit logs to determine siqs/gnfs crossover point.

Hello all,

I ran Yafu's calibration test to check when it becomes advantageous to use gnfs rather than siqs, and I wanted to make sure the result was accurate. To that end, I wrote a simple script that parses the log file aliqueit generates automatically and shows where the crossover point actually is.

The script is located on github here, and produces one of two graphs. Both of them are scatter plots with log10(n) on the x axis and time on the y axis. There is a non-zoomed version and a version that zooms in on the crossover point.
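
As a rough illustration of the crossover idea (this is not the script linked above), one can fit a straight line to ln(time) against digit count for each method and solve for the point where the two lines meet. The sketch below assumes the (digits, seconds) pairs have already been extracted from the log. The function names and the sample timings are hypothetical, and the log-linear model is only a simplifying assumption over a narrow digit range.

```python
import math

def fit_log_linear(points):
    """Least-squares fit of ln(seconds) = a + b * digits. Returns (a, b)."""
    xs = [digits for digits, _ in points]
    ys = [math.log(seconds) for _, seconds in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def crossover_digits(siqs_points, nfs_points):
    """Digit count at which the two fitted curves predict equal time."""
    a_siqs, b_siqs = fit_log_linear(siqs_points)
    a_nfs, b_nfs = fit_log_linear(nfs_points)
    if math.isclose(b_siqs, b_nfs):
        raise ValueError("fitted lines are parallel; no crossover")
    return (a_nfs - a_siqs) / (b_siqs - b_nfs)

# Synthetic, made-up timings (NOT measurements), constructed so that the
# two curves meet at 95 digits.
siqs = [(d, math.exp(0.20 * d - 12.0)) for d in range(85, 100)]
nfs = [(d, math.exp(0.12 * d - 4.4)) for d in range(90, 105)]
print(f"estimated crossover: {crossover_digits(siqs, nfs):.1f} digits")
```

With real logs the scatter is noisy, so the estimate is only as good as the number of factorizations on each side of the crossover.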

The script uses Python 3 with numpy/scipy and matplotlib. I got all three through Anaconda, which includes a bunch of useful Python modules. It's free, and can be downloaded here.

I have attached example graphs. In practice the crossover point is closer to 97 digits than the 95 that Yafu suggests, although the difference does not seem very important to me and I have not quantified it. Either way, I thought this might prove useful to some and, if not useful, perhaps entertaining.

I'm also looking to improve the tool. If anyone has some ideas, I'd be happy to help.

Edit: For the generated files to be attachable here, the DPI must be reduced to 600
Attached thumbnails: graph_zoomed.png, graph.png

Last fiddled with by PFPoitras on 2020-01-12 at 23:29
Old 2020-01-13, 01:59   #2
VBCurtis
"Curtis"
Feb 2005
Riverside, CA

10DF₁₆ Posts

CADO is sufficiently faster than YAFU/ggnfs to perhaps justify someone adapting aliqueit to call CADO and read results. I'll see about running some tests on the current CADO git, but I think the crossover from YAFU-siqs to CADO is down around 92 digits.

I've previously claimed I would do a detailed comparison, but I can't find my own notes from a year ago. Your post is a nice reminder to (1) get this comparison done, and (2) maybe encourage someone to tackle an aliqueit-CADO adaptation.
Old 2020-01-13, 03:02   #3
PFPoitras
"Patrick Poitras"
Oct 2019

110110₂ Posts

CADO seems really promising, but I've not been able to get it to compile. If we can get some binaries, or otherwise get it working, I'm willing to help.

I think it should be rather straightforward to implement. Aliqueit or yafu calls ggnfs' executable, so if we can substitute CADO at that point, it shouldn't even need recompilation. If we need CADO to accept different inputs, we could either modify aliqueit or have a bootstrapper program that converts the input and then calls CADO.
Old 2020-01-13, 06:29   #4
VBCurtis
"Curtis"
Feb 2005
Riverside, CA

7·617 Posts

CADO accepts the input number on the command line, and the last line of its prodigious output contains the factors. Aliqueit/YAFU would not be expected to manage poly select or the matrix, and msieve would never be called.
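
A minimal sketch of such a wrapper, under the assumptions that the factoring script takes the number as its only argument and prints the factors, separated by whitespace, on the last non-empty line of standard output. The helper names and the install path are hypothetical, and a small stand-in command is used so the sketch runs without CADO installed.

```python
import math
import subprocess
import sys

def parse_factors(output):
    """Return the integers found on the last non-empty line of `output`."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to parse")
    return [int(token) for token in lines[-1].split()]

def factor_with(command, n):
    """Run `command` with n appended, parse the last stdout line, verify the product."""
    result = subprocess.run(
        command + [str(n)], capture_output=True, text=True, check=True
    )
    factors = parse_factors(result.stdout)
    if math.prod(factors) != n:
        raise ValueError(f"factors {factors} do not multiply to {n}")
    return factors

# Stand-in for the real program: prints some chatter, then "3 5 7".
stand_in = [sys.executable, "-c", "print('lots of log output'); print('3 5 7')"]
print(factor_with(stand_in, 105))

# With CADO installed, the call would look something like this
# (hypothetical path, adjust to the local checkout):
# factor_with(["/path/to/cado-nfs/cado-nfs.py"], n)
```

Checking that the factors multiply back to the input is cheap and catches the case where the last line of output is something other than the factor list.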

I don't have any advice on getting it compiled on Windows, but on Linux the only compile failure I've had was recently, when trying a December '19 git on a Core 2-era CPU; it failed with an error saying a vector instruction was missing. No modern setup has failed to compile for me.

Some of the notes that come with CADO suggest that getting the siever compiled on Windows should be fairly straightforward, but the postprocessing steps are nigh hopeless.
Old 2020-01-14, 15:41   #5
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns

2×7×239 Posts

I haven't looked lately, but Aliqueit should run whatever script you would like in place of factmsieve.py in the aliqueit.ini file, so a wrapper for CADO-NFS should be fairly easy to write, having it return whatever format factmsieve.py returned. I did something similar for the ecm.py entry to run ECM across several machines, although that was quite some time ago.

BTW, the 95-digit crossover shown by YAFU in the trunk version is possibly not the crossover from "tune". Check the second-to-last value in the tune_info line of the yafu.ini file for the calculated value, e.g.:
Code:
tune_info=        Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz,LINUX64,1.26308e-05,0.203376,0.337341,0.100415,98.9965,3392.63
One additional note: yafu.ini has to be in the calling directory for it to be found.
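
Reading that value can be automated. The sketch below takes the text of a yafu.ini file and returns the second-to-last comma-separated field of the tune_info line, following the description above. The function name is hypothetical, and the sample is the line quoted in this post.

```python
def tuned_crossover(ini_text):
    """Return the second-to-last comma-separated tune_info field as a float."""
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "tune_info":
            # Count from the end, in case the CPU name itself contains commas.
            fields = [field.strip() for field in value.split(",")]
            return float(fields[-2])
    raise ValueError("no tune_info line found")

sample = (
    "tune_info=        Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz,LINUX64,"
    "1.26308e-05,0.203376,0.337341,0.100415,98.9965,3392.63"
)
print(tuned_crossover(sample))  # 98.9965
```

For a real file, pass in the contents of yafu.ini from the calling directory, for instance via `pathlib.Path("yafu.ini").read_text()`.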



@VBCurtis: Thanks for verifying my Core 2 troubles with the latest git version. I was going to ask here after a few more trials. My older machines are running an earlier version, so they're still productive.
Old 2020-01-14, 19:32   #6
VBCurtis
"Curtis"
Feb 2005
Riverside, CA

7×617 Posts

I located my notes from 11 months ago, where I ran some single-threaded factorizations on YAFU-siqs versus CADO. At 92 digits, the timings were identical: 31 minutes for each on a 5820K @ 3.3 GHz. I didn't finish finding the multi-threaded crossover size, but I believe it's even lower. I'll see if I can finish those tests by this weekend and report back here.

As for the Core 2 compilation problem, I suppose I should post to their support mailing list and ask if there is a known point at which Core 2 support failed; I tried reverting to an older CADO on those old machines, and then found that my clients running the current CADO wouldn't fetch work from the old version. It may have been a networking (or some other) issue, but the three Core 2 nodes all connected easily while the Z600 didn't.
Old 2020-01-14, 21:16   #7
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)

1011001011010₂ Posts

Is CADO getting to the point where NFS@home should possibly consider switching to it?
Old 2020-01-15, 02:38   #8
VBCurtis
"Curtis"
Feb 2005
Riverside, CA

7×617 Posts

What do you think the best way is to go about comparing speed?

Comparing personal projects on the same machine (i7-5820 Haswell-E, 6 core at stock speed), my fastest C156 with a modified factmsieve script and ggnfs was 7 days, while on CADO I've done a C155 in a tick over 4 days (100 hours). If we allow 15% for 1 digit of difficulty, we're talking ~115 hours for CADO vs 170 for ggnfs. Such a comparison doesn't consider the variety of hardware used for NFS@home, though. Roughly, CADO makes GNFS jobs about 3 digits easier in the range of the 14e queue, and CADO's 14e is preferred all the way up to ~180 digits (I've heard of someone using I=14 to factor GNFS190!)
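
The arithmetic behind that comparison can be written out as a short calculation. It uses only the figures in the paragraph above (7 days, 100 hours, 15% per digit); the variable names are made up for illustration, and converting the speedup into "digits easier" via the same 15% figure is an assumption of this sketch.

```python
import math

PER_DIGIT = 1.15            # assumed cost growth per extra digit (the 15% above)
ggnfs_c156_hours = 7 * 24   # C156 with ggnfs: 7 days
cado_c155_hours = 100       # C155 with CADO: about 100 hours

# Scale the CADO job up by one digit so both refer to a C156.
cado_c156_hours = cado_c155_hours * PER_DIGIT
speedup = ggnfs_c156_hours / cado_c156_hours

# Express the speedup as an equivalent number of digits of difficulty.
digits_easier = math.log(speedup) / math.log(PER_DIGIT)

print(f"CADO ~{cado_c156_hours:.0f} h vs ggnfs {ggnfs_c156_hours} h")
print(f"speedup ~{speedup:.2f}x, equivalent to ~{digits_easier:.1f} digits")
```

This comes out at a bit under 1.5x, or a little under 3 digits, for this one machine and these two jobs.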

It would be nice to know how much faster CADO is now, but that doesn't address the substantial effort it would take to BOINCify the sieving client.

Thoughts?
Old 2020-01-15, 10:09   #9
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)

165A₁₆ Posts

If I were processing for a BOINC project that wasn't upgrading to a client that was ~1.5x faster, I would be irritated. It is a waste of resources.
There are also memory considerations. While I believe that CADO uses more memory, I believe a 4-core client would use less overall. This could help switch effort towards I=16. It is also worth bearing in mind that by adjusting A it is possible to adjust I in half steps.

As far as I can see, ggnfs is no longer developed and CADO is continuing to get faster. We are probably going to want to do it at some point, so why not now? Does the binary itself have to be modified, or can it be called by a script?
Old 2020-01-15, 16:08   #10
VBCurtis
"Curtis"
Feb 2005
Riverside, CA

7·617 Posts

Everything in CADO is called / managed by python scripts, so a BOINC effort should be no different if desired.

There is only one sieving binary, named las; siever area is a parameter passed to the siever. Memory use is higher, but as you note only one copy needs to be run per socket, so overall memory use may remain similar to ggnfs.

I=15 at ~205 digits -> 2.5GB per multithreaded process. I=15 at ~186 digits -> under 2GB.

I=16 is 4x as much memory, but since CADO handles large Q so much better than ggnfs, I=16 can be easily extended to huge projects, say GNFS-230. If las were used today on NFS@home, I'd use I=14 up to 183 digits, and I=15 up to 207 digits.
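
Those thresholds amount to a simple lookup. The sketch below encodes them as stated (I=14 up to 183 digits, I=15 up to 207 digits); returning I=16 for anything larger is an assumption drawn from the remark about extending I=16 to huge projects, and the function name is hypothetical.

```python
def suggested_I(digits):
    """Pick a siever I value for a GNFS job of the given size, per the thresholds above."""
    if digits <= 183:
        return 14
    if digits <= 207:
        return 15
    # Assumption: everything larger goes to I=16.
    return 16

for size in (155, 183, 184, 207, 230):
    print(size, suggested_I(size))
```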

I haven't gotten the A parameter to work yet; perhaps it will in a coming-soon commit.
Old 2020-01-15, 22:51   #11
PFPoitras
"Patrick Poitras"
Oct 2019

2×3³ Posts

I have pushed a new update to the repo to fix a bug where the program would crash if there hadn't been either siqs or nfs done in the directory.

Please let me know if you encounter any other problems. I suspect that the grep-like section of the code will fail the first time it encounters any unusual output in the logs.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.