
mersenneforum.org (https://www.mersenneforum.org/index.php)
-   CADO-NFS (https://www.mersenneforum.org/forumdisplay.php?f=170)
-   -   How to test-sieve (https://www.mersenneforum.org/showthread.php?t=27014)

bur 2021-07-18 06:02

How to test-sieve
 
How is test sieving done with CADO? I thought to just start it normally and have it sieve for a defined time or percentage, but that doesn't measure how strongly yield will decrease at larger q. How is it usually done?

wombatman 2021-07-18 17:25

You can set the minimum q-value in the parameters along with the number of q-values in each workunit. When I was test-sieving different parameters, I would just start a "new" run of the number with the updated parameter values and see how many relations per q I got.

You should be able to do the same thing, but using different minimum q-values to probe as desired.
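For instance, a hypothetical params snippet (the values are placeholders, but the parameter names are the ones used in CADO-NFS params files, as in the examples later in this thread):

```
tasks.qmin = 20000000          # smallest special-q to start sieving at
tasks.sieve.qrange = 10000     # number of special-q values per workunit
```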

charybdis 2021-07-18 17:47

[QUOTE=bur;583403]How is test sieving done with CADO? I thought to just start it normally and have it sieve for a defined time or percentage, but that doesn't measure how strongly yield will decrease at larger q. How is it usually done?[/QUOTE]

Depends how thoroughly you're test-sieving :smile:

If you're not going to be changing the parameters much then what wombatman suggests should be fine. If you're going to be testing lots of different settings, especially on large jobs, you might get frustrated at the length of time CADO takes to generate free relations at the start of each job, when they aren't even used until filtering. In this case you may want to test-sieve manually. Here's how to do that.

You will need to use the [C]makefb[/C] and [C]las[/C] binaries, located in the cado-nfs/build/[machine name]/sieve directory.

First run a command like

[code]makefb -poly [path to poly file] -lim [largest lim1 you might use] -maxbits [largest I you might use] -out [path to output file] -t [threads][/code]to generate the factor base file. The output file name should be something like jobname.roots1.gz. You will only need to run this command once.

Then, to test-sieve, run

[code]las -poly [path to poly file] -I [I] -q0 [start of Q range] -q1 [end of Q range] -lim0 [lim0] -lim1 [lim1] -lpb0 [lpb0] -lpb1 [lpb1] -mfb0 [mfb0] -mfb1 [mfb1] -fb1 [path to factor base file] -out [path to output file] -t [threads] -stats-stderr[/code]You can add more options such as -lambda0, -ncurves0 or -adjust-strategy if you like. CADO usually outputs gzipped relation files, but if you don't give the output file a .gz ending it won't compress, and you'll be able to read the stats at the end of the file without decompressing. I'd give the output file a name that makes clear which parameters you used.
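To probe yield at several points of the Q range in one go, you can script the las invocations. A minimal sketch (every path, file name and parameter value below is a placeholder, not a recommendation):

```python
# Sketch: print las test-sieve commands at several sample special-q points
# spread across the planned range. Substitute your own poly file, factor
# base file and sieving parameters; the values here are made up.

def las_command(q0, width=2000, poly="job.poly", fb="job.roots1.gz",
                threads=4):
    """Build one las invocation sieving the range [q0, q0 + width)."""
    out = f"testsieve.q{q0}.out"   # readable name encoding the q range
    return (f"las -poly {poly} -I 14 -q0 {q0} -q1 {q0 + width} "
            f"-lim0 30000000 -lim1 30000000 -lpb0 31 -lpb1 31 "
            f"-mfb0 58 -mfb1 58 -fb1 {fb} -out {out} -t {threads} "
            f"-stats-stderr")

# Five evenly spaced sample points between qmin and qmax:
qmin, qmax, samples = 20_000_000, 120_000_000, 5
step = (qmax - qmin) // (samples - 1)
for q0 in range(qmin, qmax + 1, step):
    print(las_command(q0))
```

Each run's relations per special-q can then be read off from the stats at the end of its output file.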

bur 2021-07-19 09:14

Ah, thanks. Is there any good advice on which q-ranges to test? If relation yield depends linearly on q, just the beginning and end should suffice; is it like that?

charybdis 2021-07-19 13:40

[QUOTE=bur;583516]Ah, thanks. Is there any good advice on which q-ranges to test? If the dependence of relation yield to q is linear, just the beginning and end should suffice, is it like that?[/QUOTE]

Unfortunately it's not linear. It depends on the degree of the polynomial and which side you're sieving on (which reminds me: -sqside is another important optional command line flag for las), and there should be some effect from the size of the norms too. I would pick, say, 5 different Q values and interpolate between those. Note that very low Q values give very high yields but will lead to high duplication rates, so there is a trade-off.
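One rough way to turn a handful of sample points into a total-relations estimate is to assume yield varies linearly between the sampled Q values and integrate (trapezoidal rule). A back-of-envelope sketch; the sample numbers below are invented:

```python
# Estimate total relations over [qmin, qmax] from sparse test-sieve
# samples, assuming yield (relations per special-q) varies linearly
# between sampled points (piecewise-linear interpolation).

def estimate_relations(samples):
    """samples: list of (q, relations_per_special_q), sorted by q.
    Returns the trapezoidal-rule integral of yield over the q range."""
    total = 0.0
    for (q0, y0), (q1, y1) in zip(samples, samples[1:]):
        total += (y0 + y1) / 2 * (q1 - q0)
    return total

# Hypothetical measurements: yield tends to fall as q grows.
samples = [(20e6, 4.1), (45e6, 3.2), (70e6, 2.7), (95e6, 2.3), (120e6, 2.0)]
print(f"~{estimate_relations(samples) / 1e6:.0f}M relations expected")
```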

chris2be8 2021-07-19 15:57

I'm not sure if this applies to CADO-NFS, but with the ggnfs sievers used by NFS@Home, yield often peaks, or at least changes trend, when the special-Q equals the factor base size on the side you are sieving (the siever cuts the factor base down to the special-Q when sieving below the factor base size). So you should always sieve one range at the factor base size, as well as at the start and end of the range you hope to sieve (plus as many intermediate points as you have time for); then you can assume yield varies linearly between the points where you test-sieved.

It's not exact, since yield isn't precisely linear, but it's usually near enough to do the job, and it gets more accurate if you sieve more ranges with smaller gaps between them.

Chris

VBCurtis 2021-07-19 16:15

CADO, like the "f" sievers from GGNFS, can sieve Q below the factor base without altering the factor base. So, this advice is not necessary when test-sieving in CADO.

bur 2021-07-19 16:28

Ok, thanks everyone. The sieving time for 1281979*2^n+1 suddenly increased sharply, from about 5 h at n = 578 to 9 h at n = 590. In terms of time, that places the SNFS-174 at the level of a GNFS-135, while the SNFS-177 was already like a GNFS-139.


Rational side sieving is still faster, the switch to vbcurtis's c135 parameters helped a bit, but not significantly.


Would it make sense to test the number of required relations? For c135 it's 71,000,000 and for c130 only 46,000,000. Or do SNFS and GNFS behave similarly in that regard?

charybdis 2021-07-19 16:38

[QUOTE=bur;583545]Ok, thanks everyone. The sieving times for 1281979*2^n+1 suddenly increased strongly from about 5 h at n = 578 to 9 h for n = 590. In terms of time that places the SNFS-174 as a GNFS-135 while the SNFS-177 was already like a GNFS-139.[/quote]

That's surprising. Can you post the params files you used for each of these jobs?

[quote]Would it make sense to test the number of required relations? For c135 it's 71,000,000 and for c130 only 46,000,000. Or do SNFS and GNFS behave similar in that regard?[/QUOTE]

The number of relations required is determined mainly by lpb0/lpb1: raising both by 1 roughly doubles the number of relations needed. The mfb values have an effect too but it is smaller. In addition, starting at a lower Q value will increase the number of raw relations required because the duplication rate is higher so you need more raw relations to get the same number of uniques.
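The rule of thumb above (raising both lpb values by 1 roughly doubles the relations needed) can be written as a rough scaling estimate. Purely a back-of-envelope sketch, not anything CADO computes; the baseline numbers are taken from the c130/c135 params discussed in this thread:

```python
# Back-of-envelope: scale a known rels_wanted when lpb0/lpb1 change,
# using the rule of thumb that +1 bit on BOTH sides ~doubles the need,
# i.e. each single extra bit multiplies it by ~sqrt(2).

def scale_rels_wanted(base_rels, base_lpb, new_lpb):
    """base_lpb / new_lpb are (lpb0, lpb1) pairs."""
    extra_bits = (new_lpb[0] - base_lpb[0]) + (new_lpb[1] - base_lpb[1])
    return base_rels * 2 ** (extra_bits / 2)

# c130-like job with lpb 30/29 wanted ~46M; rough guess for lpb 30/30:
print(round(scale_rels_wanted(46e6, (30, 29), (30, 30)) / 1e6))  # -> 65
```

This ignores the mfb and duplication effects mentioned above, so treat the result as a starting point for test-sieving, not a target.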

bur 2021-07-20 07:19

So an SNFS-177 taking as long as a GNFS-139 really does seem a bit long? I thought it might be due to the large coefficient, but the steep increase surprised me.

I was using these (slightly modified c130 params):
[CODE]lim0 = 5500000
lim1 = 5500000
lpb0 = 30
lpb1 = 29
tasks.sieve.mfb0 = 56
tasks.sieve.mfb1 = 54
tasks.sieve.ncurves0 = 21
tasks.sieve.ncurves1 = 19
tasks.sieve.lambda0 = 1.81
tasks.sieve.lambda1 = 1.80
tasks.I = 13
tasks.sieve.qrange = 5000
tasks.qmin = 120000
tasks.sieve.rels_wanted = 46000000
tasks.sieve.sqside = 0[/CODE]

And then when sieving started to take so long, I switched to these (slightly modified c135 params):
[CODE]lim0 = 13000000
lim1 = 13000000
lpb0 = 30
lpb1 = 30
tasks.sieve.mfb0 = 58
tasks.sieve.mfb1 = 56
tasks.sieve.ncurves0 = 21
tasks.sieve.ncurves1 = 17
tasks.sieve.lambda0 = 1.91
tasks.sieve.lambda1 = 1.82
tasks.A = 26
tasks.sieve.qrange = 10000
tasks.qmin = 500000
tasks.sieve.rels_wanted = 71000000
tasks.sieve.sqside = 0[/CODE]

I managed to build a matrix with msieve using 60M relations. In my experience, msieve tends to require more relations than CADO to build a matrix (why is that?), so for the current factorization I decreased rels_wanted to 60M.

bur 2021-07-20 09:46

Btw, I had 75% uniques with the c135 params for one number; I think that's relatively high? It's good, but maybe it also hints at what could be optimized?

