
2020-12-19, 09:46   #837
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2·2,909 Posts

Quote:
 Originally Posted by ryanp Thanks! Almost certainly CADO'ing; it's my go-to now. Haven't used the GGNFS sievers for a while.
Would your max sieve size be I=16/A=31 or do your machines have the memory for more?

2020-12-22, 02:15   #838
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

11102₈ Posts

I did some testing with GGNFS 16f to test a bunch of possible parameters, and narrowed down LP choice to 33/35 or 33/36. Then I ran two CADO instances, each using 12 threads for a single las process on a 12-core xeon. I'm using 2e9 rels wanted for 33/35 and 2.65e9 for 33/36 (30-33% more for a one-sided LP increase). MFBs are 66 and 101/103, lims 400M and 900M.

I tested I=16 first. Yield at Q=100M is 2.84 (50kQ tested) for 33/35, 4.2ish (27kQ tested) for 33/36. At Q=100M, 33/36 has an ETA of 86 core-years, while 33/35 has an ETA of 90 core-years (that's ETA multiplied by 6 cores, not 12 threads). I am next moving to Q=500M to repeat the trial.

After the first trial, I don't think the extra data management and possibly larger matrix are worth saving ~5% sieve time for 33/36. I should have Q=500M data Tuesday morning, with a yield estimate that should allow a prediction of the sieve range. I=16 looks like enough here, pending a measure of yield at higher Q.

Last fiddled with by VBCurtis on 2020-12-22 at 02:28
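As a rough cross-check of the comparison above, here is a small Python sketch. The yields and rels-wanted totals are the ones quoted in the post; the constant-yield extrapolation is an assumption of mine and is optimistic, since yield falls as Q grows, so these are lower bounds on the sieve range.

```python
# Back-of-the-envelope estimate of special-q range needed for each LP choice,
# assuming (optimistically) that the Q=100M yield holds across the whole range.
params = {
    "33/35": {"rels_wanted": 2.0e9, "yield_per_q": 2.84},
    "33/36": {"rels_wanted": 2.65e9, "yield_per_q": 4.2},
}
for name, p in params.items():
    q_range = p["rels_wanted"] / p["yield_per_q"]  # special-q units needed
    print(f"{name}: ~{q_range / 1e6:.0f}M of special-q at constant yield")
```

This gives roughly 704M of Q-range for 33/35 and 631M for 33/36; the real ranges will be larger because yield drops at higher Q, as the next posts show.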
2020-12-22, 04:49   #839
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

4674₁₀ Posts

Initial results at Q=500M have lousy yield: 33/35 is around 1.2, 33/36 1.8 or so. ETA for 33/36 is about 17% less than for 33/35. I'll let this run until morning, then I'll test A=32 against I=16 with both on 33/36. This job might be best with e.g. 100-300M at A=32 and 300M-end on I=16.

Or, perhaps my lims are too small? Memory use is right at 10GB according to 'top', so I could go 30-40% bigger?
2020-12-22, 05:14   #840
charybdis

Apr 2020

2·5·19 Posts

Quote:
 Originally Posted by VBCurtis Initial results at Q=500M have lousy yield: 33/35 is around 1.2, 33/36 1.8 or so. ETA for 33/36 is about 17% less than 33/35. I'll let this run until morning, then I'll test A=32 against I=16 with both on 33/36. This job might be best with e.g. 100-300M at A=32 and 300M-end on I=16. Or, perhaps my lim's are too small? Memory use is right at 10GB according to 'top', so I could go 30-40% bigger?
Ouch, that's a big dropoff in yield. At that rate I'm not sure you'd ever get enough relations with those parameters. I reckon whatever params are eventually chosen, the sieve range will need to extend well beyond 1000M, so it would be worth test-sieving at very large Q.

The lims could well be too small. Maybe they should be even more skewed too? If we're contemplating large prime bounds as asymmetrical as 33/36, why shouldn't we consider lims like 500/2000M? (I think the CADO siever can't go beyond lim=2^31)

Last fiddled with by charybdis on 2020-12-22 at 05:16

2020-12-22, 20:23   #841
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2×2,909 Posts

There are a couple of options for las that may be useful to help yield:

-bkthresh1 1e8 turns on two-level bucket sieving, which reduces memory usage a lot. There seems to be a speed/memory trade-off, although I am not sure "off" is optimal speed-wise. For A=32, off uses ~16GB, 1e8 uses ~6GB, and 3.9e8 (it has to be below the lowest FB bound) uses ~3.7GB. 3.9e8 is 2/3 the speed, while 1e8 is slightly faster than off. This could allow higher A using the same memory. I think this parameter could do with being set differently for each side, although that isn't an option.

-sieve-adjust provides a few strategies for determining I and J based on A. The more dynamic options can provide more relations. Caution is required though, as it can use more memory if it chooses small I and large J. I think this is more likely to happen for very small special-q.
2020-12-23, 00:27   #842
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

2×3×19×41 Posts

I took 33/35 to Q=1100M; yield is 0.75-0.8. ETA is also way up. I changed 33/36 to Q=550M and A=32. ETA is about the same as the above run, but yield is NOT better: Q=500M on I=16 is 1.84, while Q=550M on A=32 is about 1.8 (only 10kQ measured so far).

So, I now wonder if I should try A=30 and we just expect Q to go from like 200M to 2000M. I'll get an answer to this with the 33/35 run shortly. Haven't tried different lims yet, because a new run takes ~7hrs to find the free relations.
2020-12-23, 00:54   #843
charybdis

Apr 2020

2×5×19 Posts

Quote:
 Originally Posted by VBCurtis Haven't tried different lim's yet, because a new run takes ~7hrs to find the free relations.
Oh, you're using the script rather than invoking las directly? There's no need for the free relations until postprocessing.
The factor base file does need to be made in advance, but you can make one that will be enough for all your runs. You need -lim to be at least as big as the largest lim1 you might want to test:

Code:
makefb -poly ../../../../cadojobs/octic/octic.poly -lim 2000000000 -maxbits 16 -out ../../../../cadojobs/octic/octic.roots1.gz -t 6
The command line for the siever will then look like:
Code:
las -poly ../../../../cadojobs/octic/octic.poly -I 16 -q0 500000000 -q1 500010000 -lim0 400000000 -lim1 1500000000 -lpb0 33 -lpb1 35 -mfb0 66 -mfb1 101 -fb1 ../../../../cadojobs/octic/octic.roots1.gz -out ../../../../cadojobs/octic/octic.500000000-500010000.gz -t 6 -stats-stderr
(add -lambda0, -ncurves0 etc if you like)

Last fiddled with by charybdis on 2020-12-23 at 00:54

2020-12-23, 04:34   #844
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

2×3×19×41 Posts

Correct, I am in fire-and-forget mode for CADO testing still. I agree that freerels are unnecessary for testing, and was too lazy to invoke all the relevant flags for las. Now that you spell it out, I suppose next time I'll give direct test-sieving a try.

Meantime, I did test A=30, and yield at Q=1100M & 1200M was 1/sqrt(2) times the yield with I=16. Not sure why A=32 remains less than 1.4x that of I=16. Memory use is 18.5GB/10.0GB/5.8GB for A=32/31/30 (I=16 was used instead of A=31, in case they're not the same for the variants mentioned by henryzz). A 25kQ test at A=32 gives yield within 3% of that of I=16.

So, I think our choice is:
I=16
lpb0=33 lpb1=36
mfb0=66 mfb1=104 (I tested 103, but we need yield!)
ncurves0=30? ncurves1=20?
qmin = 150 or 200M? Lower Q's will produce more duplicate relations.
rels_wanted=2.8e9, as a guess.

I'm guessing 1000M rels from 150-500M, 900M rels from 500-1100M, and I didn't test Q above 1200M but I'll stab at a yield of 1, so Q-max around 2000M. Good luck, Ryan!
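The qmax guess above follows from a simple relation budget. A quick sketch of that arithmetic (the per-segment hauls and the yield-of-1 figure are the poster's own guesses, restated here, not independent measurements):

```python
# Relation-budget arithmetic behind the Q-max ~ 2000M estimate.
rels_wanted = 2.8e9
rels_low = 1.0e9    # guessed haul from Q = 150M-500M
rels_mid = 0.9e9    # guessed haul from Q = 500M-1100M
remaining = rels_wanted - rels_low - rels_mid  # rels still needed above Q=1100M
yield_high = 1.0    # stab at yield for Q above 1200M
extra_q = remaining / yield_high               # extra Q-range to sieve
qmax = 1100e6 + extra_q
print(f"need {remaining:.1e} more rels -> qmax ~ {qmax / 1e6:.0f}M")
```

With these guesses, 0.9e9 relations remain after Q=1100M, and at yield 1 that is another 900M of Q-range, landing qmax right at 2000M.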
2020-12-23, 05:45   #845
charybdis

Apr 2020

276₈ Posts

Final(?) thoughts: I think I=17 could be useful for the smallest Q.

Reasoning: let's think about what happens to the norms when we increase Q. The numbers we need to be smooth are rnorm and anorm/Q. If we increase Q by a factor k, the area of the sieve region in the (a,b)-plane increases by k, so a and b increase by a factor sqrt(k). Hence rnorm goes up by a factor sqrt(k), and anorm/Q goes up by a factor sqrt(k)^8/k = k^3. This must be why yield drops off so much: for a sextic, anorm/Q would only increase by a factor k^2.

If we double a and b (going from I=16 to I=17), we increase anorm by a factor 256 and rnorm by a factor 2. The resulting increase in the product of the norms is equivalent to the increase in rnorm*anorm/Q when Q goes up by a factor 512^(1/3.5), which is around 6. So going up from I=16 to I=17 at the bottom of the range might well produce more extra relations than sieving further with I=16 at the top, given that maxQ will probably be > 6*minQ.

A possible reason for A=32 being inefficient is that it doesn't increase a and b by a factor sqrt(2); it just increases b by a factor 2. If a and b both increased by a factor sqrt(2), then anorm would go up by a factor 16. However, doubling b alone sometimes increases anorm by substantially more, since one of the terms in the expression for anorm gets multiplied by 256! Essentially we're paying for the fact that the sieve region isn't square - yes, 2^16 x 2^16 sounds square, but it's morally 2^16 x 2^17 because we don't need to consider negative values of b.

Curtis, would it be possible to post your test-sieving relation files here? There are a couple of things I'd like to look at, in particular with regard to the choice of lims.
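The norm-growth argument above is easy to verify numerically. A short sketch (this just replays the arithmetic in the post for a degree-8 polynomial; it is not CADO code):

```python
# Numerical check of the norm-growth reasoning for an octic (degree 8).
# If Q scales by k, sieve area scales by k, so a and b scale by sqrt(k):
#   rnorm   ~ max(a,b)^1      -> grows by sqrt(k)
#   anorm/Q ~ max(a,b)^8 / Q  -> grows by sqrt(k)^8 / k = k^3
deg = 8
k = 2.0
anorm_over_q_factor = k**(deg / 2) / k      # k^3 = 8 for k = 2
# Doubling a and b (I=16 -> I=17) multiplies anorm by 2^8 = 256 and rnorm
# by 2, i.e. the product of norms by 512. Since Q -> k*Q multiplies
# rnorm*(anorm/Q) by sqrt(k)*k^3 = k^3.5, the I-bump "costs" the same as
# a Q factor of 512^(1/3.5).
product_bump = 2 * 2**deg                   # 512
equiv_q_factor = product_bump**(1 / 3.5)    # ~5.94, i.e. "around 6"
print(anorm_over_q_factor, equiv_q_factor)
```

This confirms the quoted figures: anorm/Q grows like k^3, and the I=16 to I=17 step is equivalent to raising Q by a factor of about 6.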
2020-12-23, 06:10   #846
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

2×3×19×41 Posts

The relations are too big to post to the forum, but I'll be happy to cat them together and email them to you. PM me your email and I'll get 'em out.

Also, I left lims off my suggested params - as mentioned earlier, I tested lim0 400M and lim1 900M, but charybdis observed that bigger may be better.
2020-12-23, 11:29   #847
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2·2,909 Posts

Quote:
 Originally Posted by VBCurtis Not sure why A=32 remains less than 1.4x that of I=16.
If you look at the J that is used, it often barely increases for even A and sieve-adjust 0 (I=65536, J=40287 seems to be one example). Sieve-adjust can help with that.

We could always do the smaller Q with a larger sieve region, which will reduce duplication.

