mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Old 2020-12-19, 09:46   #837
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

2×41×71 Posts

Quote:
Originally Posted by ryanp View Post
Thanks!

Almost certainly CADO'ing; it's my go-to now. Haven't used the GGNFS sievers for a while.
Would your max sieve size be I=16/A=31, or do your machines have the memory for more?
Old 2020-12-22, 02:15   #838
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

2·2,339 Posts

I ran some tests with the GGNFS 16f siever over a bunch of possible parameters, and narrowed the LP choice down to 33/35 or 33/36.

Then I ran two CADO instances, each using 12 threads for a single las process on a 12-core xeon. I'm using 2e9 rels wanted for 33/35, 2.65e9 for 33/36 (30-33% more for a one-sided LP increase). MFBs are 66 and 101/103, lims 400M and 900M.
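The 30-33% figure above can be sanity-checked against the two rels_wanted values (a quick sketch; the rule of thumb that a one-bit one-sided LP increase needs roughly a third more relations is from the post, not computed by CADO):

```python
# Ratio of relations wanted for the one-sided LP increase (lpb1 35 -> 36).
# The 2e9 and 2.65e9 figures are the ones quoted above.
rels_33_35 = 2.0e9
rels_33_36 = 2.65e9
increase = rels_33_36 / rels_33_35 - 1
print(f"{increase:.1%} more relations wanted")  # 32.5%, inside the quoted 30-33%
```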

I tested I=16 first. Yield at Q=100M is 2.84 (50kQ tested) for 33/35, 4.2ish (27kQ tested) for 33/36.

At Q=100M, 33/36 has an ETA of 86 core-years, while 33/35 has an ETA of 90 core-years (that's ETA multiplied by 6 cores, not 12 threads).

I am next moving to Q=500M to repeat the trial. After the first trial, I don't think the extra data management and possibly larger matrix are worth saving ~5% of sieve time with 33/36.

I should have Q=500M data Tuesday morning, with a yield estimate that should allow a prediction of the sieve range. I=16 looks like enough here, pending a measure of yield at higher Q.

Last fiddled with by VBCurtis on 2020-12-22 at 02:28
Old 2020-12-22, 04:49   #839
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

4678₁₀ Posts

Initial results at Q=500M have lousy yield: 33/35 is around 1.2, 33/36 1.8 or so.
ETA for 33/36 is about 17% less than 33/35.

I'll let this run until morning, then I'll test A=32 against I=16 with both on 33/36.
This job might be best with e.g. 100-300M at A=32 and 300M-end on I=16. Or, perhaps my lim's are too small? Memory use is right at 10GB according to 'top', so I could go 30-40% bigger?
Old 2020-12-22, 05:14   #840
charybdis
 
Apr 2020

11000001₂ Posts

Quote:
Originally Posted by VBCurtis View Post
Initial results at Q=500M have lousy yield: 33/35 is around 1.2, 33/36 1.8 or so.
ETA for 33/36 is about 17% less than 33/35.

I'll let this run until morning, then I'll test A=32 against I=16 with both on 33/36.
This job might be best with e.g. 100-300M at A=32 and 300M-end on I=16. Or, perhaps my lim's are too small? Memory use is right at 10GB according to 'top', so I could go 30-40% bigger?
Ouch, that's a big dropoff in yield. At that rate I'm not sure you'd ever get enough relations with those parameters. I reckon whatever params are eventually chosen, the sieve range will need to extend well beyond 1000M, so it would be worth test-sieving at very large Q.

The lims could well be too small. Maybe they should be even more skewed too? If we're contemplating large prime bounds as asymmetrical as 33/36, why shouldn't we consider lims like 500/2000M? (I think the CADO siever can't go beyond lim=2^31)
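A one-line check that the proposed lim1 stays under the ceiling mentioned above (a sketch; the 2^31 limit is as stated in the post):

```python
# The post suggests lim1 = 2000M and notes the CADO siever can't go
# beyond lim = 2^31; check that 2000M actually fits under that ceiling.
lim1 = 2_000_000_000
ceiling = 2**31  # 2147483648
print(lim1 < ceiling)  # True: 2000M leaves ~147M of headroom
```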

Last fiddled with by charybdis on 2020-12-22 at 05:16
Old 2020-12-22, 20:23   #841
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

5822₁₀ Posts

There are a couple of options for las that may be useful to help yield:

-bkthresh1 1e8 turns on two-level bucket sieving, which reduces memory usage a lot. There seems to be a speed/memory trade-off, although I am not sure that leaving it off is optimal speed-wise. For A=32, off uses ~16GB, 1e8 uses ~6GB, and 3.9e8 (it has to be below the lowest factor-base bound) uses ~3.7GB. 3.9e8 runs at about 2/3 the speed, while 1e8 is slightly faster than off. This could allow a higher A with the same memory. I think this parameter could do with being set differently for each side, although that isn't an option.

-sieve-adjust provides a few strategies for determining I and J based on A. The more dynamic options can provide more relations. Caution is required, though, as it can use more memory if it chooses a small I and large J. I think this is more likely to happen for very small special-q.
Old 2020-12-23, 00:27   #842
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

2×2,339 Posts

I took 33/35 to Q=1100M, yield is 0.75-0.8. ETA is also way up.

I changed 33/36 to Q=550M and A=32. ETA is about the same as the above run, but yield is NOT better: Q=500M on I=16 is 1.84, while Q=550M on A=32 is about 1.8 (only 10kQ measured so far).

So, I now wonder whether I should try A=30 and just expect Q to run from around 200M to 2000M. I'll get an answer to this from the 33/35 run shortly.

Haven't tried different lim's yet, because a new run takes ~7hrs to find the free relations.
Old 2020-12-23, 00:54   #843
charybdis
 
Apr 2020

193 Posts

Quote:
Originally Posted by VBCurtis View Post
Haven't tried different lim's yet, because a new run takes ~7hrs to find the free relations.
Oh, you're using the script rather than invoking las directly? There's no need for the free relations until postprocessing.
The factor base file does need to be made in advance, but you can make one that will be enough for all your runs. You need -lim to be at least as big as the largest lim1 you might want to test:

Code:
makefb -poly ../../../../cadojobs/octic/octic.poly -lim 2000000000 -maxbits 16 -out ../../../../cadojobs/octic/octic.roots1.gz -t 6
The command line for the siever will then look like:
Code:
las -poly ../../../../cadojobs/octic/octic.poly -I 16 -q0 500000000 -q1 500010000 -lim0 400000000 -lim1 1500000000 -lpb0 33 -lpb1 35 -mfb0 66 -mfb1 101 -fb1 ../../../../cadojobs/octic/octic.roots1.gz -out ../../../../cadojobs/octic/octic.500000000-500010000.gz -t 6 -stats-stderr
(add -lambda0, -ncurves0 etc if you like)
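A small driver along these lines could build the same las invocation for several Q points (a sketch, assuming las is on PATH and reusing the file names and parameters from the command above; adjust paths for your setup):

```python
# Build las test-sieve command lines at a few Q points, reusing the
# flags from charybdis's command above. File names (octic.poly and the
# factor-base file octic.roots1.gz) follow the post; paths are assumptions.

POLY = "octic.poly"
FB1 = "octic.roots1.gz"

def las_cmd(q0, qrange=10_000, threads=6):
    """las command for one 10kQ test-sieve window starting at q0."""
    return [
        "las", "-poly", POLY, "-I", "16",
        "-q0", str(q0), "-q1", str(q0 + qrange),
        "-lim0", "400000000", "-lim1", "1500000000",
        "-lpb0", "33", "-lpb1", "35",
        "-mfb0", "66", "-mfb1", "101",
        "-fb1", FB1,
        "-out", f"octic.{q0}-{q0 + qrange}.gz",
        "-t", str(threads), "-stats-stderr",
    ]

if __name__ == "__main__":
    for q0 in (100_000_000, 500_000_000, 1_100_000_000):
        # swap print for subprocess.run(las_cmd(q0), check=True) to actually sieve
        print(" ".join(las_cmd(q0)))
```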

Last fiddled with by charybdis on 2020-12-23 at 00:54
Old 2020-12-23, 04:34   #844
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

2·2,339 Posts

Correct, I am in fire-and-forget mode for CADO testing still.

I agree that freerels are unnecessary for testing, and was too lazy to invoke all the relevant flags for las. Now that you spell it out, I suppose next time I'll give direct test-sieving a try.

Meantime, I did test A=30, and yield at Q=1100M & 1200M was 1/sqrt(2) times the yield with I=16. Not sure why A=32 remains less than 1.4x that of I=16.
Memory use is 18.5GB/10.0GB/5.8GB for A=32/31/30 (I used I=16 rather than A=31, in case they're not identical under the variants henryzz mentioned).

A 25kQ test at A=32 gives yield within 3% of I=16's. So, I think our choice is:
I=16
lpb0=33
lpb1=36
mfb0=66
mfb1=104 (I tested 103, but we need yield!)
ncurves0 = 30?
ncurves1 = 20?
qmin = 150 or 200M? Lower Q's will produce more duplicate relations.
rels_wanted=2.8e9, as a guess

I'm guessing 1000M rels from Q=150-500M and 900M rels from 500-1100M; I didn't test Q above 1200M, but I'll stab at a yield of 1, so Q-max around 2000M.
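The arithmetic behind that Q-max guess can be written out (a sketch using the relation counts and yield assumed in the post):

```python
# Reconstruct the sieve-range estimate: sum the relations guessed for
# each Q segment, then extend at yield ~1 until rels_wanted is reached.
rels_wanted = 2.8e9
# (q_start, q_end, total relations guessed for the segment)
segments = [(150e6, 500e6, 1000e6), (500e6, 1100e6, 900e6)]
got = sum(rels for _, _, rels in segments)   # 1.9e9 from Q=150-1100M
remaining = rels_wanted - got                # 0.9e9 relations still needed
q_max = 1100e6 + remaining / 1.0             # at a stab-in-the-dark yield of 1
print(f"Q-max ~ {q_max / 1e6:.0f}M")         # Q-max ~ 2000M
```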

Good luck, Ryan!
Old 2020-12-23, 05:45   #845
charybdis
 
Apr 2020

301₈ Posts

Final(?) thoughts:

I think I=17 could be useful for the smallest Q. Reasoning: let's think about what happens to the norms when we increase Q. The numbers we need to be smooth are rnorm and anorm/Q. If we increase Q by a factor k, the area of the sieve region in the (a,b)-plane increases by k, so a and b increase by a factor sqrt(k). Hence rnorm goes up by a factor sqrt(k), and anorm/Q goes up by a factor sqrt(k)^8/k = k^3. This must be why yield drops off so much: for a sextic anorm/Q would only increase by a factor k^2.
If we double a and b (going from I=16 to I=17), we increase anorm by a factor 256 and rnorm by a factor 2. The resulting increase in the product of the norms is equivalent to the increase in rnorm*anorm/Q when Q goes up by a factor 512^(1/3.5) which is around 6. So going up from I=16 to I=17 at the bottom of the range might well produce more extra relations than sieving further with I=16 at the top, given that maxQ will probably be > 6*minQ.

A possible reason for A=32 being inefficient is that it doesn't increase a and b by a factor sqrt(2), it just increases b by a factor 2. If a and b both increased by a factor sqrt(2), then anorm would go up by a factor 16. However, doubling b alone sometimes increases anorm by substantially more, since one of the terms in the expression for anorm gets multiplied by 256! Essentially we're paying for the fact that the sieve region isn't square - yes, 2^16 x 2^16 sounds square, but it's morally 2^16 x 2^17 because we don't need to consider negative values of b.

Curtis, would it be possible to post your test-sieving relation files here? There are a couple of things I'd like to look at, in particular with regards to the choice of lims.
Old 2020-12-23, 06:10   #846
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

1246₁₆ Posts

The relations are too big to post to the forum, but I'll be happy to cat them together and email them to you. PM me your email and I'll get 'em out.

Also, I left lims off my suggested params: as mentioned earlier, I tested lim0=400M and lim1=900M, but charybdis observed bigger may be better.
Old 2020-12-23, 11:29   #847
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

2·41·71 Posts

Quote:
Originally Posted by VBCurtis View Post
Not sure why A=32 remains less than 1.4x that of I=16.
If you look at the J that is used, it often barely increases for even A with sieve-adjust 0 (I=65536, J=40287 seems to be one example). Sieve-adjust can help with that.

We could always do the smaller q with a larger sieve region, which will reduce duplication.