New To GIMPS? Start Here! > Information & Answers

2018-04-24, 17:08   #23   VBCurtis (Riverside, CA):

Originally Posted by chris2be8:
I *think* that a multithreaded siever could share the factor base among all threads instead of each process having its own copy. Whether that cuts memory use enough is another matter.

This is indeed what CADO does. A single 4-threaded instance of the I=17 siever will use the same memory as 1-threaded; I haven't tried more than 4 threads.

I had in mind that large-memory folks would use I=17 on the smallish Q values, while others would use I=16 over a much larger range (say, Q=2G to Q=12G). Since this is SNFS and a poly is obvious, I may test yields this weekend for the sake of this thread and my curiosity.
2018-04-25, 09:42   #24   fivemack (Cambridge, England):

Because I am a curious person:

With 33-bit large primes (3 on the rational side), the 16e siever, and the polynomial 8x^7-1, the yield is 29 relations in 2031 seconds for Q=4e8 .. 4e8+1e3.

Which is not zero, but is obviously not enough; suggests that a few thousand CPU-years might actually suffice.

(the septic is better than the sextic here, though the rational side is still generally a good deal larger than the algebraic; I haven't tried degree-8 algebraic)

I will leave further investigation to VBCurtis this weekend.

Last fiddled with by fivemack on 2018-04-25 at 09:42
2018-04-26, 06:22   #25   VBCurtis:

I got curious, too, so I tested the same poly as fivemack:
16f siever, prime Q only, lambda = 3.8, alim=rlim=536M, 39-bit LP:
mfbr/a = 112 (3 large primes on each side)
Q=400M: 4062 relations, 1.64 sec/rel
Q=4000M: 1757 relations, 2.15 sec/rel
At an average 2.5 sec/relation, 40G relations would take 100 gigathreadseconds, or roughly 3100 CPU-years. Sieve performance at large Q is unknown in CADO, and a factor of two or three is fairly likely. Still, 10k CPU-years for sieving should be a generous estimate.
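To spell out that estimate, here is a quick sketch; the 2.5 sec/rel average and the 40G-relation target are the post's assumptions, not measured constants:

```python
# Rough sieving-cost estimate from the figures in the post above.
SECONDS_PER_RELATION = 2.5        # assumed average over the whole Q range
RELATIONS_NEEDED = 40e9           # raw relations the job is guessed to need
SECONDS_PER_CPU_YEAR = 3600 * 24 * 365

total_seconds = SECONDS_PER_RELATION * RELATIONS_NEEDED   # 1e11 thread-seconds
cpu_years = total_seconds / SECONDS_PER_CPU_YEAR
print(f"{total_seconds:.1e} thread-seconds ~ {cpu_years:.0f} CPU-years")
# ~3100-3200 CPU-years, matching the post's "roughly 3100"
```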

Changing mfba to 78 and alambda to 2.8 cut time to 1.0 sec/rel, but yield was 2000 or so at Q=400M.

Over the 16f siever's Q-range of 50M to 4.3G (I believe Q is limited to 32 bits), an average yield that may be close to 3.0 means we can get roughly 12e9 raw relations, while 40e9 are likely needed. CADO has been used for Q up to 10G (RSA-704 was factored with I=15!).
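The relation-supply arithmetic, as a sketch; the 3.0 average yield and the Q-range endpoints are the post's estimates:

```python
# Relation supply over the 16f siever's usable Q range, per the post.
Q_MIN, Q_MAX = 50e6, 4.3e9   # 16f Q-range (Q limited to 32 bits)
AVG_YIELD = 3.0              # relations per special-Q, the post's guess
NEEDED = 40e9

supply = (Q_MAX - Q_MIN) * AVG_YIELD   # ~1.3e10, the "roughly 12e9" above
print(f"supply ~ {supply:.2e} relations, shortfall ~ {NEEDED - supply:.2e}")
```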

I'll test some more mfbr/mfba and alim/rlim settings this weekend, and report back.
If anyone has advice for test-sieving with the CADO siever, please let me know!

Edit to add: 16f memory use was around 1.7GB for one thread.

Last fiddled with by VBCurtis on 2018-04-26 at 06:39
2018-04-26, 06:55   #26   henryzz (Cambridge, GMT/BST):

It sounds to me like it would be necessary to use CADO for this factorization, with either I > 16 or very high Q.
Did the max Q only increase from 2^31 to 2^32 between lasieve4 and lasieve5? Is it higher when composite Q are considered?
2018-04-26, 18:22   #27   chris2be8:

Does 16f support -J 16? And if it does, how much memory does it use and how much does it raise yield?

2018-04-27, 05:09   #28   VBCurtis:

I tried that in past experiments, and -J 16 increases yield 40%, as expected. However, it is quite unstable; something like 25% of the test-sieves crashed (dQ=1000 per test). One candidate had no crashes, another had many, so it's possible that it is stable with certain polys.

This poly did not crash in a 30-min test, and memory use was the same as 16f without the -J 16 flag. So, it's worth testing further. sec/rel for the short time I tested was slightly better than regular 16f.

Edit: Naturally, I post after one test, and the next test with -J 16 crashes immediately.

Last fiddled with by VBCurtis on 2018-04-27 at 05:15
2018-05-06, 03:04   #29   VBCurtis:

I did some more test-sieving, enough to figure out that lambda isn't quite what I thought it was: it seems lambda caps the cofactor at that multiple of the lim size (in bits), rather than being a multiple of the LP bound as I'd thought.

As evidence, I tested 39LP with mfb of 112, 113, and 114, all with lambda = 3.8; all were within 1% of each other in time, and relation counts were essentially identical (114 gave 4063 rels, 112 gave 4062). When I bumped lambda to 4.0 with mfb=114, time per relation doubled while yield improved by 9-10%.
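If lambda really caps the cofactor at lambda times the lim size in bits, the observations above follow directly: at lambda = 3.8 the cap sits below all three mfb values tried, so mfb never binds, while lambda = 4.0 pushes the cap above mfb=114. A quick check, as a sketch using the alim = rlim = 536M from the earlier test:

```python
import math

LIM = 536e6                       # alim = rlim from the earlier test
lim_bits = math.log2(LIM)         # ~29 bits

for lam, mfb in [(3.8, 112), (3.8, 113), (3.8, 114), (4.0, 114)]:
    lambda_cap = lam * lim_bits   # cofactor bit cap implied by lambda
    binding = "lambda" if lambda_cap < mfb else "mfb"
    print(f"lambda={lam}, mfb={mfb}: cap {lambda_cap:.1f} bits, {binding} binds")
# lambda=3.8 caps cofactors near 110 bits, below all three mfb values,
# which would explain why mfb=112/113/114 behaved identically.
```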

I also tried 4LP on the r side, by setting mfbr to 125 and lambda to 4.2; yield increased a fair bit, but at 10 seconds per relation (cough) vs 2.5-3 for most of my other tests. I also tried mfbr at 135 or so, which was too high for the siever (a single special-Q took over 30 minutes, with about 60 relations in the .dat when I killed the process, and lots of errors about cofactors).

rlim=800M, alim=1072M, 39LP, lambda 3.9 on both sides, mfbr=113, mfba=112 produced 2.52 sec/rel and 5103 relations for a 1kQ test at 400M. This is 25% better yield at 50% more sec/rel than my original estimate a few posts ago. I ran out of curiosity before testing a range of Q values.
2018-05-14, 01:09   #30   kosta:

Octics are not supported in CADO.
2021-08-28, 17:43   #31   firejuggler:

I don't know much about gpu-ecm, but could this help the ECM phase?
2021-08-28, 18:14   #32   Batalov:

Irrelevant fodder moved to "I am experimenting with heuristic sieves."
2021-08-28, 20:52   #33   SethTro:

Originally Posted by firejuggler:
I don't know much about gpu-ecm, but could this help the ECM phase?
Because VBCurtis was curious and having fun I also had some fun.

CGBN-based ECM is possibly helpful, but it only reduces overall ECM time by 1.5-2x, so there are still many GPU/CPU years left for a t80.

Using the advice from Paul's params.html, a t80 is 265K curves @ B1=25e9. Assuming we want to do 30% (50% of a t80 minus the 20% already done), that's 80K curves.

Using a version tuned for 1536 bits, my GPU (a 1080 Ti) processes 1792 curves @ B1=1e5 in 17.3 seconds.
Tuned for 1312 bits: 16.4 seconds.
Tuned for 1408 bits: 16.0 seconds.
Tuned for 1280 bits: 12.0 seconds. (CGBN seems to need 4 carry bits, but maybe I could squeeze it down to 2.)

Scaling is very linear so I project CGBN based ECM could theoretically finish the 80K curves in 80000/1792 * 25e9/1e5 * 12 seconds = 4.24 1080ti GPU-years!
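The stage-1 projection, spelled out as a sketch; the batch size, batch time, and curve count are the figures quoted in this post for the 1280-bit kernel:

```python
# SethTro's stage-1 GPU projection, step by step.
CURVES_NEEDED = 80_000       # ~30% of a t80, per the post
BATCH = 1792                 # curves per GPU batch (1280-bit tuned kernel)
BATCH_SECONDS = 12.0         # time per batch at the tested B1
B1_TARGET, B1_TESTED = 25e9, 1e5   # cost scales ~linearly in B1

gpu_seconds = CURVES_NEEDED / BATCH * (B1_TARGET / B1_TESTED) * BATCH_SECONDS
gpu_years = gpu_seconds / (3600 * 24 * 365)
print(f"~{gpu_years:.2f} 1080 Ti GPU-years for stage 1")
```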

Stage 2 on my system uses B2=2.5e15. I timed B2=2.5e12 at 362 seconds, and stage 2 time grows about 4.2x per 10x larger B2, so an estimated 26788 seconds per curve. Naively this is 80,000 * 26788 seconds = 68 CPU core-years.

This is a 16:1 CPU/GPU ratio, so it's conceivable that a single machine could finish 30% of a t80 in 4 years (or <2 years with a top-end current-gen GPU).

I heard there's a 7% chance of a factor; 30% of 7% is a 2.1% chance of factoring. At ~400 watts * 4 years = $1600 in electricity, that's an amortized ~$75,000 per factor found.
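And the amortized-cost arithmetic, as a sketch; the $1600 electricity figure and the 7% factor probability are this post's assumptions:

```python
# Expected cost per factor found, from the post's figures.
ELECTRICITY_COST = 1600.0   # ~400 W * 4 years at the poster's assumed rates
P_FACTOR = 0.07 * 0.30      # 7% chance for a full t80, doing 30% of it

expected_cost = ELECTRICITY_COST / P_FACTOR   # ~$76K, the "~$75,000" above
print(f"P(factor) = {P_FACTOR:.1%}, ~${expected_cost:,.0f} per factor found")
```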

Last fiddled with by SethTro on 2021-08-28 at 20:53

