mersenneforum.org  

Go Back   mersenneforum.org > New To GIMPS? Start Here! > Information & Answers

Reply
 
Thread Tools
Old 2018-04-24, 17:08   #23
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

33×5×37 Posts
Default

Quote:
Originally Posted by chris2be8 View Post
I *think* that a multithreaded siever could share the factor base among all threads instead of each process having it's own copy. Whether that cuts memory use enough is another matter.

Chris
This is indeed what CADO does. A single 4-threaded instance of the I=17 siever will use the same memory as 1-threaded; I haven't tried more than 4 threads.

I had in mind that large-memory folks would use I=17 on the smallish Q values, while others would use I = 16 over a much larger range (say, Q=2G to Q=12G). Since this is SNFS and a poly is obvious, I may test yields this weekend for the sake of this thread and my curiosity.
VBCurtis is offline   Reply With Quote
Old 2018-04-25, 09:42   #24
fivemack
(loop (#_fork))
 
fivemack's Avatar
 
Feb 2006
Cambridge, England

3×19×113 Posts
Default

Because I am a curious person:

With 33-bit large primes (3 on rational side), 16e siever, and the polynomial 8x^7-1, the yield is 29 relations in 2031 seconds for Q=4e8 .. 4e8+1e3

Which is not zero, but is obviously not enough; suggests that a few thousand CPU-years might actually suffice.

(the septic is better than the sextic here, though the rational side is still generally a good deal larger than the algebraic; I haven't tried degree-8 algebraic)

I will leave further investigation to VBCurtis this weekend

Last fiddled with by fivemack on 2018-04-25 at 09:42
fivemack is offline   Reply With Quote
Old 2018-04-26, 06:22   #25
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

33·5·37 Posts
Default

I got curious, too, so I tested the same poly as fivemack:
16f siever, prime Q only, lambda = 3.8, alim=rlim=536M, 39-bit LP:
mfbr/a = 112 (3 large primes on each side)
Q=400M 4062 relations 1.64 sec/rel
Q=4000M 1757 relations 2.15 sec/rel
At an average 2.5 sec/relation, 40G relations would take 100 gigathreadseconds, or roughly 3100 CPU-years. Sieve performance at large Q is unknown in CADO, and a factor of two or three is fairly likely. Still, 10k CPU-years for sieving should be a generous estimate.

Changing mfba to 78 and alambda to 2.8 cut time to 1.0 sec/rel, but yield was 2000 or so at Q=400M.

Over the 16f siever's Q-range of 50M to 4.3G (I believe Q is limited to 32 bits), an average yield that may be close to 3.0 means we can get roughly 12e9 raw relations, while 40e9 are likely needed. CADO has been used for Q up to 10G (see https://eprint.iacr.org/2012/369.pdf, where RSA-704 was factored with I=15!!!!).

I'll test some more mbfr/mfba and alim/rlim settings this weekend, and report back.
If anyone has advice for test-sieving with the CADO siever, please let me know!

Edit to add: 16f memory use was around 1.7GB for one thread.

Last fiddled with by VBCurtis on 2018-04-26 at 06:39
VBCurtis is offline   Reply With Quote
Old 2018-04-26, 06:55   #26
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

3·1,973 Posts
Default

It sounds to me like it would be necessary to use CADO for this factorization with either I >16 or very high Qs.
Did the max q only increase from 2^31 to 2^32 between lasieve4 and lasieve5? Is it higher when composite q are considered?
henryzz is offline   Reply With Quote
Old 2018-04-26, 18:22   #27
chris2be8
 
chris2be8's Avatar
 
Sep 2009

217610 Posts
Default

Does 16f support -j 16 ? And if it does how much memory does it use and how much does it raise yield?

Chris
chris2be8 is offline   Reply With Quote
Old 2018-04-27, 05:09   #28
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

33×5×37 Posts
Default

I tried that in past experiments, and -J 16 increases yield 40%, as expected. However, it is quite unstable; something like 25% of the test-sieves crashed (dQ=1000 per test). One candidate had no crashes, another had many, so it's possible that it is stable with certain polys.

This poly did not crash in a 30-min test, and memory use was the same as 16f without the -J 16 flag. So, it's worth testing further. sec/rel for the short time I tested was slightly better than regular 16f.

Edit: Naturally, I post after one test, and the next test with -J 16 crashes immediately.

Last fiddled with by VBCurtis on 2018-04-27 at 05:15
VBCurtis is offline   Reply With Quote
Old 2018-05-06, 03:04   #29
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

33·5·37 Posts
Default

I did some more test-sieving, enough to figure out that lambda isn't quite what I thought it was: it seems that lambda is the multiple of the lim (in bits) that a cofactor is allowed to be, rather than a multiple of the LP bound as I thought it was.

As evidence, I tested 39LP with MFB of 112, 113, 114 all with lambda = 3.8, and all were within 1% of each other for time and relation count was identical (114 was 4063 rels, 112 was 4062). When I bumped lambda to 4.0 with mfb=114, time per relation doubled while yield improved by 9-10%.

I also tried 4LP on the r side, by setting mfbr to 125 and lambda to 4.2; yield increased a fair bit, but at 10 seconds per relation (cough) vs 2.5-3 for most of my other tests. I also tried mfbr at 135 or so, too high for the siever (a single special Q took over 30 minutes, with about 60 relations in the .dat when I killed the process, and lots of errors about cofactors).

rlim=800m, alim=1072M, 39LP, lambda 3.9 on both sides, mfbr=113, mfba = 112 produced 2.52 sec/rel and 5103 relations for a 1kQ test at 400M. This is 25% better yield in 50% more sec/rel than my original estimate a few posts ago. I ran out of curiosity before testing a range of Q values.
VBCurtis is offline   Reply With Quote
Old 2018-05-14, 01:09   #30
kosta
 
Jan 2013

3816 Posts
Default

Octic is not supported in cado.
kosta is offline   Reply With Quote
Old 2021-08-28, 17:43   #31
firejuggler
 
firejuggler's Avatar
 
Apr 2010
Over the rainbow

11×241 Posts
Default

I don't know much about gpu-ecm but does this https://www.mersenneforum.org/showthread.php?t=27103
could help the ecm phase?
firejuggler is online now   Reply With Quote
Old 2021-08-28, 18:14   #32
Batalov
 
Batalov's Avatar
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

5×1,913 Posts
Default

Irrelevant fodder moved to "I am experimenting with heuristic sieves."
Batalov is offline   Reply With Quote
Old 2021-08-28, 20:52   #33
SethTro
 
SethTro's Avatar
 
"Seth"
Apr 2019

22·7·13 Posts
Default

Quote:
Originally Posted by firejuggler View Post
I don't know much about gpu-ecm but does this https://www.mersenneforum.org/showthread.php?t=27103
could help the ecm phase?
Because VBCurtis was curious and having fun I also had some fun.

CGBN based ECM is possible helpful but only reduces overall ecm time by 1.5-2x so still many GPU/CPUs years left for t80.

Using the advice from paul's params.html t80 is 265K curves @ B1=25e9 for t80. Assuming we want to do 30% (50% of a t80 - 20% already done). that's 80K curves.

Using a tuned version for 1536 BITS my GPU (1080ti). processes 1792 curves @ B1=1e5 in 17.3 seconds.
Using a tuned version for 1312 BITS takes 16.4 seconds.
Using a tuned version for 1408 BITS takes 16.0 seconds.
Using a tuned version for 1280 BITS takes 12.0 seconds. (CGBN seems to need 4 carry bits, but maybe I could squeeze it down to 2)

Scaling is very linear so I project CGBN based ECM could theoretically finish the 80K curves in 80000/1792 * 25e9/1e5 * 12 seconds = 4.24 1080ti GPU-years!

Stage 2 on my system uses B2=2.5e15, I timed 2e5e12 at 362 seconds and used 4.2x per 10 larger B2, so an estimated 26788 seconds. Naively this is 80,000 * 26788 = 68 CPU core years.

This is a 16 CPU/GPU ratio so it's conceivable that a single machine could finish 30% of t80 in 4 years (or <2 years with a top end current gen GPU).

I heard 7% chance of a factor. 30% of 7% is 2.1% chance of factoring. At ~400 watts * 4 years = $1600 or an amortized $75,000 to factor.

Last fiddled with by SethTro on 2021-08-28 at 20:53
SethTro is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Do you know this method to factorize? Godzilla Miscellaneous Math 28 2017-10-31 18:14
mathematica7.0 can easily factorize 10^67+1111 aaa120 Factoring 14 2008-12-07 13:14

All times are UTC. The time now is 02:00.


Thu Oct 21 02:00:39 UTC 2021 up 89 days, 20:29, 1 user, load averages: 1.02, 1.19, 1.26

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.