mersenneforum.org > YAFU three large primes
 Register FAQ Search Today's Posts Mark Forums Read

 2020-05-22, 04:37 #34 LaurV Romulan Interpreter     Jun 2011 Thailand 100010000111102 Posts Wow! Ben, that is freaking wonderful! And you kept it for yourself for such a long time! Now I will have to upgrade yafu... (grrr... still using old versions here and there, you know, if it works, don't fix it...) (unfortunately, since I discovered pari, I did most of my "numerology" in it, so the loops and screws came somehow too late, but be sure I will give it a try, I still have some ideas in a far side corner of the brain, which I never had the time/knowledge/balls to follow...) Last fiddled with by LaurV on 2020-05-22 at 04:39
2020-05-22, 16:24   #35
bsquared

"Ben"
Feb 2007

1100110110102 Posts

Quote:
 Originally Posted by LaurV Wow! Ben, that is freaking wonderful! And you kept it for yourself for such a long time! Now I will have to upgrade yafu... (grrr... still using old versions here and there, you know, if it works, don't fix it...) (unfortunately, since I discovered pari, I did most of my "numerology" in it, so the loops and screws came somehow too late, but be sure I will give it a try, I still have some ideas in a far side corner of the brain, which I never had the time/knowledge/balls to follow...)
You'll probably have to get the latest wip-branch SVN, so put it somewhere that it won't bother existing installs. I have had some success getting it to work, but chances are it won't work well But if you give it a shot, let me know how it breaks :)

Last fiddled with by bsquared on 2020-05-22 at 16:24

 2020-05-27, 18:48 #36 bsquared     "Ben" Feb 2007 CDA16 Posts I've been spending a little time with the TLP variation again... integrating jasonp's batch factoring code and investigating parameters. The batch factoring code provides a huge speedup for TLP, easily twice as fast as before at C110 sizes. The crossover with DLP quadratic sieve now appears to be around C110, although I'm still not convinced I have found optimal parameters for TLP. There are a lot of influential parameters. So as of now, TLP is still slower than regular DLP in sizes of interest to the quadratic sieve (C95-C100). Code: C110 48178889479314834847826896738914354061668125063983964035428538278448985505047157633738779051249185304620494013 80 threads of cascade-lake based xeon: DLP: 1642 seconds for sieving TLP: 1578 seconds for sieving I've also revisited some of the core sieving code for modern instruction sets (AVX512). Inputs above C75 or so are about 10-20% faster now (tested mostly on cascade-lake xeon system).
2020-07-12, 16:06   #37
Till

"Tilman Neumann"
Jan 2016
Germany

2·32·23 Posts

Quote:
 Originally Posted by bsquared I found this thesis comparing Kleinjung's algorithm with traditional SIQS: https://prism.ucalgary.ca/bitstream/...=2&isAllowed=y The re-explanation of the algorithm is very helpful. Lots of comparisons are presented on up to 480-bit discriminants. One thing it appears to be lacking, however, is a comparison of a well-tuned traditional siqs with a well-tuned Kleinjung approach. Of course, traditional siqs will be very slow in comparison with Kleinjung when the parameters are the same (e.g., block size 2048). And vice versa (Kleinjung does not do well with large sieve area). I am most curious to see how they compare when each are tuned to their respective strengths. Still, it is a very nice thesis report. I can see better now that Kleinjung's algorithm is compatible with the rest of siqs factorization (e.g., polynomial generation, filtering and matrix steps), which might ease integration of the new approach with existing implementations.
Instead of comparing a well-tuned Kleinjung SIQS to a well-tuned polynomial-major sieve, I did an intermediate step:
Compare a polynomial-major SIQS to a prime-major SIQS. (both single-threaded)

The conversion of my old (polynomial-major) SIQS to a prime-major SIQS took two steps:
1. Compute the solution arrays for all (a, b)-polynomials of one a-parameter outside the sieve.
2. In the sieve, have outer loop over primes, inner loop over polynomials.

At first, performance was very bad. I had to switch two parameters to improve that:
1. Increase the sieve array size.
2. Reduce qCount = the number of q's whose product gaves the a-parameter.

These measures improved performance a lot, but the result is still bad.
Here is a comparison to the polynomial-major sieve:

Code:

200 bit: Poly-mj      4s, 789ms, Prime-mj      11s, 224ms, ratio = 2.34370
210 bit: Poly-mj      8s, 22ms,  Prime-mj      21s, 269ms, ratio = 2.65133
220 bit: Poly-mj     19s, 609ms, Prime-mj      55s, 826ms, ratio = 2.84695
230 bit: Poly-mj     40s, 232ms, Prime-mj  2m, 12s, 960ms, ratio = 3.30483
240 bit: Poly-mj     55s, 305ms, Prime-mj  3m, 12s, 622ms, ratio = 3.48290, Prime-mj sieveArraySize = 119552
250 bit: Poly-mj 2m, 15s, 526ms, Prime-mj  8m, 11s, 63ms,  ratio = 3.62338, Prime-mj sieveArraySize = 159744
260 bit: Poly-mj 4m, 49s, 561ms, Prime-mj 43m, 30s, 551ms, ratio = 9.01554, Prime-mj sieveArraySize = 206848
All timings of the prime-major SIQS were taken with qCount=6; using qCount=7 at 260 bit took 47m, 17s, 109ms.

For smaller N, the performance gap stems mostly from additional array accesses and another loop for the largest primes that is not necessary in the polynomial-major version.
The performance break-in at 260 bit might point out too many L3 cache misses.

Kleinjungs SIQS might remedy many of the problems of a silly prime-major sieve. But maybe we can see here what the challenge is.

-----------------------------------------------

@Jason: It seems that Colin Percivals stuff has never been published? The only thing I found were some comments in another forum...

Last fiddled with by Till on 2020-07-12 at 16:24

 2020-07-13, 15:52 #38 jasonp Tribal Bullet     Oct 2004 352910 Posts It was never published. When I asked him about it in ~2007 he said he was too busy with his startup company to work on it.

 Similar Threads Thread Thread Starter Forum Replies Last Post Trilo Riesel Prime Search 3 2013-08-20 00:32 jasonp Msieve 24 2010-06-01 19:14 jasonp Factoring 4 2007-12-04 18:32 fivemack Factoring 18 2007-05-10 12:14 Prime Monster Lounge 34 2004-06-10 18:12

All times are UTC. The time now is 11:18.

Fri Sep 18 11:18:37 UTC 2020 up 8 days, 8:29, 0 users, load averages: 2.02, 1.81, 1.71