Old 2020-05-27, 18:48   #36
bsquared
Feb 2007

23·419 Posts

I've been spending a little time on the TLP (triple large prime) variation again: integrating jasonp's batch factoring code and investigating parameters. The batch factoring code provides a huge speedup for TLP, easily making it twice as fast as before at C110 sizes. The crossover with the DLP (double large prime) quadratic sieve now appears to be around C110, although I'm still not convinced I have found optimal parameters for TLP; there are a lot of influential parameters. So as of now, TLP is still slower than regular DLP at the sizes of interest to the quadratic sieve (C95-C100).

C110 48178889479314834847826896738914354061668125063983964035428538278448985505047157633738779051249185304620494013
80 threads on a Cascade Lake based Xeon:
DLP: 1642 seconds for sieving
TLP: 1578 seconds for sieving
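
To put the two timings side by side, here is a quick back-of-the-envelope sketch in Python. The variable names are just for illustration (they are not from any sieving code), and the numbers cover the sieving stage only, as reported above.

```python
# Sieving times reported above, in seconds (C110 input, 80 threads).
# Variable names are illustrative only.
dlp_seconds = 1642
tlp_seconds = 1578

saved = dlp_seconds - tlp_seconds            # seconds TLP saves over DLP
speedup = dlp_seconds / tlp_seconds          # > 1 means TLP is faster
percent_saved = 100.0 * saved / dlp_seconds  # saving relative to DLP time

print(f"TLP saves {saved} s ({percent_saved:.1f}% of DLP time), "
      f"speedup x{speedup:.3f}")
```

The gap is only a few percent, which fits with C110 being roughly the crossover point rather than a clear win for either variant.
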
I've also revisited some of the core sieving code to take advantage of modern instruction sets (AVX512). Inputs above C75 or so are about 10-20% faster now (tested mostly on a Cascade Lake Xeon system).