20221011, 17:41  #78  
Sep 2016
2×5×37 Posts 
Quote:
I think this is well into the region where NTTs are gonna win simply based on memory usage. And a poweroftwo convolution length does not require any irrational base shenanigans. 

20221011, 22:23  #79 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5^{3}·59 Posts 
Not sure what the minimim feasible timing is for current hardware under US$20k system cost.
Or how a block of time using a supercomputer might change things, or what that would cost. But here are some datapoints for disparate affordable hardware, to place an upper bound on feasible run time. Ernst probably has a better idea of what's currently possible. Maybe Drkirkby's dual28core system can do better than any of the CPU data below. Mlucas.cfg excerpt for dual Xeon E52697V2, in ubuntu atop WSL, system cost me ~$1.5K years ago: Code:
491520 msec/iter = 1452.54 ROE[avg,max] = [0.248991718, 0.281250000] radices = 960 16 16 32 32 0 0 0 0 0 524288 msec/iter = 1597.16 ROE[avg,max] = [0.248288750, 0.312500000] radices = 1024 16 16 32 32 0 0 0 0 0 #above were done with 24 cores, 0:23 If scaling by Geekbench 3 64 bit multicore can be trusted, e52697v2 52150 vs e52698v4 73848 implies ~308. years on a dual e52698V4. https://cpubenchmark.org/cpu/intelxeone52697v2/ https://cpubenchmark.org/cpu/intelxeone52698v4/ See also end of https://www.mersenneforum.org/showpo...6&postcount=14 for timing on a tenth generation Intel laptop, ~$600, a millenium plus to run. Comparing via CPUID CPUZ benchmark, an AMD Ryzen 5950x might be around 120 years. From https://www.mersenneforum.org/mayer/README.html, a Xeon Phi 7250 can do ~552.5msec/iter at 512M, extrapolating to ~150.5 years for F33 Pepin test. (At one time that would have been a $5K system.) In 2017, at https://mersenneforum.org/showpost.p...&postcount=178 Ernst wrote of a 32core Xeon that he estimated could do a gigadigit mersenne primality test in ~20 years. F33 is considerably longer; ~2.5^2.1 ~6.85 times as long, 137. years. 8core Ryzen: ~58 years for gigadigit Mersenne; extrapolates to ~397. years for F33. https://mersenneforum.org/showpost.p...&postcount=182 Looking a bit beyond the current state of the art: To attempt F33 with gpuowl, it would need to be extended to handle >32 bit exponents, add 512M ffts, and implement the Pepin test. Looking at it the other way around, to get F33 PRP run time under 10 years, would require an iteration time < 10 * 365 * 24 * 3600 / 2^33 ~ 0.0367 seconds; 36.7msec. 
20221011, 23:31  #80 
Sep 2016
2·5·37 Posts 
The best implementation I have needs about ~0.65 seconds to do a ~2^33bit multiply convolution on my 7950X with memory @ 4400.
Squaring would certainly be faster, though I don't have benchmarks for that. Either way, still way too slow. Not an applestoapples comparison anyway since it can't do exactly 2^33bit convolution length, but it can probably be modified to match it exactly with no overhead to avoid irrational base weights. And yes, it is 100% memorybound. 
20221011, 23:42  #81 
Sep 2016
2×5×37 Posts 
If we want to look at theoreticals, a 2^33bit convolution using the optimal NTT method will need ~4GB of ram to hold the transform data.
Assuming standard 2pass approach will full passmerging on both ends, each iteration will require reading and writing 8 GB of memory. Zen4 with 6000 memory seems to be around 70 GB copy bandwidth. So we can put a lowerbound of ~120ms using a hypothetical memoryoptimal implementation. (which goes to show how far off from optimal my own implementation still is) Last fiddled with by Mysticial on 20221011 at 23:47 
20221011, 23:54  #82 
If I May
"Chris Halsall"
Sep 2002
Barbados
2B54_{16} Posts 

20221012, 00:04  #83 
Sep 2016
370_{10} Posts 
Actually I lied. It's not 4GB to hold the transform data, it's just 2GB. I forgot the factor of two from the wraparound.
So it's read+write 4GB of memory per iteration, or about ~60ms lowerbound on a Zen4 box. 
20221012, 00:46  #84 
If I May
"Chris Halsall"
Sep 2002
Barbados
10101101010100_{2} Posts 

20221013, 14:23  #85  
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5^{3}×59 Posts 
dept. of corrections
Oops. (Case of mistaken identity.)
Quote:


20221014, 17:13  #86 
"Daniel Jackson"
May 2011
14285714285714285714
2F1_{16} Posts 
In other words, it would require a large cluster of PCs to get this factored in a reasonable time, say <10 years.

20221014, 21:41  #87 
Tribal Bullet
Oct 2004
DE3_{16} Posts 
Ancient history trying P1 on F31/F33 (start at post #13)

20221205, 10:52  #88 
Oct 2016
3^{2} Posts 
A C code for educational purposes
Hey there,
I've written a C code that uses GMP to check if there is any factor less than sqrt of a Mersenne number. It's using trial division and couldn't finish checking 2^1271 in two days so it is SLOW! :) But maybe someone else needs a similar thing or can find it useful for his/her needs. You can run it after compile like ./jumper 47 and it will check 2^471 I've named it "jumper", because it's jumping 2*p after founds a good point to start. Any comment/commit is welcome, C is not a language that I've experienced so maybe there could be improvements. https://github.com/tanaydin/jumper/ Last fiddled with by tanaydin on 20221205 at 10:53 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Do you know this method to factorize?  Godzilla  Miscellaneous Math  28  20171031 18:14 
mathematica7.0 can easily factorize 10^67+1111  aaa120  Factoring  14  20081207 13:14 