20200107, 01:15  #1 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3·7·317 Posts 
Off topic
This thread is for posts that I make that don't fit the GIMPS reference character of most threads in this blog.
This is not a discussion thread. Don't post here. Comments posted in this thread may be moved or removed without notice or recourse. Post in the discussion thread at https://www.mersenneforum.org/showthread.php?t=23383
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 20210116 at 16:33 Reason: Added the beast 
20200107, 01:23  #2 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3×7×317 Posts 
About me
I'm not really comfortable with putting info about myself under "hall of fame"; seems presumptuous at best. Not sure where else to put this, so here it sits, at least for now.
It's titled simply "About me", because "who is this guy kriesel anyway, and why does he think he knows anything about GIMPS or Mersennes or anything related to them, after becoming a mersenne forum member only since March 2017 while GIMPS has been around since early 1996" is way too long. Among my skills are great attention to detail and persistence, technical writing, programming & testing, system and network management, organization of data, and being misunderstood and confusing or annoying people without intending to. I'm also pretty good with a chainsaw and other tools. Although my formal education was in mechanical engineering, with some skills in electrical engineering, programming, etc, I was a floater. After my few years in an automotive research startup, where I was the only degreed engineer, instrumentation engineer, electronics designer, programmer, and also did some HR and safety, I switched employers and led computerization of engineering analysis, drafting, and machining, and was also IT manager for a small department, and was a member of the UWMadison Network Advisory Group. I've done real time motion control programming in macro assembler, of the 6.4 meter / 21foot x travel 4axis (xyz & rotate) robot I designed, for lightsout 24/7 remote OSintegrated optical disk jukebox operation, and other technically demanding and sometimes safetycritical work. (Imagine a submm 3kw electron beam aimed at someone's head, with less than 1mm of metal, 2mm of water, and 3mm of plastic between them. Inside the radiation shielding the xray flux was lethal dose at about a minute. Or designing and producing space rated hardware with zero chance of snagging or puncturing an astronaut's suit. Or designing extreme precision mechanisms for operation for decades in ultra high vacuum where no oil or grease or fingerprint is allowed. Or creating a package of hardware and documentation and tools to be shipped to the South Pole for installation there by a physics professor. Two sets of documentation; one for him to read on the planes and lose, and one in the crate with the hardware.) As someone raised on a dairy farm by Depressionera parents, and trained as an engineer, waste and ignorance are the enemy, the war is lifelong. Knowledge is good, clarity is good, accuracy is good, efficiency is good, more of each is better; iterate. As far as Mersennes go, it all started when I read of one of Slowinski's discoveries in a science magazine. (You know, those things people made out of flat pieces of ground up dead trees coated with white clay and marked with various colored inks, and.sold at retail stores or through subscriptions via the mail, actual delivery by the USPS) In retrospect, the article was woeful, not mentioning the Lucas Lehmer test or any other number theory term. "TF79LL86GIMPS96gpu17" means I started with trial factoring in 1979 on a programmable calculator with <50 program step limit so all odds < sqrt(Mp) were trial factors, later wrote my own simple slow integer based combined LL and TF with smallprime sieving of candidate factors software "ll" beginning around 1986 in Fortran and later c, joined GIMPS as soon as I heard of it in 1996, provided Woltman a list of residues produced from "ll" as data he requested for confirming prime95's accuracy, & expanded into gpu computing in 2017. At the peak, I was as I recall ranked #2 in GIMPS for assignments completed and #3 for computing time, back when such levels were easier to reach because there were fewer participants. Some of my current GIMPS activities, beyond production TF, P1, LL, PRP, and DC at the GIMPS wavefronts:
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 20210829 at 20:25 
20200107, 01:39  #3 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3×7×317 Posts 
Personal bests in GIMPS
The reference info accumulated in this blog seems like an accomplishment, and has reached book size.
Producer rankings highest ranking reached in top producers: #3 by P90years #2 by exponents LL tests completed and reported (that was a very long time ago, 1997?, well before gpu usage was begun in GIMPS, or the PRP test adopted, and with the aid of a university small department's PC fleet in addition to my own, and when there were far fewer GIMPS participants than there are now) More recently, from https://www.mersenne.org/report_top_...ate=&end_date= & similar: All work types #6 Nov 28 2020 Trial factoring #4 Nov 7 2020 P1 factoring #1 Feb 21 2021 All first primality tests #7 April 16 2022 LL first primality test #24 or better (Feb 15 2021) PRP first primality test #7 April 16 2022; #7 May 10 2022 All doublechecks #6 Dec 23 2020 LL double check #10 Feb 9 2020 PRP double check #2 May 10 2022 TF highest TF bit level completed in p<10^{9}: 86, https://www.mersenne.org/report_expo...0000029&full=1 highest GHzD TF bit level attempt completed in p<10^{9}: 7914.88, https://www.mersenne.org/report_expo...0000029&full=1 (higher in project OBD; M3321928319 from 2^{91} to 2^{92} for example, 150,963 GhzD) largest factor found by TF: factor 9661162819438172378751929, exponent 887184833, k 444842190758108, 83 bits, 953. GhzD https://www.mersenne.org/report_expo...exp_hi=&full=1 P1 largest factors found by P1: factor 1850779449499999281951841008388233436343137 for M104336179 (140.409 bits), k value 8869308169220952982914205667758992 = 2^{4} * 3 * 31 * 53 * 71 * 79 * 1327 * 2017 * 2659 * 4211 * 215483 * 3104789 (113.357 bits) factor 9355626768004649694886231056436430053831 for M104351167 (132.781 bits); k value 44827609680707498435960141473245 = 3 * 5 * 23 * 227 * 5393 * 223037 * 227663 * 324743 * 6436667 (105.144 bits) Found 20210612 on a Radeon VII GPU running Gpuowl V6.11380 during a lucky run of 25 exponents yielding 5 factored, with mersenne.ca GPU72 recommended bounds B1=650,000, B2=24,000,000. exponent 101837677 factor 517050261390286174198141026167350661719 (128.604 bits) exponent 102535079 factor 408813325950827664880164653670091063297 (128.265 bits) with k value 1 993529 092374 462718 657312 653312 = 2^{16} × 3^{3} × 13 × 31 × 137 × 9689 × 18367 × 328381 × 349187. Found 20210206 14:42:16 UTC on a Radeon VII gpu running gpuowl V6.11380 on firsttest wavefront exponents with default 1M, 30M bounds. exponent 101273659 factor 375324496211112358228921438201039041097 (128.141 bits) exponent M102590167 factor 288065752848991018613790448294738221721 (127.760 bits) with k value 1403963758285874603429539442580 2^{2} × 3^{2} × 5 × 13 × 13999 × 235661 × 349133 × 594641 × 876011 exponent M103781683 factor 548489556611245305222319636879155313 (118.723 bits) exponent 102429427 factor 241863833036789833487894403779180959 (117.542 bits) with k value 1180636464151995273232829877 = 3 × 11 × 13 × 241 × 6,323 × 372,473 × 751,901 × 6,448,567 exponent 97875719, larger factor 6248386517510450606520265730011457, 112.3 bits, k value 31 920003 149659 879415 650912 = 2^{5} × 233 × 503 × 1427 × 2591 × 120907 × 19 039091 https://www.mersenne.org/report_expo...7875719&full=1 Most of these are still small compared to what it takes to get onto the top 100 largest P1 factors found, at ~134.4 bits: https://www.mersenne.ca/pm1user/1 or top500 at ~121.3 bits. The current #1 in that list is a 173.2 bit factor. highest k value for a factor found, see preceding entry, M104336179 highest exponent P1 factored: exponent 901000031, factor 66276073654639633298220609, k value 36779173903624223 = 37 × 7538693 × 131857303 https://www.mersenne.org/report_expo...1000031&full=1 (with prime95 v29.8b6 on an i78750H Windows 10 laptop, in 57 days) highest exponent P1 completed both stages to GPU72 labeled bounds on mersenne.ca https://www.mersenne.ca/exponent/1000000007 (On Radeon VII GPU, GPU time ~91.4 hours, scattered over total elapsed time ~7.5 days, with reduced power for power efficiency) https://www.mersenne.org/report_expo...9999937&full=1 (On Tesla P100, Google Colab, several sessions) Primality testing highest exponent primality triple check LL: 332194529 (Fan Ming's DC matched) highest exponent primality double checks LL: 200000069 my highest LL DC that was confirmed by someone else's TC: 145500007 PRP: 187046537 332897017 sort of a DC; residue types not matched, but both returned composite My highest first test PRP that has been dc verified: 187046537 By someone else: 121596589 highest exponent primality firsttest: LL 402143717 PRP with proof, with Cert by someone else: 843112609 (Thanks, George!) PRP with proof, and successful cert 999299999 PRP without proof, awaiting PRP DC 852348659 But still, number of Mersenne primes discovered: 0; number of competitivespeed GIMPS programs I've written: 0. Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 20220510 at 15:20 Reason: highest rankings & PRPs completed updated 
20200107, 01:44  #4 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1A01_{16} Posts 
Landau's fourth problem
Are there an infinite number of primes of the form x^{2}+1 for integer x? How seductively simple a question that seems.
Posed in 1912 as one of Landau's 4 problems, it remains unanswered still. Per https://en.wikipedia.org/wiki/Landau%27s_problem, several bounds have been placed around it. Iwaniec: at most two prime factors; Ankeny: x^{2}+y^{2} with y=O(log x) rather than y=1; Deshouillers and Iwaniec: greatest prime factor at least x^{1.2} rather than x^{2}; Conversely, the Brun sieve establishes an upper limit on the density of such primes, such that ALMOST all numbers of the form n^{2}+1 are composite. Since odd x squared is odd, x^{2}+1 for odd x>1 is composite. A few low values: x l=x^{2}+1 1 2 2 5 3 10 even 4 17 6 37 8 65 composite (semiprime) 10 101 prime 12 145 semiprime 14 197 A list of the first 10,000 such primes is at https://oeis.org/A002496/b002496.txt, which includes a lot of references also. A few others are http://nntdm.net/papers/nntdm22/NNTDM2247377.pdf http://mathworld.wolfram.com/LandausProblems.html https://www.mersenneforum.org/showthread.php?t=21600 http://devalco.de/quadr_Sieb_x%5E2+1.php (with a few short programs) Wolf searched for up to 10^{20} in 2008; https://arxiv.org/pdf/0803.1456v1.pdf revised 2009 https://arxiv.org/pdf/0803.1456v2.pdf revised 2010 https://arxiv.org/abs/0803.1456 Wolf and Gerbicz computed a table up to 10^{25} in 2010. Grantham followed with up to 6.25x10^{28} in 2016. His effort is described at https://drive.google.com/file/d/0B_r...1UcXdQUEU/view and makes use of sieving. Per Grantham (slide 13) "If x^{2} + 1 is divisible by an (odd) prime, that prime must be 1 mod 4. We want to sieve by these primes." That list is in turn sieved by primes up to sqrt(x). THAT list is sieved by sieve of Eratosthenes. Triple level sieving. So to get to 10^{28}, sieving would be no higher than 10^{14}, 10^{7}, and 10^{3.5}. A sieve bit map of 4k+1 covering 10^{14} is 10^{14}/4 = 2.5 10^{13} bits, 3.125 10^{12} bytes, rather large compared to common system memory or disk space. So the sieving is probably done to a limited extent or in segments. Grantham does the computation in parallel distributing the large sieve to multiple processors. The known counts of such primes to powers of ten up to 10^{28} are listed at https://oeis.org/A083844 I wrote a simple, directprimality testing program (no presieving) in perl. It uses singleword math through 10^{14} and then switches to necessary but slower multiword math, and becomes computationally expensive around x^{2}+1 = 10^{20}. It produces matching counts of primes to that level, so it's not proven to be incorrect, and may even be correct, albeit slow and inefficient. Let j be any natural number, out to positive infinity. Considering each 10^{(j1)} < x^{2}+1 < 10^{j} 1 decade range as a separate interval, the interval size is 0.9 x10^{j}, the density of primes of form x^{2}+1 is ~0.55 10^{j} and the count for the interval is ~0.495 10^{j} / j. If this holds up the number line, the sum of interval counts is infinite. That's a big if, which no one's been able to prove or disprove. If however, the number of such primes is finite, there could be only a finite number of decade intervals containing such primes. Excluding a decade as empty is much less computationally intensive than counting the primes in the decade; finding just one such prime suffices. For the total number of near square primes to be finite, it is not enough to show a single 10fold empty interval, or even many of them. It requires that the nonempty intervals be finite in number, and that requires the empty intervals be infinitely many. A modified version of my little program to search for possibly empty decades has been run up through x=10^{2000} and found such a prime in each decade, so no empty decade in any tried in x^{2}+1 up to <10^{3999}. Again, it becomes computationally expensive, limiting it to around x=10^{2000}. Faster hardware and faster software could move that a bit, but the run time scaling is VERY steep. Like so many other things, this topic too is apparently not immune to a bit of crankery. https://www.quora.com/SinceLandaus...ngbyathread https://www.scienceforums.net/topic/...nsfornewton/ https://math.stackexchange.com/quest...roblemcorrect Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 20200418 at 13:47 
20200107, 08:04  #5 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3·7·317 Posts 
Blogger blunder backup blackout blues
It started innocently enough; move the "About me" post from the hall of fame thread to the off topic thread. Well, threads are ordered by date when first posted, which put About me ahead of the intro/table of contents of the offtopic thread. Simple, copy the text, delete the post, add it to the end as a reply, right? Wrong. Deleting the first post deletes the thread.
Well, I thought, luckily Xyzzy backs up the forum. Contact him to have the thread restored from backup. He had seemingly dismissed my concerns about separate backup for the blog long ago, so his backups must be pretty good, right? If a server disk crashes and the whole forum needs to be restored, yes. But he tells me it does not support restores at the post or thread level (or presumably the blog level); only a full forum restore. So, off to look for usable copies of the lost content, through various systems' open browsers, drives, caches, Google cache, the internet archive (Wayback Machine), etc. Google cache only contained about one sentence of the lost content. The Internet Archive only captures the top level page of the mersenneforum.org site, nothing beneath. My local systems yielded nothing useful. I had previously "cleaned off duplicate content". Gone: https://www.mersenneforum.org/showthread.php?t=24735 Off topic https://www.mersenneforum.org/showpo...01&postcount=2 Landau's fourth problem
So, bloggers beware, backup elsewhere. Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 20200107 at 08:28 
20201004, 16:54  #6  
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3×7×317 Posts 
Fermat numbers
Fermat numbers, F_{n}=2^{2^n}+1, are prime for n=1 to 4. Fermat numbers five through 32 have been determined composite through finding factors or performing Pepin tests for primality. These numbers get large very fast. F33 is ~8 billion bits long and is beyond the current capability to Pepin test in reasonable time except perhaps on the faster supercomputers. https://en.wikipedia.org/wiki/Fermat_number
Ernst Mayer's well written article in Odroid is worth a read for general background. https://magazine.odroid.com/article/...ticalhistory/ For F33 size is ~2^{33} bits as a compact integer; primality testing involves repeated powers of 3 mod F33. https://en.wikipedia.org/wiki/P%C3%A9pin%27s_test Such huge numbers are processed using ffts. FFTs are usually most efficient when based on powers of 2, or other sizes that are quite smooth (having only very small factors), and no larger than necessary for accuracy. Lists of 7smooth possible sizes are here: https://www.mersenneforum.org/showpo...75&postcount=7 Recently gpuowl has been extended to use fft lengths that are 11 or 13smooth. If I've read the source correctly, Mlucas uses even 31smooth. Maximum usable bits/ DP word declines as fft length or exponent increase. For example, in gpuowl P1, a 24M Mersenne requires 1280K, for 18.31 bits/word, while a near 1G Mersenne requires 57344K fft length, 17.03 bits/word. CUDALucas, gpuowl, mprime, etc all need about 56M words or more for fft multiplication up to 10^{9} or 2^{30}. The number of usable bits/word declines slowly as operand size/fft length grows. More space is needed for accurate carry accumulation, or something like that. To estimate how large an fft length would be required for F33, consider that at a gigadigit exponent in gpuowl V6.584 which supported such large Mersennes, it was 16.5 bits/word, declining at about 0.7 bits/word per factor of ten on exponent. https://www.mersenneforum.org/showth...384#post546384, which extrapolates to ~16.2 bits/word at F33 size. Primality testing Per https://www.mersenne.ca/credit.php?f...30&worktype=LL the effort to primality test Fermat numbers is: F30 62610 GHzD F29 14175 F28 3206 This seems to be slightly steeper than the ~p^{2.1} rule I've seen in Mersenne primality tests. (That would give 58923 for F30 extrapolating from F28's figure. The above figures seem to be p^{2.14}.) Extrapolating, F33 would be about 8^{2.14} times longer than F30, or if the code existed, and the hardware resources were adequate, and there was not significant performance drop due to data size (less effective caching, or whatever). 8^{2.14} x 62610 = 86.3 x 62610 = 5.4E6 GhzD. There's a LucasLehmer test completed but not doublechecked for M999999937 using CUDALucas. It took ~six months on the fastest available gpu at the time, an NVIDIA GV100 and is listed as 47431 GhzD. See https://mersenneforum.org/showpost.p...54&postcount=8 Comparing these figures indicates Fermats take ~13% more effort than Mersennes of equivalent exponent. In the following I assume the effort is the same. I think the Pepin test for Fermats would take similar time per iteration and have similar iteration count to Mersenne PRP for operands of the same size. Gpuowl does not handle Fermat numbers. Gpuowl does not support the required fft length, which would be around 8G/(16.2 bits/word)=506 Mwords in DPfloats. The current max gpuowl fft length 120M handles a little over 2^{31}, and does not support gigadigit Mersennes, 3.32G exponent, much less 8G exponents. It would need to be extended to at least ~506M, if not 512M or more. Ernst's statement about Pepin testing F29 in 30 and 32M ffts confirms this estimated requirement. https://www.mersenneforum.org/showpo...7&postcount=48 Also F30 at 60 & 64M; 68 & 57 ms/iter https://www.mersenneforum.org/showpo...5&postcount=55 CUDALucas is a gpu implementation of the LucasLehmer test for Mersenne primes on NVIDIA gpus. It does not support a sufficiently large fft length for 2^{33}bit tests, Mersenne exponents larger than 2^{31} 1 in its signed integer exponent variable, or Jacobi check for the LL computation. It does not implement PRP, or GEC check, or the Pepin test for Fermat numbers. CUDALucas uses the CUDA fft library, so as I recall, the fft length is hard capped at 256M words, for 1d DP ffts, until NVIDIA decides to extend the library. Then it would require someone extend CUDALucas code to match. Or a top layer of Karatsuba could be used, doing 3 muls and some adding and subtracting to accomplish doublelength squaring and work around the current CUDA library limit. And in either case, implementing the Pepin test. Mlucas supports testing Fermat numbers. https://www.mersenneforum.org/mayer/README.html The largest Mersenne number testable with it (ignoring runtime concerns) is that with the largest unsigned 32bit prime exponent 4294967291, which is near F32. It also appears, based on a light reading of the V19 source code, to include support for 512M fft length and F33. And the extension to 512M fft is described here; large fft benchmarks on Knights Landing here. When Ernst Mayer tested F29 and F30, he made two runs and saved files from both at 10Miteration intervals. Duplicating runs may be avoidable by implementing Robert Gerbicz' error check for Pepin tests. https://www.mersenneforum.org/showthread.php?t=22510 Maximum fft length implemented in mprime/prime95 depends on cpu type; 32M for SSE2, 50M for FMA3, 64M for AVX512. These fall far short of what F33 would require. https://www.mersenneforum.org/showpo...74&postcount=8 A Radeon VII on Linux with ROCm driver and good overclockableto1200Mhz gpu memory can reach 510GhzD/day in gpuowl at 5M fft length. However, on Windows, I've observed considerable performance rolloff at much higher fft lengths (1448M). And Ernst Mayer has reported markedly lower performance at 8M fft length than at 5M on Linux. Assuming that extended precision for the exponent and other effects provide ~half the performance of 5M fft, 5.4E6 GhzD/ (510/2 GhzD/day) =~ 21190 days =~ 58 years on Radeon VII. From the rolloff I've seen, and the considerable further fft length extension needed, that may be optimistic. A very rough estimate of the gpu ram required is >12GB; maybe 16GB on a Radeon VII is enough. https://www.mersenneforum.org/showpo...93&postcount=7 The Knights Landing 64core system on which F29 was partly tested was based on KNL Xeon Phi, a manycore Intel cpu design, discontinued in 2018. KNL's stated peak DP performance spec was almost comparable to that of a Radeon VII. But its memory bandwidth is ~2.5 times slower. https://en.wikipedia.org/wiki/Xeon_Phi The 32core AVX512 on which F30 verification was planned to be concluded gave a completion time that is consistent with a 2 year entire run for F30, so for F33, centuries. https://www.mersenneforum.org/showpo...1&postcount=75 Estimating save file size by Mersenne gpuowl scaling, F33 would be about 1GB, manageable. https://www.mersenneforum.org/showpo...7&postcount=22 https://www.mersenneforum.org/showpo...3&postcount=52 in fact says 64MB each for F29. That would scale to 1GB for F33. Saving every 10M iteration checkpoint of 2 parallel runs as Ernst did for F30 would add up. 1GB x 2 x 2^{33}/10^{7} = 1.72TB, manageable. Factoring Before tackling such a monster test, a lot of factoring is probably in order. MMFF is a possibility for TF. So is Mfactor. http://www.fermatsearch.org/index.html lists ecm software. http://www.fermatsearch.org/factors/programs.php lists several varieties. http://www.fermatsearch.org/download.php has links but to older versions (prime95 v28.10 for example, while current is v30.3) Not sure which would support such a large number; prime95 would not because of fft length. F33 is 2^{2^33}+1, or in k,b,n,c form for worktodo files, 1,2,8589934592,1 A worktodo line would be ECM2=1,2,8589934592,1,B1,B2,curves_to_run[,"commaseparatedlistofknownfactors"] Pminus1=1,2,8589934592,1,B1,B2[,"commaseparatedlistofknownfactors"] PRP=1,2,8589934592,1[,how_far_factored,tests_saved][,prp_base,residue_type][,"commaseparatedlistofknownfactors"] After I select bounds rather arbitrarily, for time estimating on a 4core AVX512 capable i51035G1 processor, prime95 V30.3b6 interprets these F33 related worktodo lines as involving p=2: ECM2=N/A,1,2,8589934592,1,250000,25000000,1 Pminus1=N/A,1,2,8589934592,1,250000,25000000 PRP=N/A,1,2,8589934592,1,100,0 Trying F32, it interprets these also as involving p=2; ECM2=N/A,1,2,4294967296,1,250000,25000000,1 Pminus1=N/A,1,2,4294967296,1,250000,25000000 PRP=N/A,1,2,4294967296,1,100,0 Trying F31, crashes prime95 when test, status selected. Retrying them individually, ECM2=N/A,1,2,2147483648,1,250000,25000000,1 this line crashes prime95 Pminus1=N/A,1,2,2147483648,1,250000,25000000 1.5 years estimated PRP=N/A,1,2,2147483648,1,100,0 this line crashes prime95 Trying F30: p=1073741824, on i51035G1, which supports 64M AVX512 ffts and ~1.168G max Mersenne exponent, prime95 V30.3b6 yields these estimates: ECM2=N/A,1,2,1073741824,1,250000,25000000,1 13 days for one curve, Pminus1=N/A,1,2,1073741824,1,250000,25000000 4 days, PRP=N/A,1,2,1073741824,1,100,0 1 year 2 weeks. Extrapolation of those run times to a hypothetical capability not yet included would be 8^{2.14} = 85.6 times longer, or 3 years/ecm curve, 11 months for P1, 90 years for PRP test. The PRP time estimate could be taken as a rough estimate for Mlucas run time on the same hardware. As I recall Ernst Mayer indicated on the Mlucas readme page that Mlucas performance was comparable to mprime on AVX512. GP2 on factoring attempts on F29 vs F33: "it takes about 120GB of memory and about a week to do a single ECM curve for F29." on one core; later clarified to 118 hours (~5 days). https://mersenneforum.org/showpost.p...28&postcount=2 Others comment they've run multicore on 32GB ram in less time. Mersenne number P1 run time scaling runs on Gpuowl V6.113xx and Radeon VII show number of buffers declining such that, extrapolating from 400M to 1G, it appears stage 2 would drop below 1 buffer and fail, by 5G exponent. The Radeon Pro VII won't help here, because while it's faster at DP, it has the same amount of gpu ram. See the Radeon VII attachment at https://www.mersenneforum.org/showpo...5&postcount=17 Estimated run time is, by extrapolation from 500M and 1G, 0.95 and 4.02 days respectively for both stages, c p^{2.08}. Rough extrapolated F33 P1 time is then 4.02 days x 8.59^{2.08} = 352 days = 11.6 months = 0.965 years. Existing P1 code I'm aware of (before Gpuowl v7.x) has essentially no error checking. It would need to be rewritten substantially to include a considerable amount of error detection and recovery for such a long run. Estimated stage 2 save file size is ~4 GB. In Gpuowl V7, P1 factoring is combined with PRP computations in stage 1, lowering the combined cost for Mersenne numbers https://mersenneforum.org/showpost.p...0&postcount=19, and I think that means that GEC on the PRP may protect its computation somewhat. (If nothing else, unreliable hardware could be detected by a GEC error in the PRP computations.) A Jacobi check is done at start or load and at 1M iteration intervals of P1 stage 1. https://mersenneforum.org/showpost.p...2&postcount=82 Preda describes stage 2 error resistance in https://mersenneforum.org/showpost.p...7&postcount=98 In https://mersenneforum.org/showpost.p...6&postcount=31 he seems to say that enough gpu memory for at least 24 buffers is required. Its stage 2 save file size is tiny. Here's an example for 957156667 stage 2 in progress: Code:
OWL P2 2 957156667 8310000 200235109 Quote:
Ernst writes about doing F33 P1 stage 1 twice with different systems as a reliability check, before using the stage 1 residue for parallel stage 2 runs on distinct B2 ranges. Stage 1 parallel runs would probably also use differing fft lengths as an error check. It will be interesting to see how many of the error checks used in Gpuowl v7.x P1 factoring on Mersenne numbers (PRP/GEC interim results multiplied together; Jacobi symbol check), or from prime95/mprime, or otherwise (post 9, post 10) are productively adapted to P1 factoring or Pepin testing of Fermat numbers. Numerous questions are posted in the PRP for F33 thread at https://www.mersenneforum.org/showpo...5&postcount=13. No pertinent responses yet as of 20210421. Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 20210421 at 23:10 Reason: minor edit /update of last date 

20210116, 16:31  #7 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1101000000001_{2} Posts 
What is this beast called?
I looked for something related to this by starting with the OEIS for Mersenne exponents for known primes https://oeis.org/A000043, but didn't find anything that seemed related to this exploding sequence.
(If this is already a known sequence, please PM me with a link, and I'll add it and adapt terminology here to match.) Consider the sequence M* (i) such that for element i>0, it is recursion i times of M*i= 2^p1 where p=M* (i1). That is, a recursive expression for computing the position in the list of Mersenne primes. It doesn't seem of any practical use, since we have already found all 3 terms i>0 possible to find. Let the seed i=0, M*0 = 2. Then for i=1 M*1 = Mp2 = 3 (the first Mersenne prime has exponent 2 and value 3) i=2 M*2 = Mp3 = 31 (the third Mersenne prime has exponent 5 and value 31) i=3 M*3 = Mp31 = 2^{216091 }1 is the thirty first in the ordered list of Mersenne primes i=4 M*4 = Mp(2^{216091} 1) = ? The fourth element is the Mp31'th Mersenne prime, sure to be never found by humans, and sure to be enormous if it exists. This sequence's terms grow far faster than even Catalan's C(n+1) = 2^{Cn}  1. Finding M*4 requires finding over 216,000 additional Mersenne primes. Assuming the conjecture about Mersenne primes' distribution is correct, M*4 is uncomputable, since we'll run out of matter on which to perform and store the computation and run out of lifetime of the universe to test the immense number of increasingly huge composites or primes. Estimating the size of M*4, by Mn+1 conjectured as on average having exponent 1.47576 times Mn, and it seems reasonable to expect a LOT of averaging in over 216,000 intervals; M*4 =~ Mp31 * 2^(1.47576*(2^{216091} 1  31)) = ? =~10^{65049.87} * 10 ^ (.30103*1.47576 * 10^{65049.87}  1.47576*9.33*.30103) =~10^{65049.87} * 10 ^ ( 10^{65049.52} 4.14) =~10^ ( 10^{65049.52} + 65049.87 4.14 ) =~10^ ( 10^{65049.52} + 65045.73 ) =~10^ ( 10^{65049.52} ) * 10^{65045.73 } =~10^ ( 10^{10^4.813242} ) * 10^{10^4.813219} The larger term in the product above is large compared to numbers approximating a googolplex, 10^10^{100}, which are unstorable in integer form. For comparison, the estimated number of subatomic particles in the known universe is about 10^{80}, or 10^{89} including photons; no more than about 10^10^{1.95}. So storage of work in progress becomes a problem. Very early, as a percentage of completion of identifying M*4. Execution time is also a problem, due both to distances across the storage volume (c l = t =~ 1/frequency), and to the enormous number of iterations required for primality testing of enormous exponents. M*4 is far larger than Catalan's C4 or C5: C4 = M127 = 170141183460469231731687303715884105727 ~1.7x10^{38} (prime) C5 > 10^{51217599719369681875006054625051616349} ~10^10^{37.7} (is C5 prime ?) M*4 is prime by definition if it exists, since it is defined as an element in an ordered list of Mersenne primes. https://primes.utm.edu/notes/faq/NextMersenne.html https://primes.utm.edu/mersenne/ Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1 Last fiddled with by kriesel on 20210118 at 14:11 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Off topic  board games on the web  robert44444uk  Lounge  3  20190305 08:51 
Side Topic: 'It's Not a Toomah'  R.D. Silverman  Math  68  20151115 04:21 
Topic of Peepholes Friendship :)  coffee1054  Lounge  7  20120217 03:38 
OffTopic: Spurious IRQ Interrupt?  moo  Hardware  4  20050326 19:38 
AMD vs. Intel topic  Pjetro  Hardware  11  20021104 21:00 