mersenneforum.org Best Practices

2020-01-16, 15:07   #1
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3×2,161 Posts

Best Practices

Ernst Mayer recently posted, in multiple threads, the idea of having best practices for a given application reachable within 2 clicks of the first post in a hardware- or software-specific how-to thread. But what are the best practices? This thread is for discussion of that question. There are practices that are generally applicable, and others that will be application specific, at least in the details.

2020-01-16, 15:08   #2
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3×2,161 Posts

A first draft of suggested general best practices

In general, what would constitute best practices for GIMPS effort? My draft proposal:

o Use the most efficient software for the task and hardware (example: gpuowl, not cllucas, for AMD gpu primality testing).
o Select the most efficient hardware for the task (examples: use an RTX20xx for TF; use a good cpu or Radeon VII for PRP or P-1; use gpus with relatively greater single precision performance compared to double precision for TF, and those with relatively greater double precision performance compared to single precision for PRP, LL or P-1 factoring. Don't use cpus for TF, since gpus are so much more effective at it.)
o Use a very recent version of the chosen software.
o Use the most effective settings for the given software (examples: PRP, not LL, for first tests; for optimal throughput, benchmark prime95/mprime for throughput versus number of cores/worker versus various fft lengths, analyze, and reconfigure when appropriate).
o Use judiciously chosen inputs, for reasonable run time and feasibility of completing the task accurately. A run that takes years is not only likely to expire before completion, it is unlikely to complete accurately unless it is protected by the GEC (Gerbicz error check).
o Always log the runs. Some applications have logging built in. Others will need tee or redirection.
o Tune the application for the specific software version and hardware involved and exponents being run.
o Run at least one double-check, and a memory test, to test the reliability of the hardware & software combination, before beginning production running.
o Regularly review the logs for errors, either manually or with an analysis tool.
o Repeat double-check or self-test and memory test at least annually. Hardware reliability changes over time.
o Retune if substantially changing the exponents being run, or when a new version of the software is deployed.
o Reserve assignments first.
o Don't poach the assignments of others.
o Select work types and assignments appropriately to the capabilities of the hardware and software, so that assignments complete in a reasonable amount of time. (In most cases that will be under two months.)
o Contribute at least about 1/5 of your primality testing effort as DC.
o Prioritize advancing the GIMPS wavefronts of TF, P-1, first primality testing, and double-checking. This is the most effective at advancing the state of knowledge about Mersenne primes.
o When P-1 factoring, use the full PrimeNet bounds given for the exponent at mersenne.ca when possible. (That is the most effective strategy while exponents are being both primality tested and verified, as they currently are.)

What else?

2020-01-16, 15:15   #3
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1953₁₆ Posts

Best apps to use

My opinion:
TF: mfaktc on NVIDIA, mfakto on AMD or Intel IGP (Mfactor or Factor5 only for exponents beyond the reach of gpu apps)
P-1: mprime/prime95 on cpu, Gpuowl V6.11 if it will run on the gpu, CUDAPm1 v0.20 on NVIDIA gpus within narrow limits if they can't run Gpuowl
primality testing: mprime, prime95, mlucas, gpuowl (CUDALucas v2.06 on NVIDIA gpus that can't run gpuowl, or when specifically running LL DC on NVIDIA gpu)

Last fiddled with by kriesel on 2020-01-16 at 16:13

2020-01-16, 18:36   #4
Uncwilly
6809 > 6502

"""""""""""""""""""
Aug 2003
101×103 Posts

10,529 Posts

Why not make a table? I have attached an idea. Across the top is the hardware type, down the side is the test type. The software packages available for each combination are named and linked to the MF thread. Bolded ones are the recommended use for that hardware. Greyed or strikethroughs are uses advised against. Plain are ok uses, but not the best. All of that can be done in a Code box.

2020-01-16, 20:55   #5
ewmayer
2ω=0

Sep 2002
República de California

3²×1,303 Posts

Quote:
 Originally Posted by Uncwilly Why not make a table? I have attached an idea. Across the top is the hardware type, down the side is the test type. The software available for that combination are named and linked to the MF thread. Bolded ones are the recommended use for that hardware. Greyed or strikethroughs are uses advised against. Plain are ok uses, but not the best. All of that can be done in a Code box.
Code box is problematic due to lack of clickable links - as Ken noted, I want users to be able to view Post #1 in some master thread (likely the one mentioned below), quickly see where to get code/instructions for their target platform and preferred worktype, and 1-click to get there. I had the idea of taking the table at top of Ken's current Mersenne prime hunting software PDF and inlining it, along with relevant links, in Post #1 of the thread housing it. Problem - Mike has disabled HTML-style table markups in the forum due to security concerns. So I suggest a "linearized table" consisting of worktype-based main categories, each of which is followed by a set of brief which-program-to-use-for-platform-X entries, each with an embedded link. Example:

Basic summary of currently relevant GIMPS clients: Once you have used the tables below to figure out which client to use for your desired worktype and platform, click the relevant client link:

1. Primality testing: Note that this includes the recently-introduced PRP (probable-prime) test type, which unlike the traditional rigorous-primality LL test permits a strong form of residue integrity checking, the so-called Gerbicz error check. PRP is preferable, when available in the relevant client(s), on typical consumer hardware lacking ECC memory and on fast-but-fault-prone hardware such as GPUs. Any user discovering a likely-prime via PRP testing which is confirmed by the standard subsequent LL-test verification runs will get the same discovery credit as an LL-test user would.

o x86 (Intel and AMD) CPUs: Prime95/mprime (current version: 29.8b6): George Woltman's famous Mersenne-prime-hunting program: Prime95 is the Windows client, mprime the Linux. Does primality testing (both the traditional LL test and the more-recently added PRP-test with Gerbicz error check), Trial Factoring (but use of GPU clients now recommended for that worktype), p-1 and ECM factoring.

(Users who wish to run on x86 in non-networked mode due to security concerns can use Mlucas, which needs no network connection, but is not as efficient on x86 as Prime95/mprime.)

o ARM-based and other non-x86 CPUs: Mlucas (current version: 19.0; dedicated subforum here): Ernst Mayer's program: Can be built under Windows via built-in Linux shell, but is *nix oriented. Supports both LL-testing and PRP-test with Gerbicz-check. Has optimized assembly code for 128-bit ARMv8 SIMD instructions and also for x86 SIMD (128-, 256- and 512-bit versions), but as noted Prime95/mprime are more efficient on the latter. Also supports a generic-C build mode for platforms lacking vector arithmetic support or ones with SIMD but not of the ARMv8/x86 variety.

o nVidia GPUs: [description-of/links-to CuLu and OpenCL-built GpuOwl]

o [Other clients folks may be using]

2. Trial Factoring
...

3. p-1 Factoring
...

We can probably omit a separate category for ECM factoring, since AFAIK Prime95 is the only GIMPS client supporting it. Or include it, with description-of/links-to Prime95 and GMP-ECM.
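
For readers unfamiliar with the Gerbicz error check mentioned under item 1, here is a minimal Python sketch of the idea. It is not the code used by Prime95, Mlucas or gpuowl. It assumes the usual base-3 PRP iteration u -> u^2 mod (2^p - 1); the function name gerbicz_run, the block size, and the corrupt_at fault-injection hook are made up for illustration.

```python
def gerbicz_run(p, iters, block, corrupt_at=None):
    """Square 3 repeatedly mod 2^p - 1 and run one Gerbicz-style check.

    Returns True if the final check passes, False if it detects an error.
    corrupt_at is a hypothetical test hook: it injects a fault into the
    residue right after that iteration number.
    """
    assert iters >= block and iters % block == 0
    n = (1 << p) - 1
    u0 = 3
    u = u0
    d_prev = d = u0              # d_0 = u_0
    for i in range(1, iters + 1):
        u = u * u % n            # one PRP squaring
        if i == corrupt_at:
            u = (u + 1) % n      # simulate a hardware error
        if i % block == 0:
            # d_t = d_(t-1) * u_(t*block): running product of checkpoints
            d_prev, d = d, d * u % n
    # For an error-free run, d_t == u_0 * d_(t-1)^(2^block) mod n.
    return d == u0 * pow(d_prev, 1 << block, n) % n
```

With p = 127, gerbicz_run(127, 64, 8) passes, while gerbicz_run(127, 64, 8, corrupt_at=20) fails the check. As I understand it, the real clients run such a check periodically and roll back to the last verified residue on a mismatch, rather than just reporting failure as this sketch does.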

2020-01-16, 21:33   #6
Uncwilly
6809 > 6502

"""""""""""""""""""
Aug 2003
101×103 Posts

10,529 Posts

Quote:
 Originally Posted by ewmayer Code box is problematic due to lack of clickable links
Horse hockey
Code:
		Hardware 1	Hardware 2	Hardware 3
TF		GPUOwl		None		None

Last fiddled with by Uncwilly on 2020-01-16 at 22:34 Reason: Tested a stricken URL and such

2020-01-16, 21:46   #7
ewmayer
2ω=0

Sep 2002
República de California

3²·1,303 Posts

Quote:
 Originally Posted by Uncwilly Horse hockey
Code:
		Hardware 1	Hardware 2	Hardware 3
Primality	google		Here		None
TF		GPUOwl		None		None
I stand corrected. :) I must've confused this forum with some other one I've posted to in the past. So we start with a 2-D table which fits in a single browser frame, and we can add the longer descriptions below that.
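
As a rough sketch of how such a 2-D table could be generated for a Code box, the Python below pads each column to a fixed width. The helper name render_table and the column headings are assumptions for illustration; the cell entries are condensed from Ken's "Best apps to use" post and would need review before real use.

```python
# Column headings are illustrative; cells are condensed from post #3.
COLUMNS = ["x86 CPU", "NVIDIA GPU", "AMD GPU"]
ROWS = [
    ("TF", ["-", "mfaktc", "mfakto"]),
    ("P-1", ["prime95/mprime", "gpuowl, else CUDAPm1", "gpuowl"]),
    ("Primality", ["prime95/mprime, Mlucas", "gpuowl, else CUDALucas", "gpuowl"]),
]

def render_table(columns, rows):
    """Return a fixed-width text table: hardware across, worktype down."""
    header = [""] + list(columns)
    body = [[label] + list(cells) for label, cells in rows]
    grid = [header] + body
    # Width of each column is the length of its longest cell.
    widths = [max(len(row[i]) for row in grid) for i in range(len(header))]
    lines = []
    for row in grid:
        padded = (cell.ljust(w) for cell, w in zip(row, widths))
        lines.append("  ".join(padded).rstrip())
    return "\n".join(lines)
```

Calling print(render_table(COLUMNS, ROWS)) gives text that can be pasted into a Code box; links and bolding would still have to be added by hand.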

Technical question: What is the difference between Horse hockey and Bull pucky? The preferred fodder, quality of the meat, what?

Last fiddled with by ewmayer on 2020-01-16 at 21:48

2020-01-16, 21:56   #8
Dylan14

"Dylan"
Mar 2017

2⁴·37 Posts

Quote:
 Originally Posted by kriesel My opinion: TF: mfaktc on NVIDIA, mfakto on AMD or Intel IGP (Mfactor or Factor5 only for exponents beyond the reach of gpu apps) P-1: mprime/prime95 on cpu, Gpuowl V6.11 if it will run on the gpu, CUDAPm1 v0.20 on NVIDIA gpus within narrow limits if they can't run Gpuowl primality testing: mprime, prime95, mlucas, gpuowl (CUDALucas v2.06 on NVIDIA gpus that can't run gpuowl, or when specifically running LL DC on NVIDIA gpu)
Shouldn't the bolded bit be CUDAPm1 v0.22, as that is the newest version of that software?

2020-01-16, 22:23   #9
ewmayer
2ω=0

Sep 2002
República de California

3²·1,303 Posts

Quote:
 Originally Posted by Dylan14 Shouldn't the bolded bit be CUDAPm1 v0.22, as that is the newest version of that software?
Whatever non-PDF-form table gets created will have to be continually updated ... we're discussing general-layout issues at present.

2020-01-17, 00:55   #10
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3×2,161 Posts

Quote:
 Originally Posted by Dylan14 Shouldn't the bolded bit be CUDAPm1 v0.22, as that is the newest version of that software?
In my opinion, no, because it introduced more severe issues than it resolved. As with drivers, and occasionally operating systems or automobiles, the newest is not always the best.

Last fiddled with by kriesel on 2020-01-17 at 00:56

2020-01-17, 05:32   #11
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1953₁₆ Posts

Quote:
 Originally Posted by Uncwilly Why not make a table? I have attached an idea.
Intel IGP / primality testing or P-1 probably = "NA". Current gpuowl PRP or P-1 has AMD and NVIDIA code paths, no Intel code path. There's certainly no CUDALucas to do LL or CUDAPm1 to do P-1 on an OpenCL non-CUDA device. And CUDALucas won't do PRP.
Result of attempting a Gpuowl 6.5 run on an Intel UHD630 IGP:
Code:
2020-01-16 19:16:58 Note: no config.txt file found
2020-01-16 19:16:58 config: -device 0 -fft +0 -carry long -use ORIG_X2
2020-01-16 19:16:58 87398387 FFT 5120K: Width 256x4, Height 64x4, Middle 10; 16.67 bits/word
2020-01-16 19:16:58 using long carry kernels
2020-01-16 19:16:58 OpenCL args "-DEXP=87398387u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DFRAC=12357831637820925542ul -DWEIGHT_STEP=0xa.0e81d99e13ac8p-3 -DIWEIGHT_STEP=0xc.ba55dbe3e5aep-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DINVWEIGHT_LIMIT=0xc.cccccccccccdp-29 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-01-16 19:17:32 OpenCL compilation in 33528 ms
2020-01-16 19:17:36 87398387.owl loaded: k 87000000, block 1000, res64 d2d69bc89926f0a4
2020-01-16 19:20:47 87398387 EE loaded: 87000000, blockSize 1000, c89b639632165de5 (expected d2d69bc89926f0a4)
2020-01-16 19:20:47 Exiting because "error on load"
2020-01-16 19:20:47 Bye
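
A log like the one above is the sort of thing a small analysis script can screen. The Python below is only a sketch: the helper name find_error_lines is made up, and the rule it applies (flag a line whose message contains a standalone "EE" token or the word "error") is an assumption based on this one excerpt, not a documented gpuowl log format.

```python
import re

# Timestamp at line start, e.g. "2020-01-16 19:20:47 ", then the message.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (.*)$")

def find_error_lines(log_text):
    """Return (timestamp, message) pairs for lines that look like errors."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines without a leading timestamp
        stamp, msg = m.groups()
        # Assumed error markers: a standalone "EE" token or the word "error".
        if " EE " in f" {msg} " or "error" in msg.lower():
            hits.append((stamp, msg))
    return hits
```

Run on the excerpt above, this should flag the "EE loaded" line and the "error on load" exit line, and nothing else.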
TF boundary between mfaktx and Mfactor/Factor5 is not 1B (10⁹), it's 2³². The mersenne.org/mersenne.ca boundary is 10⁹. DC versus first-test primality is irrelevant to selecting software. Amazon, Google GCE, Colab, and Kaggle all typically involve Linux apps. I'm unpersuaded that each environment needs its own column to repeat the same app names, any more than the various flavors of Windows or distros of Linux do.
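
To make the two TF boundaries concrete, here is a small Python sketch. The function name tf_recommendation is hypothetical, and treating both limits as strict "less than" cutoffs is my assumption; the app names and the two limits are the ones given in this thread.

```python
GPU_TF_LIMIT = 2**32     # reach of mfaktc/mfakto, per the post above
SITE_LIMIT = 10**9       # mersenne.org / mersenne.ca boundary, per the post above

def tf_recommendation(exponent, gpu_vendor):
    """Return (app, site) for trial factoring a given exponent.

    Assumes exponents below each limit fall on the lower side of it.
    """
    vendor = gpu_vendor.lower()
    if exponent >= GPU_TF_LIMIT:
        app = "Mfactor or Factor5"   # beyond the reach of the gpu apps
    elif vendor == "nvidia":
        app = "mfaktc"
    elif vendor in ("amd", "intel"):
        app = "mfakto"
    else:
        raise ValueError(f"unknown gpu vendor: {gpu_vendor}")
    site = "mersenne.org" if exponent < SITE_LIMIT else "mersenne.ca"
    return app, site
```

For example, tf_recommendation(87398387, "NVIDIA") returns ("mfaktc", "mersenne.org"), while an exponent between 10⁹ and 2³² still gets a gpu app but is tracked at mersenne.ca.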

Last fiddled with by kriesel on 2020-01-17 at 05:46


All times are UTC.