#1
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2×229 Posts
I changed the OS on my Dell 7920 from CentOS 7.8 to Ubuntu 20.04 LTS today. That allowed me to build gpuowl, which was looking impossible on CentOS since glibc was too old.
The setup here is:

* Dell 7920 tower workstation. (I say tower, as there's a rackmount version of the Dell 7920 too.)
* 2 × Intel Xeon Platinum 8167M CPUs (non-standard OEM units, 26 cores, 2.0 GHz).
* 384 GB RAM, not well configured due to what I believe is a motherboard fault: on one CPU, not all 6 memory channels are used, despite having 6 DIMMs. (Two memory channels have two DIMMs each.)
* Nvidia Quadro P2200 graphics card, which is driving my monitor (only 1920 × 1200 @ 60 Hz). The card has 1280 CUDA cores and 5120 MB RAM. (I attached a screenshot showing some information reported by the Nvidia tool in Ubuntu.)

I run gpuowl on the Nvidia graphics card, on a PRP test of 103750501. I chose that exponent as I have it allocated to run on the main CPUs, so I thought I would compare the two. The estimated times are:

mprime:
Code:
[Worker #1 May 11 18:57] Iteration: 100000 / 103750501 [0.09%], ms/iter: 1.560, ETA: 44:55:09

gpuowl:
Code:
2021-05-11 19:27:55 Quadro P2200-0 103750501 OK 200000 0.19% 604684b5784bb06d 10196 us/it + check 4.44s + save 0.33s; ETA 12d 05:16

For what it is worth, the Xeons cost me 300 GBP ($425 USD at today's exchange rate), which is exactly what the Nvidia Quadro P2200 cost me. Both the graphics card and the CPUs were used. Here's how the test was run with gpuowl. I've not got any configuration file or anything - I need to work out how to use the program.
Code:
drkirkby@jackdaw:~/GPU$ gpuowl -maxAlloc 4096M -prp 103750501
2021-05-11 18:54:09 GpuOwl VERSION
2021-05-11 18:54:09 GpuOwl VERSION
2021-05-11 18:54:09 Note: not found 'config.txt'
2021-05-11 18:54:09 config: -maxAlloc 4096M -prp 103750501
2021-05-11 18:54:09 device 0, unique id ''
2021-05-11 18:54:09 Quadro P2200-0 103750501 FFT: 5.50M 1K:11:256 (17.99 bpw)
2021-05-11 18:54:09 Quadro P2200-0 103750501 OpenCL args "-DEXP=103750501u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0.0070585858722830054 -DIWEIGHT_STEP=-0.007009111457174139 -DIWEIGHTS={0,-0.013969095270929188,-0.02774305491917008,-0.041324604812826758,-0.054716432742092071,-0.067921188951161587,-0.080941486662717207,-0.093779902595084258,} -DFWEIGHTS={0,0.014166995379082403,0.028534694516235748,0.043105940780673195,0.05788361782360639,0.072870650148920399,0.088070003691933282,0.10348468640635508,} -cl-std=CL2.0 -cl-finite-math-only "
2021-05-11 18:54:11 Quadro P2200-0 103750501
2021-05-11 18:54:11 Quadro P2200-0 103750501 OpenCL compilation in 2.12 s
2021-05-11 18:54:11 Quadro P2200-0 103750501 trig table : 65 points, cos 73.98 bits, sin 73.34 bits
2021-05-11 18:54:11 Quadro P2200-0 103750501 trig table : 353 points, cos 73.48 bits, sin 73.05 bits
2021-05-11 18:54:12 Quadro P2200-0 103750501 trig table : 360449 points, cos 72.52 bits, sin 72.42 bits
2021-05-11 18:54:12 Quadro P2200-0 103750501 maxAlloc: 4.0 GB
2021-05-11 18:54:12 Quadro P2200-0 103750501 P1(0) 0 bits
2021-05-11 18:54:12 Quadro P2200-0 103750501 PRP starting from beginning
2021-05-11 18:54:17 Quadro P2200-0 103750501 OK 0 on-load: blockSize 400, 0000000000000003
2021-05-11 18:54:17 Quadro P2200-0 103750501 validating proof residues for power 8
2021-05-11 18:54:17 Quadro P2200-0 103750501 Proof using power 8
2021-05-11 18:54:30 Quadro P2200-0 103750501 OK 800 0.00% 6c8aa8e618891740 10038 us/it + check 4.18s + save 0.25s; ETA 12d 01:17
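For reference, both logged ETAs follow directly from the per-iteration timings. A quick sanity check (assuming the per-iteration cost stays constant for the whole run, and that the number of PRP iterations is approximately the exponent itself):

```shell
# Sanity-check the logged ETAs from the per-iteration timings above.
# Assumes constant per-iteration cost; iterations ~= exponent for a PRP test.
awk 'BEGIN {
    e = 103750501
    printf "mprime: %.1f hours\n", e * 1.560 / 1000 / 3600   # 1.560 ms/iter
    printf "gpuowl: %.1f days\n",  e * 10196 / 1e6  / 86400  # 10196 us/it
}'
```

This reproduces the 44:55 hour mprime estimate and the ~12-day gpuowl estimate shown in the logs.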
Last fiddled with by drkirkby on 2021-05-11 at 19:35

#2
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2·229 Posts
Here's the data sheet on the Nvidia Quadro P2200.
Intel will release no information about the Xeon Platinum 8167M, but it has around 35 MB of cache. Its single-threaded performance is pretty poor, even for a 2 GHz CPU, but if one can keep all 26 cores busy, it offers quite a bit of performance for the money.

Last fiddled with by drkirkby on 2021-05-11 at 19:49
#3

If I May
"Chris Halsall"
Sep 2002
Barbados
2·11²·47 Posts
Quote:
CentOS 7.9 is, intentionally, slow in upgrading its software stacks (including GCC). Discussion of Red Hat's decision to make CentOS sub-optimal for decision-makers is left for another thread. Ubuntu 20.04 LTS is more "bleeding edge", but it also successfully compiles more modern code. Trying to explain the subtle differences to Pointy-Haired Bosses can become a full-time job for those who don't constrain such discussions... I hope that makes sense.
#4

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴×3×163 Posts
Please run and submit GPU benchmarks at:

* https://www.mersenne.ca/mfaktc.php for TF
* https://www.mersenne.ca/cudalucas.php for CUDALucas

since the P2200 is not in either list.
#5

Aug 2002
7·1,237 Posts
P2200: https://www.techpowerup.com/gpu-spec...ro-p2200.c3442
GP106: https://www.techpowerup.com/gpu-specs/nvidia-gp106.g797

It should be about the same speed as a GTX 1060.
#6

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1E90₁₆ Posts
FP64 (double) performance: 119.4 GFLOPS (1:32).

In other words, there's good reason for it to be slow in GpuOwl, and slower in clLucas. Really only suitable for TF. You could have bought a 6 GB GTX 1060 for less. (Search eBay for "GTX1060 6GB")

Last fiddled with by kriesel on 2021-05-11 at 23:00
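The 1:32 ratio means the FP64 figure is simply the card's FP32 throughput divided by 32. Using TechPowerUp's FP32 number for the P2200 (about 3.821 TFLOPS, an outside figure, not from this thread):

```shell
# FP64 rate of the P2200 derived from its FP32 rate and the 1:32 ratio.
# The 3821 GFLOPS FP32 figure is taken from TechPowerUp's P2200 page.
awk 'BEGIN { printf "%.1f GFLOPS FP64\n", 3821 / 32 }'
```

which reproduces the 119.4 GFLOPS quoted above.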
#7

"David Kirkby"
Jan 2021
Althorne, Essex, UK
2×229 Posts
Quote:
Lots of the posts on this topic are rather old. Where do I get mfaktc and CUDALucas from? I see several places for the same-named programs (e.g. SourceForge and GitHub). I found binaries of both mfaktc and CUDALucas, but neither would run.
Code:
drkirkby@jackdaw:~/mfaktc-0.21$ ./mfaktc.exe
./mfaktc.exe: error while loading shared libraries: libcudart.so.6.5: cannot open shared object file: No such file or directory
drkirkby@jackdaw:~/CUDALucus$ ./CUDALucas
./CUDALucas: error while loading shared libraries: libcufft.so.10: cannot open shared object file: No such file or directory

Dave
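For what it's worth, the usual way to chase down these "cannot open shared object file" errors is to ask the loader which libraries are unresolved, then point LD_LIBRARY_PATH at a directory holding the right CUDA runtime versions. A minimal sketch (the /usr/local/cuda/lib64 path is an assumption; substitute wherever your CUDA toolkit actually installed its libraries):

```shell
# List the shared libraries the binary wants but the loader can't find.
# Guarded so the snippet is harmless if the binary isn't present.
BIN=./mfaktc.exe
if [ -e "$BIN" ]; then
    ldd "$BIN" | grep "not found" || true
fi

# Point the dynamic loader at the CUDA toolkit's library directory
# (path is an assumption; adjust to your install, e.g. a versioned
# /usr/local/cuda-X.Y/lib64).
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```

Note the libraries must match the versions the binaries were linked against (libcudart.so.6.5 and libcufft.so.10 here), so two different toolkit versions may be needed.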
#8

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴·3·163 Posts
1) Heinrich will accept benchmarks for GPUs from GpuOwL. It says so on one of the pages posted earlier.

2) Gpuowl builds without vendor libraries because, rather than using a vendor-provided FFT library, Preda programmed his own in the gpuowl source code. Also, GpuOwl is an OpenCL application, so it has no need of or use for CUDA.

3) <broken record mode>Use the reference info.</mode> Perhaps you already did. https://mersenneforum.org/showthread.php?t=24607 Bookmark that. A browser search for "software" finds it: Available Mersenne Prime hunting software http://www.mersenneforum.org/showpos...91&postcount=2 Download and save the attachment from that linked post. And bookmark the post, since the attachment gets updated after I discover the software has. The CUDA .dll part there is more subtle and has been updated / expanded today. It also mentions the download mirror at mersenne.ca, which includes a few Linux builds.

4) NVIDIA CUDA Windows .dll files or Linux .so files (or GPU drivers, for that matter) generally must be acquired separately for CUDA applications. CUDA DLLs are available at the mirror, but apparently not .so files for the various versions of the various Linux distros, so it's off to NVIDIA for a big free download, extracting what you need afterward: https://developer.nvidia.com/cuda-downloads or the download archive for previous versions at https://developer.nvidia.com/cuda-toolkit-archive

I have a libcufft.so.6.5.14, but it is far too big to post, even compressed. And I'm unsure which Linux variants which .so files are compatible with. It's old, so very unlikely to like Ubuntu 20.x. Searching https://mersenneforum.org/showthread.php?t=24607 for compatibility leads to CUDA Toolkit compatibility vs CUDA level https://www.mersenneforum.org/showpo...1&postcount=11

Perhaps someone who uses Linux far more than I do could assist or advise regarding .so files.

Last fiddled with by kriesel on 2021-05-12 at 02:04
#9

Jul 2009
Germany
706₁₀ Posts
Quote:
https://mersenneforum.org/showthread...22204&page=246
#10

"David Kirkby"
Jan 2021
Althorne, Essex, UK
2·229 Posts
Quote:
https://www.ansys.com/content/dam/it...es-2019-r2.pdf but not the consumer-grade gaming cards. Hence, for those applications, a Quadro is a more suitable card. Conversely, for games, the Quadro is a poor choice, as games are not optimised for them. I bought this computer for engineering applications - about the only games I play are chess and minesweeper, neither of which would benefit from a better graphics card. Hence my choice of graphics card.

Can anyone give me any sort of idea of what graphics cards could do a PRP test of 103750501 in under 44 hours, which is what one of my Xeons does? With the pair of Xeons I'm averaging 22 hours per test (roughly one a day) for exponents around that value. If an affordable graphics card could do significantly better, I might be persuaded to put my hand in my pocket and buy one for GIMPS.

Dave

Last fiddled with by drkirkby on 2021-05-12 at 06:17
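To put a number on "under 44 hours": the per-iteration speed a GPU would need on this exponent works out directly (again assuming PRP iterations ≈ exponent):

```shell
# Per-iteration budget for a GPU to finish a 103750501 PRP test in 44 hours.
awk 'BEGIN { printf "%.3f ms/it\n", 44 * 3600 * 1000 / 103750501 }'
```

So any card that sustains under roughly 1.5 ms/it at this FFT size would beat one of the Xeons.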
#11

Jul 2009
Germany
706₁₀ Posts
Quote:
https://drive.google.com/file/d/10fC...enkBdAaRP/view A Vega64 runs at 1.755 ms/it for a PRP test of 103750501. I think you will need, at minimum, an AMD RX 6700 XT.

Last fiddled with by moebius on 2021-05-12 at 08:33
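Taking the Vega64 figure as 1.755 ms/it (reading the comma as a decimal separator, which is an assumption on my part), the total time on this exponent works out to roughly:

```shell
# Total PRP time for 103750501 at a Vega64's ~1.755 ms/it
# (decimal-comma reading of the quoted figure; assumes constant speed).
awk 'BEGIN { printf "%.1f hours\n", 103750501 * 1.755 / 1000 / 3600 }'
```

which is why a Vega64 narrowly misses the 44-hour target and a faster card such as the RX 6700 XT is suggested.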
Similar Threads

| Thread | Thread Starter | Forum | Replies | Last Post |
| I wonder if there is a single precision version LL-test for Nvidia GPU computing | Neutron3529 | GPU Computing | 45 | 2022-04-18 19:41 |
| GpuOwl Nvidia Hardware Accelerated Scheduling Test (Windows) | xx005fs | GpuOwl | 4 | 2020-07-02 04:38 |
| A dream, will stay a dream ( new Nvidia Quadro) | firejuggler | GPU Computing | 0 | 2018-03-28 16:02 |
| NVIDIA Quadro K4000 speed results benchmark | sixblueboxes | GPU Computing | 3 | 2014-07-17 00:25 |
| How to stress test nvidia gpu in Windows 7 64-bit | RickC | GPU Computing | 5 | 2012-10-15 09:19 |