mersenneforum.org  

Old 2021-05-12, 09:12   #12
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

476₈ Posts

Quote:
Originally Posted by moebius
Have a look at this sheet:
https://drive.google.com/file/d/10fC...enkBdAaRP/view
A Vega64 runs at 1.755 ms/it for PRP 103750501. I think you will need at minimum an AMD RX 6700 XT.
A used Vega64 is more expensive than a used Intel Xeon Platinum 8167M, but the Xeon is faster at around 1.5 ms/iteration. An AMD RX 6700 XT is well over twice the price of the Xeon.

I am fairly confident that the Dell 7920 can be speeded up considerably, as I think the RAM is running at 1200 MHz, not the 2400 MHz supported by the CPUs. At the moment I can't get more than 4 DIMMs to work in the appropriate 6 DIMM sockets for the first CPU, so only 4 of its 6 memory channels are in use. (Two memory channels have 2 DIMMs, so although both CPUs have 6 DIMMs, they are not configured optimally.) The CPU utilisation never goes above about 45% when running PRP tests, but hits 100% on trial factoring, to small numbers at least. When I run the memory tests from the Passmark website, my memory is in the bottom 5% of systems submitted. In other words, 95% are better. The memory performance is appalling.

You might reasonably ask why I don’t get the PC sorted out. Unfortunately the Dell warranty only covers the system with the original parts (2 x 8 GB DIMMs). I don’t have enough Dell DIMMs to reproduce the fault, so Dell will not do anything about it. The cost of buying Dell DIMMs is huge - I am contemplating whether it’s less hassle to just buy another motherboard from eBay. I can get a Dell 7920 sealed reconditioned motherboard for less than the cost of a single 8 GB DIMM from Dell. I feel pretty peed off about it to be honest. I don’t see why I should buy a motherboard and fix a computer under warranty, but it is looking the most cost effective way to get it repaired.

Irrespective of whether I can speed up the computer with a new motherboard or not, I don’t think buying a graphics card would be good value.

Dave.

Last fiddled with by drkirkby on 2021-05-12 at 09:18
Old 2021-05-12, 09:28   #13
moebius
 
moebius's Avatar
 
Jul 2009
Germany

601 Posts

I paid 260 Euro for 6 months for the RX Vega64 mining version. Actually, make that an RX 6800 rather than the RX 6700 XT, which only has 0.825 TFLOPS FP64; otherwise it will be tight.
Old 2021-05-12, 10:04   #14
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

476₈ Posts

Quote:
Originally Posted by moebius
Yes, please make also a benchmark for my pure gpuowl list with the slightly larger exponent 77936867. Please post it in this thread:
https://mersenneforum.org/showthread...22204&page=246
That exponent is smaller than the one I did before, so it will take less time.
Code:
2021-05-12 11:00:40 Quadro P2200-0 77936867 OK       800   0.00% 1579c241dc63eca6 7271 us/it + check 3.12s + save 0.19s; ETA 6d 13:24
I will post a full benchmark at the link suggested.
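
As a rough cross-check of the ETA in that log line, here is a minimal Python sketch. It assumes a PRP test needs about one squaring iteration per bit of the exponent and ignores the periodic check and save overhead; the function name prp_eta is made up for this example and is not part of gpuowl.

```python
def prp_eta(exponent: int, us_per_iteration: int) -> tuple[int, int, int]:
    """Estimate the run time of a PRP test as (days, hours, minutes).

    Assumption: the test performs roughly `exponent` iterations, each
    taking `us_per_iteration` microseconds. Check/save time is ignored.
    """
    total_seconds = exponent * us_per_iteration // 1_000_000
    days, remainder = divmod(total_seconds, 86_400)
    hours, remainder = divmod(remainder, 3_600)
    minutes = remainder // 60
    return days, hours, minutes


# Figures taken from the Quadro P2200 log line above.
days, hours, minutes = prp_eta(77_936_867, 7_271)
print(f"{days}d {hours:02d}:{minutes:02d}")  # 6d 13:24
```

The same arithmetic applied to exponent 103750501 at 1.5 ms/iteration (1500 us) gives about 43 hours, which is in line with the Xeon timing mentioned in this thread.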

Last fiddled with by drkirkby on 2021-05-12 at 10:05
Old 2021-05-12, 10:10   #15
kruoli
 
kruoli's Avatar
 
"Oliver"
Sep 2017
Porta Westfalica, DE

232 Posts

Quote:
Originally Posted by drkirkby
I am fairly confident that the Dell 7920 can be speeded up considerably, as I think the RAM is running at 1200 MHz, not the 2400 MHz supported by the CPUs.
Depending on the tool you use, that reading is expected and the memory is already running at the maximum frequency. It would be very weird if it was really running at 1200 MHz (I guess you should say MTransfers instead?), since I never saw DDR4 running that slow.
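
To illustrate why a 1200 MHz reading can be the expected one: DDR memory transfers data on both clock edges, so a tool that reports the I/O clock shows half the transfer rate. The sketch below assumes DDR4-2400 modules and a standard 64-bit (8-byte) memory channel; the function names are made up for this example.

```python
def ddr_transfer_rate_mts(io_clock_mhz: float) -> float:
    """DDR moves data twice per clock cycle, so MT/s = 2 x MHz."""
    return 2.0 * io_clock_mhz


def channel_peak_gb_per_s(transfer_rate_mts: float, bus_bytes: int = 8) -> float:
    """Theoretical peak of one channel; 8 bytes assumes a 64-bit data bus."""
    return transfer_rate_mts * bus_bytes / 1000.0


rate = ddr_transfer_rate_mts(1200.0)
print(f"{rate:.0f} MT/s, {channel_peak_gb_per_s(rate):.1f} GB/s per channel")
# 2400 MT/s, 19.2 GB/s per channel
```

These are theoretical peaks only; measured bandwidth will be lower.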

Quote:
Originally Posted by drkirkby
The CPU utilisation never goes above about 45% when running PRP tests, but hits 100% on trial factoring to small numbers at least.
That is also expected. TF will run with the HT cores (2 threads per physical core), while PRP, LL, P-1 etc. will not (1 thread per physical core, so only 50 % total CPU usage in task manager, etc.), because that way it is (usually) the fastest.
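
The 50 % figure follows directly from the thread counts. A minimal sketch, assuming the two 26-core CPUs described in this thread (52 physical cores) and that every worker thread keeps one logical CPU fully busy; expected_utilisation is a made-up helper name.

```python
def expected_utilisation(physical_cores: int, threads_per_core: int,
                         worker_threads: int) -> float:
    """Percentage of logical CPUs kept busy by `worker_threads` workers."""
    logical_cpus = physical_cores * threads_per_core
    return 100.0 * min(worker_threads, logical_cpus) / logical_cpus


# PRP/LL/P-1: one worker thread per physical core, hyperthreading enabled.
print(expected_utilisation(52, 2, 52))   # 50.0
# Trial factoring: both hyperthreads of every core in use.
print(expected_utilisation(52, 2, 104))  # 100.0
```

An observed value of about 45 % is therefore close to the ceiling for PRP work with hyperthreading enabled, rather than a sign of idle cores.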
Old 2021-05-12, 11:15   #16
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

2×3×53 Posts

Quote:
Originally Posted by kruoli
Depending on the tool you use, that reading is expected and the memory is already running at the maximum frequency. It would be very weird if it was really running at 1200 MHz (I guess you should say MTransfers instead?), since I never saw DDR4 running that slow.

That is also expected. TF will run with the HT cores (2 threads per physical core), while PRP, LL, P-1 etc. will not (1 thread per physical core, so only 50 % total CPU usage in task manager, etc.), because that way it is (usually) the fastest.
The machine came with Windoze. When I run this benchmark, which I think is fairly common
https://www.passmark.com/
it gives a report on various aspects of the computer: CPUs, graphics, memory etc. These are compared to other systems submitted. I think it's reasonable to assume anyone submitting a system has some interest in performance, so the list is probably not representative of computers in use in the world, but of some higher-end ones. This is how my Dell scores on the various factors, as a percentile.



Passmark 66th percentile (the overall rating)
CPU mark 99th percentile (obviously two 26-core CPUs are quick)
2D mark 52nd percentile
3D mark 53rd percentile
Memory mark 22nd percentile
Disk mark 67th percentile

I know they weight the overall mark so that no single strong attribute can push the mark up much, but one weak one can drop it a lot. I think it's pretty clear that the CPUs are working well, but the memory is working very poorly. This is for a machine which, according to this YouTube video
https://www.youtube.com/watch?v=jP65i_Iqml8
can jointly claim, with an HP Z8, to be the world's fastest workstation. I think it's clear there's something very wrong with the memory performance on my machine. I think it was CPU-Z, which is one of the other CPU performance indicators, that indicated the RAM is running at 1200 MHz (yes, 1.2 GT/s would seem more logical).

There are 24 DIMM sockets, 12 for each CPU. For each CPU, 6 are coloured white and 6 black. One should occupy the white ones first, then the black. But any attempt to put more than 4 DIMMs in the white ones for CPU0 (the first CPU) will result in the machine failing to power on. Only if I move the DIMMs to the black sockets, so putting 2 DIMMs on some memory channels and none on others, can I fit 6 DIMMs on the first CPU. The problem does not occur with the 2nd CPU, and swapping the positions of the CPUs does not help, so I don't think it is a CPU fault. I've also tried the original 8-core CPU, and have the exact same issue.
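
To put a rough number on what the workaround costs, the sketch below counts populated channels for a given DIMM layout. The layout lists are hypothetical (which channels hold two DIMMs is not stated in the thread), the 19.2 GB/s per-channel figure assumes DDR4-2400 on a 64-bit channel, and it assumes theoretical peak bandwidth scales with the number of populated channels.

```python
def populated_channels(dimms_per_channel: list[int]) -> int:
    """Number of memory channels that hold at least one DIMM."""
    return sum(1 for count in dimms_per_channel if count > 0)


def peak_bandwidth_gb_per_s(dimms_per_channel: list[int],
                            per_channel_gb_per_s: float = 19.2) -> float:
    """Theoretical peak; a second DIMM on a channel adds capacity, not bandwidth."""
    return populated_channels(dimms_per_channel) * per_channel_gb_per_s


workaround = [2, 2, 1, 1, 0, 0]  # hypothetical: 6 DIMMs on only 4 channels
intended = [1, 1, 1, 1, 1, 1]    # one DIMM in each white socket

print(f"{peak_bandwidth_gb_per_s(workaround):.1f} GB/s")  # 76.8 GB/s
print(f"{peak_bandwidth_gb_per_s(intended):.1f} GB/s")    # 115.2 GB/s
```

Under those assumptions the first CPU has at most two thirds of the memory bandwidth it would have with all six channels populated.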

Unfortunately dealing with Dell is a nightmare. Long waits on the phone, to get to Indian call centres, with staff with little knowledge. Sometimes, after waiting on the UK phone for a long time (20 minutes or so), one gets a message to dial another number, which starts +91, so India. I'm very unimpressed with Dell, but their argument is that it is not supported with Kingston DIMMs. So I seem to be in a catch-22:

* Pay a fortune for new Dell DIMMs
* Pay a lot of money for 2nd hand Dell DIMMs, then no doubt Dell would argue they don't support used components.
* Replace the motherboard myself.

Dave (not a happy Dell customer).

Last fiddled with by drkirkby on 2021-05-12 at 11:29
Old 2021-05-12, 11:31   #17
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

2×3×53 Posts

PS, I will try disabling hyperthreading in the BIOS, and see if the apparent CPU utilisation rises. I'm running Linux, so don't use task manager. Later I will test with a smaller PRP exponent that will fit in the cache. I think performance drops a lot when memory is accessed, but it's clear no affordable graphics card is going to be worth using in place of the current configuration.

Last fiddled with by drkirkby on 2021-05-12 at 11:33
Old 2021-05-12, 11:47   #18
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

2·3·53 Posts

Yes, with hyperthreading disabled in the BIOS, the CPU usage is closer to maximum. It's 19.2% idle, which is a lot lower than it was with hyperthreading enabled. Anyway, I still think there is an issue with the RAM on this, but am pondering my options for resolving it.

Code:
top - 12:43:25 up 4 min,  1 user,  load average: 39.49, 18.73, 7.32
Tasks: 698 total,   1 running, 697 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  2.7 sy, 77.8 ni, 19.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
top - 12:43:27 up 4 min,  1 user,  load average: 40.17, 19.22, 7.54
MiB Mem : 386593.7 total, 384006.8 free,   1778.2 used,
%Cpu(s):  0.0 us,  2.7 sy, 80.6 ni, 16.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 386593.7 total, 384007.4 free,   1777.7 used,    808.7 buff/cache
MiB Swap:   2048.0 total,   2048.0 free,      0.0 used. 382784.8 avail Mem

   2273 drkirkby  30  10 4542704 572836   6736 S  4331   0.1 129:30.71 mprime
   1841 drkirkby  20   0 4127312 309948 119092 S   1.7   0.1   0:06.74 gnome-s+
   1339 root      20   0   82212   4068   3456 S   0.4   0.0   0:00.13 irqbala+
   1583 root     -51   0       0      0      0 S   0.4   0.0   0:01.03 irq/167+
   1585 root      20   0       0      0      0 S   0.4   0.0   0:01.08 nv_queue
   2233 drkirkby  20   0  823232  50752  38268 S   0.4   0.0   0:01.53 gnome-t+
   2413 drkirkby  20   0   21180   4528   3180 R   0.4   0.0   0:00.04 top
      1 root      20   0  167596  11824   8520 S   0.0   0.0   0:02.52 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.01 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par+
      5 root      20   0       0      0      0 I   0.0   0.0   0:00.09 kworker+
      6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker+
      7 root      20   0       0      0      0 I   0.0   0.0   0:00.00 kworker+
      8 root      20   0       0      0      0 I   0.0   0.0   0:00.22 kworker+
      9 root      20   0       0      0      0 I   0.0   0.0   0:00.01 kworker+
     10 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_perc+
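
For anyone reading the mprime line in that output: in its default mode top reports per-process %CPU relative to a single logical CPU, so a multithreaded process can show far more than 100. A minimal sketch for converting it to a share of the whole machine, assuming top's default column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and 52 logical CPUs (two 26-core CPUs with hyperthreading off); the helper names are made up for this example.

```python
def cpu_percent_from_top_line(line: str) -> float:
    """Return the %CPU field of a top process line (9th column by default)."""
    return float(line.split()[8])


def share_of_machine(process_cpu_percent: float, logical_cpus: int) -> float:
    """Convert top's per-CPU percentage into a percentage of the whole machine."""
    return process_cpu_percent / logical_cpus


line = "2273 drkirkby  30  10 4542704 572836   6736 S  4331   0.1 129:30.71 mprime"
busy = cpu_percent_from_top_line(line)
print(round(share_of_machine(busy, 52), 1))  # 83.3
```

That 83.3 % agrees with the 80.6 ni plus 2.7 sy in the second %Cpu(s) summary line.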
Old 2021-05-12, 15:25   #19
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1485₁₆ Posts

Quote:
Originally Posted by drkirkby
Can anyone give me any sort of ideas of what graphics cards could do a PRP test of 103750501 in under 44 hours, which is what one of my Xeons does? With the pair of Xeons I'm averaging 22 hours (roughly one a day), for exponents around that value. If an affordable graphics card could do significantly better I might be persuaded to put my hands in my pocket and buy one for GIMPS.

Dave
Radeon VII GPUs can do it in about half that. These used to retail new for $600. (Now fully functional used no-warranty VIIs are going for ~US$2K, or somewhat above the list price of a scarce new Radeon Pro VII.)
All GPU models' prices are currently badly inflated by the GPU and general chip shortage. This should eventually pass.

Numerous benchmark timings vs. GpuOwL version and fft length for RX480 and Radeon VII are available (where else, reference thread post attachments!)
Note, those tabulated values are from Windows with reduced GPU power, and can be beaten by operating the GPU at full power, or on Linux with the ROCm driver, or both.
The Radeon VII's stellar GpuOwL performance is why folks in the know such as Woltman and Mayer and Preda adopted Radeon VII GPUs for PRP testing. They also report somewhat higher total throughput by running two instances per GPU.

Any GPU scoring at least half the GHzD/day rating of a Radeon VII in James Heinrich's table should meet your 44 hour threshold or come close, properly installed & operated.
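
That rule of thumb amounts to scaling run time inversely with the GHzD/day rating. A minimal sketch, assuming a Radeon VII takes about 22 hours for this exponent (the "about half" of 44 hours above) and using relative ratings as placeholders rather than real values from the table; the function names are made up for this example.

```python
def estimated_hours(reference_hours: float, reference_rating: float,
                    candidate_rating: float) -> float:
    """Scale a known run time by the ratio of throughput ratings."""
    return reference_hours * reference_rating / candidate_rating


def meets_threshold(candidate_rating: float, threshold_hours: float = 44.0,
                    reference_hours: float = 22.0,
                    reference_rating: float = 1.0) -> bool:
    """True if the scaled estimate is within the threshold."""
    estimate = estimated_hours(reference_hours, reference_rating, candidate_rating)
    return estimate <= threshold_hours


# Ratings are relative to a Radeon VII (1.0); placeholders, not measured data.
print(estimated_hours(22.0, 1.0, 0.5))  # 44.0
print(meets_threshold(0.5))             # True
print(meets_threshold(0.25))            # False
```

Real timings also depend on the GpuOwL version, FFT length, driver and power settings, so treat the result as a first filter only.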
Old 2021-05-12, 15:51   #20
masser
 
masser's Avatar
 
Jul 2003
wear a mask

2³×5×41 Posts

Quote:
Originally Posted by kriesel
FP64 (double) performance 119.4 GFLOPS (1:32)
In other words, good reason to be slow in GpuOwl, slower in ClLucas. Really only suitable for TF. Could have bought a 6GB GTX1060 for less cost. (Search eBay for "GTX1060 6GB")
(I got a good deal on a GTX 1050 because of this post. Thanks!)
Old 2021-05-13, 21:02   #21
moebius
 
moebius's Avatar
 
Jul 2009
Germany

259₁₆ Posts

Quote:
Originally Posted by masser
(I got a good deal on a GTX 1050 because of this post. Thanks!)
Nice. The GPU has a GP107 chipset. I would also ask you for a gpuowl benchmark (when you receive the card), as mentioned earlier in the thread. Of course, it's more of a good trial factoring card.
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.