mersenneforum.org  

2023-05-29, 12:17   #34
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

3³×157 Posts

Quote:
Originally Posted by firejuggler
FP64 is what matters.

For TF, it's FP32 ("Single Precision") GFLOPS that matters.
For gpuowl etc. I have low confidence in my ability to predict performance from either FP32 or FP64 theoretical throughput, but for TF the FP32 figure always translates directly.

All the mfaktx performance numbers on my chart are derived from FP32 GFLOPS and a magic multiplier for architecture (generally corresponding to CUDA compute capability for NVIDIA):
Code:
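// FP32 GFLOPS required per 1 GHz-day/day of TF throughput,
// keyed by vendor (N = NVIDIA, A = AMD, I = Intel) and architecture code.
$TF_GFLOPS_per_GHzDayPerDay = array(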
	'N' => array(
		10 =>  0.00,
		11 => 14.00,
		12 => 14.00,
		13 => 14.00,
		20 =>  3.65,
		21 =>  5.35,
		30 => 10.75,
		35 => 11.55,
		37 => 11.05, // Tesla K80 -- single benchmark, note that K80 is dual-GPU model
		50 =>  9.00,
		52 =>  9.00,
		60 =>  9.70, // Tesla P100
		61 =>  7.90,
		70 =>  3.58, // Titan V100  -- only one benchmark so far
		75 =>  3.30, // RTX 20x0
		80 =>  2.90, // A100-SXM4
		86 =>  6.15, // RTX 30?0/A?000
		89 =>  6.35, // RTX 40x0
	),
	'A' => array(
		 1 => 11.3, // VLIW5
		 2 => 11.0, // VLIW4
		10 =>  9.3, // GCN 1.0
		11 =>  9.3, // GCN 1.1
		12 =>  9.3, // GCN 1.2
		13 => 10.9, // GCN 1.3
		14 => 10.9, // GCN 1.3
		15 => 10.8, // GCN 1.5
		20 => 13.0, // RDNA 1 (RX 5700)
		30 => 11.0, // RDNA 2 (RX 6600/6700/6800/6900)
		40 => 15.0, // RDNA 3 (RX 7x00)
	),
	'I' => array(
		10 =>  9.5, // Arc A380, A770
	),
);
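As a rough illustration of how a table like this can be used (a sketch, not the actual chart code): dividing a card's theoretical FP32 GFLOPS by the multiplier for its architecture gives an estimated TF throughput in GHz-days/day. The function names are hypothetical, only a few table entries are copied over, and the RTX 3060 Ti figures (4864 CUDA cores, 1410 MHz base clock, 2 FLOPs per core per clock) are assumptions brought in for the example.

```python
# Sketch only: estimate TF throughput from theoretical FP32 GFLOPS.
# Multipliers are copied from the PHP table above; the function names
# and the RTX 3060 Ti specs below are illustrative assumptions.

TF_GFLOPS_PER_GHZD_PER_DAY = {
    "N": {61: 7.90, 75: 3.30, 86: 6.15, 89: 6.35},  # NVIDIA, compute capability x10
    "A": {20: 13.0, 30: 11.0, 40: 15.0},            # AMD RDNA 1/2/3
    "I": {10: 9.5},                                 # Intel Arc
}


def fp32_gflops(shader_cores: int, clock_mhz: float) -> float:
    """Theoretical FP32 GFLOPS, assuming 2 FLOPs (one FMA) per core per clock."""
    return shader_cores * 2 * clock_mhz / 1000.0


def estimate_ghzd_per_day(vendor: str, arch: int, gflops: float) -> float:
    """Estimated TF throughput in GHz-days/day: GFLOPS divided by the multiplier."""
    multiplier = TF_GFLOPS_PER_GHZD_PER_DAY[vendor][arch]
    if multiplier <= 0:
        raise ValueError("no usable multiplier for this architecture")
    return gflops / multiplier


if __name__ == "__main__":
    # Assumed RTX 3060 Ti specs: 4864 CUDA cores, 1410 MHz base clock, CC 8.6.
    gflops = fp32_gflops(4864, 1410)
    estimate = estimate_ghzd_per_day("N", 86, gflops)
    print(f"{gflops:.0f} GFLOPS -> {estimate:.0f} GHz-days/day")
```

With those assumed specs the estimate comes out at roughly 2230 GHz-days/day at stock clocks.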

2023-05-29, 13:11   #35
Jurzal

Jan 2023
Riga, Latvia

2·29 Posts

Quote:
Originally Posted by James Heinrich
For TF, it's FP32 ("Single Precision") GFLOPS that matters.

Neat, thanks for the information, FP32 it is!
Sorry if I missed it before, but how do I submit my 3060 Ti benchmark? Mine is averaging around 3000 GHzd/d, while the benchmark table shows a 2100 average.

EDIT: Never mind, I found it and I uploaded my benchmark.

Last fiddled with by Jurzal on 2023-05-29 at 13:32

2023-05-29, 17:44   #36
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

3³×157 Posts

Quote:
Originally Posted by Jurzal
how do I submit my 3060 Ti benchmark? Mine is averaging around 3000, while the benchmark table shows a 2100 average.

That's pretty normal these days. The performance table is based on nominal "stock" clock speeds, while in reality, with decent cooling, you're likely to see both Boost and any manufacturer overclock on top of that, oftentimes a significant difference. Your 3060 Ti, for example, has a stock clock of 1410 MHz and your submitted benchmark showed 1965 MHz, nearly 40% higher, so the theoretical 2230 GHzd/d * 1.394 = 3108 GHzd/d, which is in line with your reported performance.
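The clock adjustment above can be written out as a small sketch. It assumes TF throughput scales linearly with core clock, which is an approximation, and the function name is made up for illustration.

```python
# Sketch: scale a stock-clock TF estimate to the clock actually observed.
# Assumes throughput scales linearly with core clock (an approximation).

def scale_to_actual_clock(stock_ghzd_per_day: float,
                          stock_mhz: float,
                          actual_mhz: float) -> float:
    """Return the estimate adjusted by the ratio actual/stock clock."""
    if stock_mhz <= 0:
        raise ValueError("stock clock must be positive")
    return stock_ghzd_per_day * (actual_mhz / stock_mhz)


if __name__ == "__main__":
    ratio = 1965 / 1410
    scaled = scale_to_actual_clock(2230, 1410, 1965)
    print(f"clock ratio {ratio:.3f}, scaled estimate {scaled:.0f} GHz-days/day")
```

For 1410 MHz stock and 1965 MHz observed, the ratio is about 1.394 and the scaled estimate lands at about 3108 GHz-days/day.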

Last fiddled with by James Heinrich on 2023-05-29 at 17:46

2023-05-30, 12:08   #37
Jurzal

Jan 2023
Riga, Latvia

2×29 Posts

Quote:
Originally Posted by James Heinrich
That's pretty normal these days. The performance table is based on nominal "stock" clock speeds, while in reality, with decent cooling, you're likely to see both Boost and any manufacturer overclock on top of that, oftentimes a significant difference. Your 3060 Ti, for example, has a stock clock of 1410 MHz and your submitted benchmark showed 1965 MHz, nearly 40% higher, so the theoretical 2230 GHzd/d * 1.394 = 3108 GHzd/d, which is in line with your reported performance.

Thanks for confirming!
Nvidia cards respond very well to proper undervolting + overclocking. You can gain about 40% higher performance while cutting power consumption by roughly 20%.
The 1965 MHz clock (up from the 1410 MHz base) runs at 165 W power consumption, instead of the 200 W default.
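To put numbers on that, here is a small sketch of the arithmetic. It treats the 1410 MHz base clock as the 200 W reference point and assumes TF throughput is proportional to core clock; both are simplifications, and the helper names are hypothetical.

```python
# Sketch: compare clock, power, and performance-per-watt between two operating points.
# Assumes TF throughput is proportional to core clock; figures are from the post above.

def relative_change(new: float, old: float) -> float:
    """Fractional change from old to new (0.25 means +25%)."""
    return (new - old) / old


def perf_per_watt_gain(clock_old: float, clock_new: float,
                       watts_old: float, watts_new: float) -> float:
    """Ratio of (clock per watt) at the new point to (clock per watt) at the old point."""
    return (clock_new / watts_new) / (clock_old / watts_old)


if __name__ == "__main__":
    print(f"clock:  {relative_change(1965, 1410):+.1%}")
    print(f"power:  {relative_change(165, 200):+.1%}")
    print(f"perf/W: x{perf_per_watt_gain(1410, 1965, 200, 165):.2f}")
```

Under those assumptions the clock is up about 39%, power is down 17.5%, and throughput per watt improves by a factor of roughly 1.7.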

Last fiddled with by Jurzal on 2023-05-30 at 12:09

2023-05-30, 13:00   #38
henryzz
Just call me Henry

"David"
Sep 2007
Liverpool (GMT/BST)

17EF₁₆ Posts

Quote:
Originally Posted by Jurzal
Thanks for confirming!
Nvidia cards respond very well to proper undervolting + overclocking. You can gain about 40% higher performance while cutting power consumption by roughly 20%.
The 1965 MHz clock (up from the 1410 MHz base) runs at 165 W power consumption, instead of the 200 W default.

Seems strange that they would release cards like that.

2023-05-31, 07:50   #39
preda

"Mihai Preda"
Apr 2015

2653₈ Posts

Quote:
Originally Posted by Jurzal
Nvidia cards respond very well to proper undervolting + overclocking. You can gain about 40% higher performance while cutting power consumption by roughly 20%.
The 1965 MHz clock (up from the 1410 MHz base) runs at 165 W power consumption, instead of the 200 W default.

BUT does it compute correctly at that overclock+undervolt?

In my experience, especially with TF, it's very easy to overlook wrong computation. You simply don't find factors, and there is no other indication that the GPU is not working correctly.

So if your GPU undervolts+overclocks fantastically, you should spend significant effort making sure it still works correctly before jumping into serious TF. To check, you need to run TF on exponents with known factors and verify that the factors are all detected correctly, without exception.
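For reference, the underlying check is simple to state: q divides the Mersenne number 2^p - 1 exactly when 2^p mod q = 1. The sketch below verifies claimed factors on the CPU with small textbook examples; it does not replace a known-factor run on the GPU itself, and the function names are made up for illustration.

```python
# Sketch: confirm that q is a factor of the Mersenne number 2**p - 1.
# q divides 2**p - 1 exactly when 2**p mod q == 1.

def is_mersenne_factor(p: int, q: int) -> bool:
    """True if q > 1 divides 2**p - 1."""
    return q > 1 and pow(2, p, q) == 1


def has_valid_form(p: int, q: int) -> bool:
    """For an odd prime p, factors of 2**p - 1 satisfy q = 2*k*p + 1 and q % 8 in (1, 7)."""
    return (q - 1) % (2 * p) == 0 and q % 8 in (1, 7)


if __name__ == "__main__":
    # Small textbook cases: 2**11 - 1 = 23 * 89, 2**29 - 1 = 233 * 1103 * 2089.
    known = [(11, 23), (11, 89), (29, 233), (29, 1103), (29, 2089)]
    for p, q in known:
        ok = is_mersenne_factor(p, q) and has_valid_form(p, q)
        print(f"M{p}: q={q} ({q.bit_length()} bits) -> {'OK' if ok else 'MISMATCH'}")
```

The same `pow(2, p, q)` test works for factors of any size, so a factor reported by the GPU can be double-checked independently of the hardware that found it.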

Last fiddled with by preda on 2023-05-31 at 07:51

2023-05-31, 11:10   #40
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

3³·157 Posts

Quote:
Originally Posted by preda
To check, you need to run TF on exponents with known factors and verify that the factors are all detected correctly, without exception.

The easiest way to do this is mfaktc -st2, which tests a large number of known factors of all sizes across multiple kernels and different exponent sizes, and gives you confirmation at the end:
Code:
Selftest statistics
  number of tests           26192
  successfull tests         26192

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |      0  |      0
  71bit_mul24        |   2586  |      0
  75bit_mul32        |   2682  |      0
  95bit_mul32        |   2867  |      0
  barrett76_mul32    |   1096  |      0
  barrett77_mul32    |   1114  |      0
  barrett79_mul32    |   1153  |      0
  barrett87_mul32    |   1066  |      0
  barrett88_mul32    |   1069  |      0
  barrett92_mul32    |   1084  |      0
  75bit_mul32_gs     |   2420  |      0
  95bit_mul32_gs     |   2597  |      0
  barrett76_mul32_gs |   1079  |      0
  barrett77_mul32_gs |   1096  |      0
  barrett79_mul32_gs |   1130  |      0
  barrett87_mul32_gs |   1044  |      0
  barrett88_mul32_gs |   1047  |      0
  barrett92_mul32_gs |   1062  |      0

selftest PASSED!
(There is also the less extensive -st test, which does the same thing, just less of it; -st2 can easily take several hours on a slower GPU.)
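If you save that summary to a file, a short script can check it instead of reading it by eye. This is a sketch that assumes the text layout shown above (other mfaktc versions may print it differently); the function name and regular expressions are hypothetical, and the sample is the output pasted in this post.

```python
import re

# Sketch: check selftest statistics text (layout assumed from the output
# shown above) for failed kernel tests and a matching total.

SAMPLE_OUTPUT = """Selftest statistics
  number of tests           26192
  successfull tests         26192

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |      0  |      0
  71bit_mul24        |   2586  |      0
  75bit_mul32        |   2682  |      0
  95bit_mul32        |   2867  |      0
  barrett76_mul32    |   1096  |      0
  barrett77_mul32    |   1114  |      0
  barrett79_mul32    |   1153  |      0
  barrett87_mul32    |   1066  |      0
  barrett88_mul32    |   1069  |      0
  barrett92_mul32    |   1084  |      0
  75bit_mul32_gs     |   2420  |      0
  95bit_mul32_gs     |   2597  |      0
  barrett76_mul32_gs |   1079  |      0
  barrett77_mul32_gs |   1096  |      0
  barrett79_mul32_gs |   1130  |      0
  barrett87_mul32_gs |   1044  |      0
  barrett88_mul32_gs |   1047  |      0
  barrett92_mul32_gs |   1062  |      0

selftest PASSED!
"""

# A data row looks like: "<kernel name> | <success count> | <fail count>"
ROW = re.compile(r"^\s*(\S.*?)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*$")
TOTAL = re.compile(r"number of tests\s+(\d+)")


def check_selftest(text: str) -> bool:
    """True if no kernel row reports failures, the per-kernel successes add up
    to the reported number of tests, and the text says the selftest passed."""
    total_match = TOTAL.search(text)
    if total_match is None:
        return False
    rows = [m for m in (ROW.match(line) for line in text.splitlines()) if m]
    successes = sum(int(m.group(2)) for m in rows)
    failures = sum(int(m.group(3)) for m in rows)
    return (failures == 0
            and successes == int(total_match.group(1))
            and "selftest PASSED" in text)


if __name__ == "__main__":
    print("selftest summary OK" if check_selftest(SAMPLE_OUTPUT) else "selftest summary has problems")
```

Checking the per-kernel sum against the reported total catches a truncated or partially copied log as well as outright failures.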

2023-05-31, 17:49   #41
Jurzal

Jan 2023
Riga, Latvia

2×29 Posts

Code:
Selftest statistics
  number of tests           26192
  successfull tests         26192

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |      0  |      0
  71bit_mul24        |   2586  |      0
  75bit_mul32        |   2682  |      0
  95bit_mul32        |   2867  |      0
  barrett76_mul32    |   1096  |      0
  barrett77_mul32    |   1114  |      0
  barrett79_mul32    |   1153  |      0
  barrett87_mul32    |   1066  |      0
  barrett88_mul32    |   1069  |      0
  barrett92_mul32    |   1084  |      0
  75bit_mul32_gs     |   2420  |      0
  95bit_mul32_gs     |   2597  |      0
  barrett76_mul32_gs |   1079  |      0
  barrett77_mul32_gs |   1096  |      0
  barrett79_mul32_gs |   1130  |      0
  barrett87_mul32_gs |   1044  |      0
  barrett88_mul32_gs |   1047  |      0
  barrett92_mul32_gs |   1062  |      0

selftest PASSED!

2023-05-31, 17:53   #42
Jurzal

Jan 2023
Riga, Latvia

2·29 Posts

I have completed 4001 assignments and found 52 factors,
mostly at the wavefront, from 75 to 77 bits.

I've attached screenshots of my GPU72 config.
I will rerun a recently tested exponent with a known factor, 168785003, from 76 to 77 bits, to see if I find it too.
Attached thumbnails: Untitled.png (9.6 KB), gpu72.png (21.7 KB)

Last fiddled with by Jurzal on 2023-05-31 at 17:58

2023-05-31, 19:33   #43
Jurzal

Jan 2023
Riga, Latvia

2×29 Posts

Quote:
Originally Posted by preda
BUT does it compute correctly at that overclock+undervolt?

In my experience, especially with TF, it's very easy to overlook wrong computation. You simply don't find factors, and there is no other indication that the GPU is not working correctly.

So if your GPU undervolts+overclocks fantastically, you should spend significant effort making sure it still works correctly before jumping into serious TF. To check, you need to run TF on exponents with known factors and verify that the factors are all detected correctly, without exception.

Test successful, factor found.
This GPU goes up to 2175 MHz with 1.081 V on the core. I run it at 1965 MHz with 0.925 V, which leaves a safe margin, since it can also do 1965 MHz at 0.900 V.

I know my overclocks. Cheers.
Attached thumbnail: factor.png (11.5 KB)

Last fiddled with by Jurzal on 2023-05-31 at 19:34
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
