Radeon VII (2nd gen consumer Vega GPU)
2019-01-09, 19:00   #1
M344587487

"Composite as Heck"
Oct 2017

3×5×53 Posts

They just announced this in the CES keynote. pertinent bullet points:
7nm process
1TB/s memory bandwidth
16GB HBM2
Slide shows +62% OpenCL performance over Vega64, whatever that means
RRP of $699
ETA February 7th
60 CU's, so it looks to be a cut-down MI50

That's over twice the memory bandwidth of Vega64. If the bandwidth can be saturated that's twice the performance of a Vega64 for roughly the same (current) price as 2xVega64, at what is hopefully a much lower power consumption than 2xVega 64. Does that analysis sound about right?

GW: See post #76 and #195 for quick-start on setting up gpuowl under Linux.

Last fiddled with by Prime95 on 2020-01-14 at 03:18

2019-01-09, 21:47   #2
mackerel

Feb 2016
UK

419 Posts

The big display behind Lisa said 25% more performance at same power - in what? gaming? While it loses some CUs vs Vega, it gains in clock more than offsetting it. I guess the question then is, how is a particular workload affected by bandwidth? My guess, whatever efficiency benefits they got from process, they spent on clock. So maybe they'll stick to similar board power for the overall higher absolute performance.

2019-01-09, 22:35   #3
xx005fs

"Eric"
Jan 2018
USA

3248 Posts

Quote:
 Originally Posted by mackerel The big display behind Lisa said 25% more performance at same power - in what? gaming? While it loses some CUs vs Vega, it gains in clock more than offsetting it. I guess the question then is, how is a particular workload affected by bandwidth? My guess, whatever efficiency benefits they got from process, they spent on clock. So maybe they'll stick to similar board power for the overall higher absolute performance.
I would assume 25-40% better on gaming depending on the game. As long as they kept 1/2 rate DP like the MI50, it should be the best card on the market to do PRP/LL.
It's also gonna be the best value indefinitely because even if the v100s beats it, they would cost so much more that makes them not worthy.

Last fiddled with by xx005fs on 2019-01-09 at 22:36

2019-01-09, 23:32   #4
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1010000000011₂ Posts

Quote:
 Originally Posted by M344587487 They just announced this in the CES keynote. pertinent bullet points: 7nm process 1TB/s memory bandwidth 16GB HBM2 Slide shows +62% OpenCL performance over Vega64, whatever that means RRP of $699 ETA February 7th 60 CU's, so it looks to be a cut-down MI50 That's over twice the memory bandwidth of Vega64. If the bandwidth can be saturated that's twice the performance of a Vega64 for roughly the same (current) price as 2xVega64, at what is hopefully a much lower power consumption than 2xVega 64. Does that analysis sound about right?
Interesting!
Any power dissipation numbers?
A dual-slot-width card?
Does it require pcie 3.0?
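
A quick arithmetic check of the bandwidth comparison in the quoted post, as a minimal sketch. The 1 TB/s figure and the $699 RRP come from the announcement quoted above. The 484 GB/s figure for Vega 64 is AMD's published spec and is not stated in this thread, and the Vega 64 street price is a hypothetical placeholder, so treat both as assumptions.

```python
# Sanity check of the "over twice the memory bandwidth" claim.
RADEON_VII_BW_GBS = 1000.0             # 1 TB/s, from the announcement
VEGA64_BW_GBS = 484.0                  # assumption: AMD's published Vega 64 figure
RADEON_VII_PRICE_USD = 699.0           # RRP from the announcement
HYPOTHETICAL_VEGA64_PRICE_USD = 350.0  # placeholder, not taken from the thread


def bandwidth_ratio(new_gbs, old_gbs):
    """How many times more memory bandwidth the new card has."""
    return new_gbs / old_gbs


def usd_per_gbs(price_usd, bandwidth_gbs):
    """Price paid per GB/s of memory bandwidth (lower is better)."""
    return price_usd / bandwidth_gbs


ratio = bandwidth_ratio(RADEON_VII_BW_GBS, VEGA64_BW_GBS)
print(f"Radeon VII / Vega 64 bandwidth: {ratio:.2f}x")
print(f"Radeon VII: ${usd_per_gbs(RADEON_VII_PRICE_USD, RADEON_VII_BW_GBS):.3f} per GB/s")
print(f"Vega 64 (placeholder price): "
      f"${usd_per_gbs(HYPOTHETICAL_VEGA64_PRICE_USD, VEGA64_BW_GBS):.3f} per GB/s")
```

If PRP/LL throughput scaled linearly with memory bandwidth, one Radeon VII would land near two Vega 64s. Whether the bandwidth can actually be saturated is the open question the post raises.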

2019-01-09, 23:46   #5
Mark Rose

"/X\(‘-‘)/X\"
Jan 2013

3×977 Posts

Quote:
 Originally Posted by xx005fs I would assume 25-40% better on gaming depending on the game. As long as they kept 1/2 rate DP like the MI50, it should be the best card on the market to do PRP/LL. It's also gonna be the best value indefinitely because even if the v100s beats it, they would cost so much more that makes them not worthy.
At Ars Technica they say the Vega 20 GPU is a die shrink of the Vega 10 GPU found in the Vega 64, so it's probably 1:16.

2019-01-10, 01:37   #6
tServo

"Marv"
May 2009
near the Tannhäuser Gate

2⁷×5 Posts

Quote:
 Originally Posted by kriesel Interesting! Any power dissipation numbers? A dual-slot-width card? Does it require pcie 3.0?
Anandtech estimates 300W power.
Dual width, 3 fans that exhaust heat within the case.
It looks like it is taller than the I/O bracket at the end.
I've never seen any card that requires PCIe 3.0.

http://www.anandtech.com/show/13832/...ry-7th-for-699

2019-01-10, 01:48   #7
xx005fs

"Eric"
Jan 2018
USA

3248 Posts

Quote:
 Originally Posted by Mark Rose At Ars Technica they say the Vega 20 GPU is a die shrink of the Vega 10 GPU found in the Vega 64, so it's probably 1:16.
It's a die shrink of Vega indeed. However, it is the same GPU as the MI50, which has 1/2 rate DP, and unless AMD botched that feature on the consumer variant, it should be the king of LL/PRP.
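
To put the 1:2 versus 1:16 question in numbers, here is a rough sketch of theoretical peak FP64 throughput. The 60 CU count is from the announcement, and 64 lanes per CU is the standard GCN layout. The clock is a hypothetical placeholder since no clock was given in this thread, and peak FLOPS says nothing about whether PRP/LL ends up limited by memory bandwidth instead.

```python
def peak_fp64_tflops(compute_units, clock_ghz, fp64_ratio, lanes_per_cu=64):
    """Theoretical peak FP64 TFLOPS for a GCN-style GPU.

    FP32 peak is CUs x lanes x 2 FLOPs per fused multiply-add x clock,
    and FP64 peak is that figure scaled by the FP64:FP32 ratio.
    """
    fp32_gflops = compute_units * lanes_per_cu * 2 * clock_ghz
    return fp32_gflops * fp64_ratio / 1000.0


COMPUTE_UNITS = 60             # from the announcement
HYPOTHETICAL_CLOCK_GHZ = 1.75  # assumption: placeholder clock, not from the thread

for label, ratio in (("1:2", 1 / 2), ("1:16", 1 / 16)):
    tflops = peak_fp64_tflops(COMPUTE_UNITS, HYPOTHETICAL_CLOCK_GHZ, ratio)
    print(f"FP64 at {label}: {tflops:.2f} TFLOPS")
```

At the same clock the two ratios differ by a factor of 8 in peak FP64, which is why the rate matters so much for LL/PRP.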

2019-01-10, 06:13   #8
M344587487

"Composite as Heck"
Oct 2017

1433₈ Posts

Quote:
 Originally Posted by kriesel ... Does it require pcie 3.0?
It's a GFX9 card so no: https://github.com/RadeonOpenCompute/ROCm
Quote:
 Originally Posted by ROCm git readme As described above, GFX8 GPUs require PCIe 3.0 with PCIe atomics in order to run ROCm. In particular, the CPU and every active PCIe point between the CPU and GPU require support for PCIe 3.0 and PCIe atomics. The CPU root must indicate PCIe AtomicOp Completion capabilities and any intermediate switch must indicate PCIe AtomicOp Routing capabilities. ... Beginning with ROCm 1.8, GFX9 GPUs (such as Vega 10) no longer require PCIe atomics. We have similarly opened up more options for number of PCIe lanes. GFX9 GPUs can now be run on CPUs without PCIe atomics and on older PCIe generations, such as PCIe 2.0. This is not supported on GPUs below GFX9, e.g. GFX8 cards in the Fiji and Polaris families.

2019-01-10, 06:51   #9
SELROC

5·43² Posts

Quote:
 Originally Posted by M344587487 It's a GFX9 card so no: https://github.com/RadeonOpenCompute/ROCm

PCIe 1.0 slots are limited in speed, and this affects GEC speed. It's faster with PCIe 3.0.
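
For scale, a small sketch of the theoretical one-direction link bandwidth per PCIe generation, using the standard transfer rates and line encodings. How much of this a given gpuowl run actually needs is not quantified in this thread, so this only shows the ceiling of each link.

```python
# Transfer rate in GT/s and encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    "1.0": (2.5, 8 / 10),     # 8b/10b encoding
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gbs(generation, lanes=16):
    """Theoretical one-direction bandwidth in GB/s for a PCIe link."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # Each transfer moves one bit per lane; divide by 8 to get bytes.
    return gt_per_s * efficiency / 8 * lanes


for gen in PCIE_GENERATIONS:
    print(f"PCIe {gen}: x16 = {link_bandwidth_gbs(gen):.2f} GB/s, "
          f"x1 = {link_bandwidth_gbs(gen, lanes=1):.3f} GB/s")
```

A x16 link roughly doubles with each generation, so a card in a PCIe 1.0 slot has about a quarter of the host link bandwidth it would have on PCIe 3.0.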

2019-01-12, 12:48   #10
nomead

"Sam Laur"
Dec 2018
Turku, Finland

2×3×5×11 Posts

https://twitter.com/RyanSmithAT/stat...59608371175424

"FP64 is not among the couple of features they dialed back for the consumer card."

So if this is indeed true, that gets me a bit excited.

2019-01-12, 12:48 #10
M344587487

"Composite as Heck"
Oct 2017

3×5×53 Posts

Quote:
 Originally Posted by nomead https://twitter.com/RyanSmithAT/stat...59608371175424 "FP64 is not among the couple of features they dialed back for the consumer card." So if this is indeed true, that gets me a bit excited.
Fingers crossed. If it has the full 1:2 ratio, does that mean we can potentially saturate the memory at lower core clocks, or even do TF with the extra headroom at higher clocks? I wonder if it's possible to assign some CUs to gpuowl and others to mfakto. Is SR-IOV (or an equivalent) needed for that? I have doubts SR-IOV would make it to the consumer version.

