mersenneforum.org  

2022-08-28, 18:43   #1
Xyzzy
Aug 2002
2³×11×97 Posts
AVX512 and Zen4 pre-release speculations

https://www.techpowerup.com/298194/l...2-optimization

2022-08-29, 21:28   #2
Mysticial
Sep 2016
2×5×37 Posts

I'll have a lot more to say once the embargo lifts. But it looks like some of the big names also have the chip. So the tear-down of the chip which I've prepared probably won't be the only one - let alone the best. Especially since I'm mostly a SIMD person with less insight into the rest of the chip.


Lots of juicy stuff to come whenever I'm allowed to post.
2022-09-03, 13:45   #3
mackerel
Feb 2016
UK
2×223 Posts

Looking forward to it. A problem with the more well known reviewers is they try to cover everything, and FP perf doesn't get much depth to it, or is tested in ways hard to relate to our interests. For the audience here more info in that area would be very interesting.
2022-09-04, 04:24   #4
LordJulius
"Doug K"
Aug 2021
California
2²×5 Posts
RE: Zen4 & AVX-512

https://hothardware.com/news/amd-ryz...dna-3-surprise


"Some details and performance expectations were also disclosed regarding Zen 4’s AVX-512 implementation. In Zen 4, AVX-512 is implemented using double-pumped 256-bit data chunks. This design decision was reportedly made to avoid large frequency fluctuations when executing AVX-512 workloads. In terms of performance, AMD is claiming a 1.3x improvement in FP32 inferencing workloads versus Zen 3, and up to a 2.5X improvements for Int8."
2022-09-04, 07:53   #5
Mysticial
Sep 2016
2·5·37 Posts

Quote:
Originally Posted by mackerel
Looking forward to it. A problem with the more well known reviewers is they try to cover everything, and FP perf doesn't get much depth to it, or is tested in ways hard to relate to our interests. For the audience here more info in that area would be very interesting.
The other problem with launch-day reviews is that they can't show the new product running software optimized for it, simply because the developers of the benchmark/game have not had the opportunity to do that work yet.

In a way, all new products are inherently disadvantaged from the start. Only months later do things actually improve - assuming the benchmark/game is still in development. That's beside the point for early adopters wanting the performance on day one, but it means launch-day numbers don't tell the whole story.


Zen4 feels a bit different this time as there's an unknown number of (individual) software devs who have gotten the chip very early - well ahead of the usual press who do the hardware reviews.

I'm actually surprised this doesn't happen more often. Prior to a launch, the vendor could send samples to the developers of all the still-in-development benchmarks to give them time to do pre-launch optimizations. That makes the launch-day reviews look better. Heck, the vendor gets some free beta-testing in the process. Slap everyone with NDAs and no one would risk their reputation to leak anything.


Quote:
Originally Posted by LordJulius
https://hothardware.com/news/amd-ryz...dna-3-surprise

"Some details and performance expectations were also disclosed regarding Zen 4’s AVX-512 implementation. In Zen 4, AVX-512 is implemented using double-pumped 256-bit data chunks. This design decision was reportedly made to avoid large frequency fluctuations when executing AVX-512 workloads. In terms of performance, AMD is claiming a 1.3x improvement in FP32 inferencing workloads versus Zen 3, and up to a 2.5X improvements for Int8."
The double pumping was entirely predictable. Nobody expected AMD to spend that much silicon on something that's hardly used yet. (along with the power implications of full 512-bit)

Besides all the new non-width-related features of AVX512, there are advantages to running 512-bit instructions on 256-bit hardware: half the number of instructions for the same amount of work - and therefore half the front-end overhead.
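
To put rough numbers on the "half the instructions" point, here is a small counting sketch. It is only a model, not a measurement: the function name, the 1024-element array, and the idea that a 512-bit instruction simply takes two passes through a 256-bit datapath are all assumptions made for illustration.

```python
def instruction_counts(num_doubles, vector_bits, datapath_bits=256):
    """Count work for one elementwise pass over an array of doubles.

    Returns (instructions_issued, datapath_passes).
    Assumption: an instruction wider than the datapath is split into
    vector_bits // datapath_bits passes (the "double pumping" model).
    """
    lanes = vector_bits // 64                        # doubles per vector register
    instructions = -(-num_doubles // lanes)          # ceiling division
    passes_per_instruction = max(1, vector_bits // datapath_bits)
    return instructions, instructions * passes_per_instruction


if __name__ == "__main__":
    for width in (256, 512):
        issued, passes = instruction_counts(1024, width)
        print(f"{width}-bit vectors: {issued} instructions issued, "
              f"{passes} passes through the 256-bit datapath")
```

In this model the execution work (datapath passes) is the same for both widths, while the number of instructions the front-end has to decode and dispatch is cut in half for 512-bit code.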

In order to fully utilize Zen3's 4 FPU pipes, you need to sustain 4 instructions/cycle. This is hard to do outside of synthetics. 4 IPC is hard to sustain in general because you're pushing against the limit of the instruction decoding and dispatch. Floating-point has long latencies and you only have 16 registers. If you play games with renaming to get around the limited # of regs, you start running up against the limit of the reorder window.
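
The register-pressure argument can be sketched with a latency-times-throughput estimate. Everything numeric below except the 4 pipes and 16 registers from the post is an assumption: the 4-cycle latency is a placeholder rather than a measured Zen3 figure, and the "2 cycles per 512-bit instruction" case is just the double-pumping model applied with the same assumed latency.

```python
import math


def independent_ops_needed(latency_cycles, pipes, cycles_per_instruction=1):
    """Independent in-flight operations needed to keep every pipe busy.

    Latency-times-throughput estimate: each pipe accepts a new
    instruction every cycles_per_instruction cycles, and each result
    takes latency_cycles to become available.
    """
    issue_rate = pipes / cycles_per_instruction      # instructions per cycle
    return math.ceil(latency_cycles * issue_rate)


ASSUMED_LATENCY = 4   # assumption: placeholder floating-point latency in cycles
PIPES = 4             # from the post: Zen3's 4 FPU pipes
REGISTERS = 16        # from the post: 16 vector registers

if __name__ == "__main__":
    full_width = independent_ops_needed(ASSUMED_LATENCY, PIPES)
    double_pumped = independent_ops_needed(ASSUMED_LATENCY, PIPES,
                                           cycles_per_instruction=2)
    print(f"1 cycle per instruction:  {full_width} independent ops, "
          f"{REGISTERS - full_width} registers left for operands")
    print(f"2 cycles per instruction: {double_pumped} independent ops, "
          f"{REGISTERS - double_pumped} registers left for operands")
```

Under these assumptions, saturating 4 pipes at 4 IPC would tie up every one of the 16 registers as a destination, which is why renaming and the reorder window end up carrying the load. Halving the required issue rate halves the number of independent chains needed.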

Last fiddled with by Mysticial on 2022-09-04 at 07:55
2022-09-15, 01:08   #6
Mysticial
Sep 2016
370₁₀ Posts

Just wondering:

Where should I post my Zen4 AVX512 breakdown when the embargo lifts?

If I post it here, it's already buried beneath a bunch of posts. If I post a new thread in the Hardware subforum, I won't be able to fix/annotate errors after the edit grace period. If I post a new thread under this section, the title will seem redundant of this one.




Side Note: I still don't have a solid date for when the embargo lifts. The date that my AMD contact gave me is almost a week before the date that I'm reading on Twitter and hearing from my various media contacts. So when launch gets closer, I'm going to ask my contact again.
2022-09-15, 03:05   #7
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
5562₁₀ Posts

I'd post it right here in this thread. Anyone browsing this subforum will see the new post in the thread, and anyone who reads the site via "New Posts" will also see it. The thread title catches plenty of attention from those who like this sort of thing.
2022-09-15, 03:15   #8
Mysticial
Sep 2016
2×5×37 Posts

Problem is that I'll be linking externally.
2022-09-15, 03:16   #9
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
1010110111010₂ Posts

By all means, make your own clean thread!
If there are edits you wish to make later, you can get mod attention and one of us (I volunteer, at minimum) will paste in the edits you request.
2022-09-15, 03:28   #10
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
1F94₁₆ Posts

Quote:
Originally Posted by Mysticial
If I post it here, it's already buried beneath a bunch of posts. If I post a new thread in the Hardware subforum, I won't be able to fix/annotate errors after the edit grace period. If I post a new thread under this section, the title will seem redundant of this one.
How about renaming this thread to "AVX-512 and Zen4 pre-release speculations",
then start a new thread "AVX-512 and Zen4 details"?

Or whatever you want to do. Those that care will find your posts!
2022-09-15, 11:27   #11
Xyzzy
Aug 2002
2158₁₆ Posts

Quote:
Originally Posted by Mysticial
If I post a new thread in the Hardware subforum, I won't be able to fix/annotate errors after the edit grace period.
Are you sure about that?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
