mersenneforum.org  

2021-03-30, 15:36   #12
M344587487
"Composite as Heck"
Oct 2017
1433₈ Posts

Here's a couple of benchmarks: https://www.phoronix.com/scan.php?pa...kl-linux&num=1


And a review of the 11400, IMO the only part that looks interesting: https://www.youtube.com/watch?v=LYdHTSQxdCM


The way pricing is right now (at least in the UK), the 11400 is competitive with the 3600. It's the only CPU win Intel is likely to have for a while, but whether it is viable for 24/7 compute depends on power consumption. The most interesting thing about it for me is the Xe iGPU; it'll be interesting to see how much 14nm hurts it compared to the 10nm version, which I think we have data on somewhere.

Last fiddled with by M344587487 on 2021-04-02 at 08:19

2021-04-06, 09:17   #13
mackerel
Feb 2016
UK
110100011₂ Posts

I ended up re-ordering the 11700k after all as I now have an alternate way to obtain it. This time they have shipped it, and it should arrive tomorrow.

I did get someone elsewhere to try a quick small FFT bench with Prime95 and it was suggestive of about 50% IPC increase vs previous (non-AVX-512) Intel. Could be better, could be worse. Of course, I'll look at this in more detail myself once I get the CPU.

BTW the iGPU isn't really interesting. Although it has the newer Xe architecture, I think it has something like 1/3 the execution units of Tiger Lake, so performance is going to suck. I don't intend to do any testing on it.

2021-04-06, 10:01   #14
M344587487
"Composite as Heck"
Oct 2017
3×5×53 Posts

I didn't realise that about the iGPU, that's unfortunate. 1/3 of the execution units is terrible when you consider the die is much bigger than the 10k series and it has 2 fewer cores too. It doesn't bode well that they made a backport like this; it seems like the sort of thing they would plan to iterate on for it to make sense, but that would mean 14nm is here to stay for a while yet.


Also didn't realise 10th gen consumer doesn't have AVX512, I guess that alone could be a reason for the backport to exist.

2021-04-06, 12:21   #15
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2·5·587 Posts

Quote:
Originally Posted by mackerel
BTW the iGPU isn't really interesting. Although it has the newer Xe architecture, I think it has something like 1/3 the execution units of Tiger Lake, so performance is going to suck. I don't intend to do any testing on it.
I think it is standard practice for mobile cpus to have more execution units as they are less likely to have a discrete gpu.

2021-04-06, 14:49   #16
mackerel
Feb 2016
UK
643₈ Posts

Ok, for wider context I looked up on paper how the GPUs compare.

10th gen desktop (Comet Lake) has 24 EUs.
11th gen desktop (Rocket Lake) has 32 EUs with newer Xe architecture
11th gen mobile (Tiger Lake) has 96 EUs also on Xe architecture.

I don't know how much difference there is between Xe and previous. Assuming each EU is no worse, Rocket Lake GPU is still an upgrade over previous desktop CPUs, but nowhere near the performance of the mobile Tiger Lake parts which were getting reasonable for entry level gaming.


I believe Rocket Lake is the first mainstream desktop part Intel has released which has AVX-512. Mobile parts have had it earlier, as have HEDT/server parts. My concern was and remains that not all implementations of AVX-512 are equal. I just re-tested on Skylake-X where I got around 80% improvement over AVX2, so the 50% of Rocket Lake from others' testing is in an odd place. I'm just setting up to run some Prime95 benchmarks now so I have other system data for comparison when I drop in Rocket Lake tomorrow.

As for why backport, see Ian Cutress (TechTechPotato) video on Rocket Lake for a different look than the mass hatred from much of the techtuber area. It's more about the why than the what. Regardless of how the product is perceived it will leave Intel's engineering skillset better placed for the future. To de-risk future challenges like they currently have with manufacturing, they can co-design for different fabs and mitigate that way.

2021-04-07, 13:43   #17
mackerel
Feb 2016
UK
419 Posts

The 11700k just arrived. I dropped it in and ran a P95 bench pretty much as the 2nd thing, after CPU-Z to check the CPU model and RAM settings.

For 128k FFT tasks (one per core) it confirms the data I got elsewhere showing about 50% IPC improvement relative to Skylake+ AVX2 Intel CPUs. This is lower than the Skylake-X implementation which was over 80%. Perf/watt is about parity to AVX2 at the mobo default settings.

I think there is room for significant improvement there, as it is running 4.6 GHz at 1.35V with the AVX-512 workload (216W peak). Temps under a Noctua D15 peaked at 92C but this is not a long test, so sustained loads in summer will likely hit thermal throttle without some action. I think a power limit of, say, 150W (a number picked at random) would improve efficiency a LOT while not impacting performance nearly as much. That is an area for investigation.
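
As a rough illustration of why a lower operating point might help, here is a sketch using the common first-order approximation that dynamic power scales with frequency times voltage squared. The 216W, 4.6 GHz and 1.35V figures are from the run above; the 4.0 GHz / 1.15V target is purely hypothetical (the real voltage/frequency curve of this chip is not known here), and leakage and uncore power are ignored.

```python
# First-order dynamic power model: P ~ f * V^2 (leakage and uncore ignored).
def scaled_power(p_ref, f_ref, v_ref, f_new, v_new):
    """Estimate power at a new frequency/voltage from a reference point."""
    return p_ref * (f_new / f_ref) * (v_new / v_ref) ** 2

# Reference point reported in the post above.
P_REF, F_REF, V_REF = 216.0, 4.6, 1.35

# Hypothetical target point, NOT a measured value.
F_NEW, V_NEW = 4.0, 1.15

p_new = scaled_power(P_REF, F_REF, V_REF, F_NEW, V_NEW)

# Assumes the workload is compute-bound, so throughput tracks clock speed.
perf_ratio = F_NEW / F_REF
eff_gain = perf_ratio / (p_new / P_REF)

print(f"estimated power: {p_new:.0f} W")
print(f"relative performance: {perf_ratio:.2f}")
print(f"relative perf/watt: {eff_gain:.2f}")
```

Under this model the perf/watt gain depends only on the voltage ratio squared, which is why any voltage reduction that comes with a lower clock matters more than the clock change itself.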

If anyone wants a bigger FFT tested let me know what's interesting these days as I don't do that myself.

2021-04-07, 15:19   #18
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
1011011101110₂ Posts

Quote:
Originally Posted by mackerel
11700k just arrived and dropped it in and ran P95 bench pretty much as 2nd thing after CPU-Z to check CPU model and ram setting.

For 128k FFT tasks (one per core) it confirms the data I got elsewhere showing about 50% IPC improvement relative to Skylake+ AVX2 Intel CPUs. This is lower than the Skylake-X implementation which was over 80%. Perf/watt is about parity to AVX2 at the mobo default settings.

I think there is room for significant improvement there, as it is running 4.6 GHz at 1.35V with the AVX-512 workload (216W peak). Temps under a Noctua D15 peaked at 92C but this is not a long test, so sustained loads in summer will likely hit thermal throttle without some action. I think a power limit to e.g. 150W picking a random number would improve efficiency a LOT while not impacting performance nearly as much. That is an area for investigation.

If anyone wants a bigger FFT tested let me know what's interesting these days as I don't do that myself.
Did this test fit in the L3 cache? One or two workers might fit better. The AVX-512 benchmark could be running into bandwidth issues. More but slower cores is definitely the efficient way to go. Pushing cores to fast clocks is power inefficient (especially > 3 GHz). It is worth making sure that CPU speed is the limiting factor and not memory. If memory turns out to be the limit, I would underclock the CPU to save power.

2021-04-07, 15:25   #19
mackerel
Feb 2016
UK
419 Posts

I chose the 128k FFT size because it fits one per core with room to spare, and is about the smallest size I'd encounter in my areas of interest. RAM doesn't matter for that.
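
A back-of-envelope check of that cache claim, as a sketch only. It assumes one 8-byte double per FFT element and ignores twiddle tables and other working data, so real usage will be somewhat higher. The core count and L3 size are the published figures for the 11700K; the function name is just for illustration.

```python
# Rough cache-footprint estimate for one FFT per core.
# Assumption: 8 bytes (one double) per FFT element, nothing else counted.
BYTES_PER_ELEMENT = 8

def fft_footprint_mib(fft_len_k: int, workers: int) -> float:
    """Total FFT data in MiB for `workers` FFTs of length fft_len_k * 1024."""
    return fft_len_k * 1024 * BYTES_PER_ELEMENT * workers / 2**20

CORES = 8     # i7-11700K core count
L3_MIB = 16   # i7-11700K shared L3 size

total = fft_footprint_mib(128, CORES)
print(f"{total:.0f} MiB of FFT data vs {L3_MIB} MiB of L3")
```

The same function can be used to see roughly where larger FFT sizes stop fitting, for example `fft_footprint_mib(512, 8)`.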

Once I'm done with some other testing, my goal is to play with power limits, as that seems to be the smartest way to control the CPU without giving up performance for non-AVX-512 workloads. The system it is in will be used for many things other than prime number finding. I'll monitor the worst case AVX2 power usage and set the global limit somewhere above that, and see where that gets me.

For me 3 GHz might be more efficient but unnecessarily low outside of high core count CPUs. I feel the sweet spot is typically around 4 GHz (AVX2) as a balance of performance and efficiency, but I need to see how AVX-512 responds in more detail as that might need something different. My Skylake-X runs 2.9 GHz for AVX-512 but that does get hot really fast if you even look at the voltage control.

2021-04-07, 16:33   #20
mackerel
Feb 2016
UK
643₈ Posts

I was asked to compare non-512 performance of Rocket Lake. To disable AVX-512, is adding the following line to local.txt all I need to do?
CpuSupportsAVX512F=0

If so, the results were unexpected. Normalised performance dropped by only 13% compared to AVX-512. It was still some 25% better than Comet Lake.


Also I have to correct a previous statement. I had stated Rocket Lake was about 50% faster and Skylake-X >80% faster. That was based on a test using my Kaby Lake laptop as reference. Now I look at the numbers more, that laptop performed worse than expected relative to Coffee Lake/Comet Lake. If I use the Comet Lake results as reference, that becomes 43% and 75% improvement respectively, which is closer to what I saw with some real world LLR testing separately.
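
The percentages in this post can be cross-checked against each other with a little arithmetic. This is only a sketch, and it assumes all the figures use the same normalisation and that the 25% compares Rocket Lake AVX2 against Comet Lake AVX2. The variable names are mine.

```python
# Figures quoted in the post above, expressed as ratios.
rkl_avx2_vs_cml = 1.25    # Rocket Lake with AVX-512 disabled vs Comet Lake
avx2_drop_on_rkl = 0.13   # loss on Rocket Lake when AVX-512 is disabled

# Implied Rocket Lake AVX-512 advantage over Comet Lake.
rkl_avx512_vs_cml = rkl_avx2_vs_cml / (1 - avx2_drop_on_rkl)

# The same Rocket Lake result was ~50% over the Kaby Lake laptop and
# ~43% over Comet Lake, which implies how far the laptop lagged.
cml_vs_kbl = 1.50 / 1.43

print(f"Rocket Lake AVX-512 vs Comet Lake: {rkl_avx512_vs_cml:.2f}x")
print(f"Comet Lake vs Kaby Lake laptop:    {cml_vs_kbl:.2f}x")
```

The first ratio lands close to the 43% quoted for the Comet Lake reference, so the two sets of numbers look consistent with each other within rounding.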
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.