mersenneforum.org  

Old 2020-11-17, 22:50   #12
Ethan (EO)
 
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996

1348 Posts

Whatever they put in the iMac refresh will probably be that first high core-count part. Once that works down to the Mac mini these will be very attractive for energy efficient compute with great granularity of cost and space.

But for now, single-threaded performance is faster than any other consumer CPU out there in specint and specfp2006: 10% higher specfp2006 than the Ryzen 9 5950X in second place. All 4 big cores can run at full speed at the same time, so anything that fits the cache hierarchy well is going to be pretty zippy, but a single core can saturate main memory at ~60GB/s.

The power draw figures are at-the-wall, and the Mac Mini may not be totally optimized for platform energy consumption, so I think the efficiency is hard to evaluate at these low wattage levels.

Between this and AMD’s roadmap, we have some degree of competition in mass market CPUs for the first time in quite a while!

Quote:
Originally Posted by M344587487
https://www.anandtech.com/print/1625...pple-m1-tested


Memory bandwidth looks very nice, maybe in a generation or two they might add enough cores to make compute more viable. Power draw is high, possibly way higher than the SoC can make efficient use of, so I'm not entirely sure how to interpret the figures. Comparing to the 4800u which is in a much more efficient state seems disingenuous, comparing to the 4900HS makes more sense even though ideally you'd run both efficiently and knock a small percentage off the figures. The GPU looks nice, I wonder what GPGPU APIs Apple has neglected to kill as yet.

Last fiddled with by Ethan (EO) on 2020-11-17 at 23:00
Old 2020-11-18, 01:12   #13
M344587487
 
"Composite as Heck"
Oct 2017

764₁₀ Posts

Quote:
Originally Posted by Ethan (EO)
Whatever they put in the iMac refresh will probably be that first high core-count part.
That didn't sound right until I looked: there is precedent, in that the A12X has two more cores than the A12. Then again, it looks like the iMac has just been refreshed; is it likely they'll refresh again so soon?

Quote:
Originally Posted by Ethan (EO)
Once that works down to the Mac mini these will be very attractive for energy efficient compute with great granularity of cost and space.

But for now, single-threaded performance is faster than any other consumer CPU out there in specint and specfp2006. 10% higher specfp2006 than Ryzen 9 5950x in second place. All 4 big cores can run at full speed at the same time, so anything that fits the cache hierarchy well is going to be pretty zippy, but single core can saturate main memory at ~60GB/s.
AnandTech suggests that the cache is partitioned somehow, so the number of compute workloads that fit the M1 hardware well might not be as high as we'd like.

Quote:
Originally Posted by Ethan (EO)
The power draw figures are at-the-wall, and the Mac Mini may not be totally optimized for platform energy consumption, so I think the efficiency is hard to evaluate at these low wattage levels.
For reference, from memory, my 4700U mini-PC idles at ~4.5W and consumes ~20.5W total when the SoC target is 11W. Assuming the target matches reality, which I'm inclined to think is true, this indicates that the SODIMM DDR4 consumes ~5W under load (20.5W - 4.5W - 11W). One of the main points of LPDDR4X is that it's energy efficient, so I don't think it's unreasonable to guess that the LPDDR4X has an upper bound of ~4W under load if the SODIMM's 5W figure is accurate. With the idle figures in the article, that gives ~9W of power not consumed by the SoC when under load, so the SoC consumes ~22W for the compute MT workload in the article. That sounds like it'd be quite a bit above the M1's sweet spot, which I'm assuming is somewhere in the 5-10W range. If I'm wrong and the sweet spot is, say, 15W, then the figures turn out a bit worse for the M1, as then it's more reasonable to directly compare it to the 4800U.
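
The estimate above is plain subtraction, and a minimal sketch makes the steps easy to re-run with different guesses. It uses only the figures quoted in the post. The helper name `residual_power` is made up for illustration, and the ~31W wall figure and ~5W idle figure are merely what the poster's ~9W and ~22W imply, not numbers taken from the article.

```python
# Back-of-envelope power split using only the figures quoted in the post.
# `residual_power` is a hypothetical helper name, not from any tool.

def residual_power(total_w, idle_w, soc_w):
    """Power left over once idle draw and the SoC's share are subtracted."""
    return total_w - idle_w - soc_w

# 4700U mini-PC: ~20.5 W total, ~4.5 W idle, 11 W SoC target.
sodimm_w = residual_power(total_w=20.5, idle_w=4.5, soc_w=11.0)

# M1 Mac mini, per the post: ~9 W outside the SoC, ~22 W for the SoC.
m1_non_soc_w = 9.0
m1_soc_w = 22.0
lpddr4x_guess_w = 4.0  # the poster's guessed upper bound for LPDDR4X

# Derived values (implied by the post, not quoted from the article).
m1_wall_w = m1_non_soc_w + m1_soc_w
m1_idle_implied_w = m1_non_soc_w - lpddr4x_guess_w

print(f"SODIMM DDR4 under load: ~{sodimm_w:.1f} W")
print(f"Implied M1 wall draw:   ~{m1_wall_w:.1f} W")
print(f"Implied M1 idle draw:   ~{m1_idle_implied_w:.1f} W")
```

Changing `lpddr4x_guess_w` shows how sensitive the SoC estimate is to the memory-power guess, which is the weakest link in the chain.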

Quote:
Originally Posted by Ethan (EO)
Between this and AMD’s roadmap, we have some degree of competition in mass market CPUs for the first time in quite a while!
I hope Apple drags other ARM producers into the performance category. Apple is competition only in a loose sense, in that there's no way in hell many people will switch to or from them.
Old 2020-11-27, 18:53   #14
wagner85
 
Aug 2020

25₁₀ Posts

Hey guys,
I watched some videos on YouTube about the M1 chip.

It is a crazy improvement.
Check the comparison between the mid-2020 Intel version and the M1
at the 4:00 mark:
https://youtu.be/VXgLBa5jgr8


Has anyone seen this new chip running a PRP test in P95?
Since the RAM, CPU and GPU are all packed into the same chip, I believe we could expect big improvements.
Could you please share your thoughts on that?

How many ms/iter might it manage for a 110M exponent?
My fastest machine, an E5-2690 v0, does 7 ms/iter.
I find joy in comparing these benchmarks 😂
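
To put a ms/iter figure in context, here is a rough sketch of how it translates into wall-clock time. It assumes a PRP test on a Mersenne number needs about one squaring per bit of the exponent, and it ignores error checking and proof-generation overhead. The function name `prp_days` is made up for illustration.

```python
# Rough PRP run-time estimate from a per-iteration timing.
# Assumption: iterations ~= exponent (one modular squaring per exponent bit).

def prp_days(exponent, ms_per_iter):
    """Approximate wall-clock days for one PRP test."""
    seconds = exponent * ms_per_iter / 1000.0
    return seconds / 86400.0

# The figures from the post: a 110M exponent at 7 ms/iter.
days = prp_days(110_000_000, 7.0)
print(f"{days:.1f} days")
```

Because the estimate is linear in ms/iter, halving the iteration time halves the run time.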
Old 2020-11-27, 19:38   #15
ldesnogu
 
Jan 2008
France

543₁₀ Posts

Quote:
Originally Posted by wagner85
Has anyone seen this new chip running p95 on PRP test?
Since ram, cpu, gpu are all packed in the same chip, I believe we could expect big improvements.
Could you please share your thoughts on that?
P95 isn't an ARM application so it would run through Rosetta 2 which doesn't have support for anything beyond SSEn.

I guess Mlucas would be faster than P95 under Rosetta, but still slower than Mlucas and P95 running on the fastest x86 chips: the M1 can process 512 bits of FP data per cycle vs 1024 bits for the best x86. The M1 has nice RAM bandwidth (~58GB/s), but I'm not sure that's enough to compensate for the lack of FP width.
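
The width comparison can be sketched as double-precision lanes per cycle, set against what main memory can deliver. This is an illustration only: the 512-bit, 1024-bit and 58GB/s figures are from the post, while the 3.2 GHz clock is an assumption that is not stated in this thread, and the function names are hypothetical.

```python
# FP width vs memory bandwidth, per core clock cycle.
# ASSUMED_CLOCK_GHZ is an assumption (not stated in the thread).

DOUBLE_BITS = 64
ASSUMED_CLOCK_GHZ = 3.2

def doubles_per_cycle(fp_bits_per_cycle):
    """How many 64-bit doubles the FP units can process each cycle."""
    return fp_bits_per_cycle // DOUBLE_BITS

def bytes_per_cycle(bandwidth_gb_s, clock_ghz):
    """Main-memory bytes available per core clock cycle."""
    return bandwidth_gb_s / clock_ghz

m1_lanes = doubles_per_cycle(512)
x86_lanes = doubles_per_cycle(1024)
m1_mem_doubles = bytes_per_cycle(58.0, ASSUMED_CLOCK_GHZ) / 8.0

print(f"M1 FP lanes per cycle:       {m1_lanes}")
print(f"Best x86 FP lanes per cycle: {x86_lanes}")
print(f"M1 doubles per cycle from RAM (assumed clock): {m1_mem_doubles:.2f}")
```

Under these assumptions the FP units can consume doubles faster than RAM can supply them, so how much the narrower FP width hurts depends on how well the FFT data stays in cache.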

I'll provide some benchmarks once I know what to install to compile Mlucas on my shiny new MBP M1.
Old 2020-11-27, 20:51   #16
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT/BST)

2·2,909 Posts

Mlucas running native ARM code is the way forward. I would hope it would run quite well.
Old 2020-11-27, 22:10   #17
ewmayer
2ω=0
 
Sep 2002
República de California

2D57₁₆ Posts

Quote:
Originally Posted by ldesnogu
P95 isn't an ARM application so it would run through Rosetta 2 which doesn't have support for anything beyond SSEn.

I guess mlucas would be faster than p95 under Rosetta but still slower than mlucas and p95 running on the fastest x86 chips: M1 can process 512 bits of FP data per cycle vs 1024 bits for best x86. M1 has a nice RAM bandwidth ~58GB/s but I'm not sure that's enough to compensate for the lack of FP width.

I'll provide some benchmark once I know what to install to compile mlucas on my shiny new MBP M1.
Hi, Laurent - the Mlucas-for-ARM-128-bit-SIMD binaries I posted via the README are alas for Linux (e.g. Raspberry Pi, Odroid, Android phones with suitable CPUs), but hopefully the same build procedure used for those will work for you.

Hard at work on Mlucas v20, which will add support for p-1 factoring and PRP-proofing, but is alas way behind schedule due to various reasons - what a crazy year it's been.
Old 2020-11-29, 15:05   #18
pvn
 
Nov 2020

22 Posts

Quote:
Originally Posted by ldesnogu
Can OS X run Linux binaries? Because at the moment Linux isn't available (Parallel isn't out yet).

Docker for Arm-based Macs is now running, though not quite yet publicly available. Build a Docker image for Arm Linux and it will run on M1 Macs.
Old 2020-11-29, 21:00   #19
ewmayer
2ω=0
 
Sep 2002
República de California

26527₈ Posts

Laurent PMed me the results of a clang/llvm Mlucas build attempt. Most of the source files compiled fine, but a subset failed, all due to clang reporting "error: inline assembly requires more registers than available" for the same macro, which is a particularly register-greedy one - the clobber list for the Arm64 ASIMD version of it shows 12 of 16 GPRs and 25 of 32 vector-regs used. I am looking into it to see if there is some straightforward way to reduce the number of registers used - alas the compiler didn't include any detail in the error message about the precise *type* of registers it needed more of.

The same macro has built fine on Arm64/ASIMD in the past using GCC, so this appears to be something related to Clang optimizations and/or the OS.
Old 2020-12-01, 11:06   #20
FrankieBalla
 
"George V Phelps"
Nov 2020
San Diego, Californi

1 Posts

It turns out that in the current version of the macOS, the OS sends to Apple a hash (unique identifier) of each and every program you run, when you run it. Lots of people didn’t realize this, because it’s silent and invisible and it fails instantly and gracefully when you’re offline, but today the server got really slow and it didn’t hit the fail-fast code pat
Old 2020-12-01, 20:32   #21
ewmayer
2ω=0
 
Sep 2002
República de California

3·53·73 Posts

I misspoke re. number-of-GPRs-available above ... there are in fact 32; my Mlucas macros only ever use ones from the bottom 16, x0-x15. Laurent sent me a link to the Apple Developer page Writing Arm64 Code For Apple Platforms, which notes some GPRs are reserved for the OS, but those are all in the high 16 GPRs x16-x31, so the out-of-registers compile error is a mystery. As it happens, an old friend from my years in Cupertino, who worked for Apple most of his career until taking early retirement a few years back, is visiting me in the coming week; I will ask him if he might still have any contacts among the Apple compiler group.
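
The register accounting described here can be sketched as a small check over a clobber list. The list below is hypothetical: it only reproduces the counts mentioned in the thread (12 GPRs from x0-x15 and 25 vector registers) and is not the actual Mlucas macro, which is not shown here. The function name `classify_clobbers` is also made up.

```python
import re

# Sketch: tally Arm64 registers named in an inline-asm clobber list.
# The clobber list is a hypothetical stand-in matching the thread's counts.

def classify_clobbers(clobbers):
    """Return (gpr_numbers, vector_numbers) found in a clobber list."""
    gprs, vecs = set(), set()
    for name in clobbers:
        gpr = re.fullmatch(r"x(\d+)", name)
        vec = re.fullmatch(r"v(\d+)", name)
        if gpr:
            gprs.add(int(gpr.group(1)))
        elif vec:
            vecs.add(int(vec.group(1)))
        # entries such as "cc" and "memory" are not registers and are skipped
    return sorted(gprs), sorted(vecs)

clobbers = (["cc", "memory"]
            + [f"x{i}" for i in range(12)]    # 12 GPRs, all within x0-x15
            + [f"v{i}" for i in range(25)])   # 25 of the 32 vector registers

gprs, vecs = classify_clobbers(clobbers)
high_gprs = [r for r in gprs if r >= 16]      # the x16-x31 range noted above

print(f"GPRs used: {len(gprs)}, vector regs used: {len(vecs)}")
print(f"GPRs in x16-x31: {high_gprs}")
```

An empty `high_gprs` list matches the observation that the macro stays clear of the high GPRs, which is why the compile error remains unexplained by register reservation alone.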

Quote:
Originally Posted by FrankieBalla
It turns out that in the current version of the macOS, the OS sends to Apple a hash (unique identifier) of each and every program you run, when you run it. Lots of people didn’t realize this, because it’s silent and invisible and it fails instantly and gracefully when you’re offline, but today the server got really slow and it didn’t hit the fail-fast code pat
See my post #10 in this thread.
Old 2020-12-02, 08:59   #22
Happy5214
 
"Alexander"
Nov 2008
The Alamo City

111110010₂ Posts

Quote:
Originally Posted by ewmayer
See my post #10 in this thread.
It's a direct copy-paste of part of the quote you posted in post #10. It looks very odd/suspicious.

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.