2020-12-30, 06:22   #152
ewmayer
Sep 2002
República de California

10110101101001₂ Posts

Originally Posted by Mark Rose
It seems the M1 has some undocumented instructions that may be useful for Mlucas:
Interesting, but ugh - that kind of nonportability is only worth it if it offers huge performance benefits for one's application.

Brief update re. Mlucas-on-M1: Laurent Desnogues has a gcc-under-brew build working, and we are playing around to see what maximizes total throughput on the big+little processor pair. I need to ask him how much detail I may release publicly; for now let me just say that 4-threaded performance on the big core alone is well more than 10x that of my Odroid C2, clock-for-clock. (But the C2 ain't exactly world-beating, so that's not saying all that much, except "the M1 doesn't suck".)

I finished debugging some code mods designed to accommodate clang-on-M1's tighter-than-gcc macro-#args constraint; they are tested on my Odroid, but I am waiting to hear whether they solve his Clang build issues on M1. Clearly we want both build options so we can compare timings - the asm shouldn't care too much, but all the surrounding C code might.