Old 2020-08-03, 14:59   #38
rogue
 
 
"Mark"
Apr 2003
Between here and the

5,953 Posts

Quote:
Originally Posted by Happy5214 View Post
I'm planning to buy an ODROID N2+ in the coming days for the express purpose of learning and testing ARM assembly (and as a replacement for my old RPi 2 B+ that hasn't worked in years). I'll let you know when it arrives if I feel up to porting the ASM routines.
Cool. I suggest that you start with the fpu_mulmod function. That will likely be the easiest one to port, and most of the others can be built on top of it in one way or another. Next up would be the 4x version of an fpu routine, although I do not know what gains you can get on ARM by doing more than one mulmod concurrently, and I don't know how many is optimal. I suspect that ARM does not have an 80-bit FPU, so with 64-bit doubles it will be limited to p < 2^52. I also do not know if ARM has any vector instructions like SSE or AVX on x86. You will notice that Worker.h has some built-in checks for AVX compatibility. You will likely need to add something similar to control the ARM code paths.
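To make the p < 2^52 point concrete, here is a rough C sketch of a mulmod that estimates the quotient with 64-bit doubles. It is not the fpu_mulmod from the ASM sources, just an illustration of the same idea under the assumption that only double precision is available. The names mulmod_double and SKETCH_HAVE_ARM are made up for this example; the compiler macros __aarch64__ and __ARM_NEON are the usual predefined ones and could serve as a starting point for a Worker.h style check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compile-time switch (not from Worker.h): shows how an ARM
   code path might be selected using the compiler's predefined macros. */
#if defined(__aarch64__) || defined(__ARM_NEON)
#define SKETCH_HAVE_ARM 1
#else
#define SKETCH_HAVE_ARM 0
#endif

/* Computes (a * b) mod p using a double-precision quotient estimate.
   Assumes a < p, b < p and p < 2^52 so that the estimate is off by at
   most one and a single correction step is enough. */
uint64_t mulmod_double(uint64_t a, uint64_t b, uint64_t p)
{
    assert(p > 0 && p < (UINT64_C(1) << 52));
    assert(a < p && b < p);

    /* Approximate quotient floor(a*b/p); a, b and p convert exactly
       because they all fit in the 53-bit mantissa of a double. */
    double q = (double)a * (double)b / (double)p;
    uint64_t qi = (uint64_t)q;

    /* Exact remainder modulo 2^64. The true value lies in [-p, 2p),
       so the wrapped result is unambiguous. */
    uint64_t r = a * b - qi * p;

    if (r >> 63)        /* top bit set: estimate was one too high */
        r += p;
    else if (r >= p)    /* estimate was one too low */
        r -= p;
    return r;
}
```

For example, mulmod_double(3, 4, 7) gives 5. The 4x routines would do the same thing for four (a, b) pairs at once, and whether that pays off on ARM is something only timing on the actual board will show.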