mersenneforum.org Which SIMD flag to use for Raspberry Pi

2017-11-16, 00:46   #1
BrainStone

Nov 2017
Karlsruhe, Germany

31 Posts

Hi. I'm trying to compile the latest version of Mlucas for my Raspberry Pi Model B Rev 2. I ran all four commands to determine which SIMD type to use, but it came up empty. So which flag should I use?

Also, since my Raspi is a single core, would it be beneficial to not compile using the "-DUSE_THREADS" flag?

I hope you can help me out :)

2017-11-16, 09:18   #2
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

13272₈ Posts

Quote:
 Originally Posted by BrainStone Hi. I'm trying to compile the latest version of Mlucas for my Raspberry Pi Model B Rev 2. I ran all four commands to determine which SIMD type to use, but it came up empty. So which flag should I use? Also, since my Raspi is a single core, would it be beneficial to not compile using the "-DUSE_THREADS" flag? I hope you can help me out :)
The Raspberry Pi B2 is ARM-v7. The SIMD code is for ARM-v8. The Raspberry Pi B2 has 4 cores.

2017-11-16, 16:32   #3
BrainStone

Nov 2017
Karlsruhe, Germany

31 Posts

Quote:
 Originally Posted by henryzz The Raspberry Pi B2 is ARM-v7. The SIMD code is for ARM-v8. The Raspberry Pi B2 has 4 cores.
We are talking about the Raspberry Pi 1 B, not the 2 B.

So it is an ARM-v6 single core processor.

Anyways, I compiled it with
Code:
gcc -c -O3 -DUSE_THREADS ../src/*.c
and this resulted in this error: https://gist.github.com/464437e0ab39...ef5e2311952cc5

Last fiddled with by BrainStone on 2017-11-16 at 16:33

2017-11-16, 23:23   #4
ewmayer
∂²ω=0

Sep 2002
República de California

3·53·73 Posts

Quote:
 Originally Posted by BrainStone We are talking about the Raspberry Pi 1 B, not the 2 B So it is an ARM-v6 single core processor.
Yes, you should just omit -DUSE_THREADS, unless that single core is hyperthreaded, which seems unlikely. (My A53 v8 quad-core isn't.)

Quote:
 Anyways, I compiled it with Code: gcc -c -O3 -DUSE_THREADS ../src/*.c and this resulted in this error: https://gist.github.com/464437e0ab39...ef5e2311952cc5
Looks like HWCAP_ASIMD is only defined as of v8 (or perhaps v7, but def. not v6). What happens if you compile (just use 'gcc -c -O3 ../src/util.c') with this hacked version of has_asimd() replacing the current one at line 1882 of util.c?
Code:
int has_asimd(void)
{
    unsigned long hwcaps = getauxval(AT_HWCAP);
#ifndef HWCAP_ASIMD
    const unsigned long HWCAP_ASIMD = 0;
#endif
    if (hwcaps & HWCAP_ASIMD) {
        return 1;
    }
    return 0;
}

2017-11-16, 23:33   #5

"Kieren"
Jul 2011
In My Own Galaxy!

23654₈ Posts

Quote:
 Originally Posted by BrainStone We are talking about the Raspberry Pi 1 B, not the 2 B. So it is an ARM-v6 single core processor. Anyways, I compiled it with Code: gcc -c -O3 -DUSE_THREADS ../src/*.c and this resulted in this error: https://gist.github.com/464437e0ab39...ef5e2311952cc5
In any case, welcome to the forum. Also, thanks for a question I learned some things from.

2017-11-16, 23:45   #6
ewmayer
∂²ω=0

Sep 2002
República de California

3·53·73 Posts

BTW, in case it wasn't obvious, single-core v6 is going to be godawfully slow - my A53 quad, using the SIMD code on all 4 cores, is gonna need 2 months to double-check (DC) a single exponent ~45 million @2304K FFT length.
2017-11-17, 00:31   #7
BrainStone

Nov 2017
Karlsruhe, Germany

31 Posts

Yes, I'm aware that it's going to be slow. I kinda just want to play with my Pi since I'm not using it for anything else. Initially I was trying to use mprime and assign it only the work types that take the least effort, but since mprime doesn't run on ARM I can't use it.

Assuming I get it to work, and use the script, does it have a feature like mprime's that saves the progress so it continues after a reboot?

2017-11-17, 01:35   #8
ewmayer
∂²ω=0

Sep 2002
República de California

26527₈ Posts

Quote:
 Originally Posted by BrainStone Yes, I'm aware that it's going to be slow. I kinda just want to play with my Pi since I'm not using it for anything else. Initially I was trying to use mprime and assign it only work that takes the least amount of work. But since mprime doesn't run on arm I can't use it. Assuming I get it to work, and use the script, does it have a feature like mprime that saves the progress so it continues after reboot?
Yes, the program saves a checkpoint file every 10000 iterations - for exponent N, you will see the following 3 files:

pN.stat - text status file, initially the FFT params being used, then 1 line added at each checkpoint;
pN,qN - pair of redundant checkpoint files, program reads these automatically on restart-from-interrupt.

A worthwhile experiment might be to make 2 binaries, one with -DUSE_THREADS, the other not. Both will be restricted to 1-thread running on your Pi and one would expect the unthreaded build to be a bit faster due to no thread-management overhead, but I have seen instances where the threaded code is faster even when running 1-threaded. The automated self-tests which will provide the answer will be sufficiently slow on your system that you should probably just run them overnight.

Did the above function hack work for you?

2017-11-17, 03:11   #9
BrainStone

Nov 2017
Karlsruhe, Germany

31₁₀ Posts

It's still compiling. Takes around 3 hours to do so. I'll let you know in the morning. I'll start a second compilation with the flag active though.

2017-11-17, 03:16   #10
BrainStone

Nov 2017
Karlsruhe, Germany

31 Posts

Ok. The compilation without threads errored with this log: https://gist.github.com/a1c1a0900155...820b6c9707594e

2017-11-17, 04:09   #11
ewmayer
∂²ω=0

Sep 2002
República de California

3·53·73 Posts

Quote:
 Originally Posted by BrainStone It's still compiling. Takes around 3 hours to do so. I'll let you know in the morning. I'll start a second compilation with the flag active though.
That's why I suggested just recompiling util.c to see if the code patch works. Incremental recompilation, it's a beautiful thing. :)

Quote:
 Originally Posted by BrainStone Ok. The compilation without threads errored with this log: https://gist.github.com/a1c1a0900155...820b6c9707594e
Thanks - my bad, it's been a while since I did an unthreaded build, since that mode is now deprecated. I will need to try one locally in each of the various SIMD modes (scalar-double, SSE2, ARMv8, AVX, AVX2, AVX512) and issue a patch based on the resulting fixes.

Did your above unthreaded build attempt use the modified has_asimd() I posted? If it did and you got no errors in the compilation of util.c, that means you can go back to your original -DUSE_THREADS build (assuming you did that in a separate obj-file directory, and the resulting .o files are still around) and simply do an incremental recompile of util.c, and that should allow you to link an executable. If you did your latest build in the same obj-file directory, I suggest you create a 2nd dir strictly for threaded-build obj-files, cd to that, then first make sure compilation of util.c works, then retry the all-sourcefiles compile.
