mersenneforum.org  

Old 2019-10-16, 03:12   #45
ewmayer
2ω=0
 
Sep 2002
República de California

2×13×443 Posts

Thanks - so you're getting quite a decent speedup from using both logical cores, though I haven't a clue if the absolute timings are reasonable for the hardware in question - 2 ms/iter @192K is quite slow by (say) Haswell-and-beyond desktop-PC standards.

I suggest you proceed to the full production-run-oriented self-tests, and please post a zipped copy of the resulting self-test logfile here:

./Mlucas -s m -iters 100 -cpu 0:1 >& selftest.log
Old 2019-10-16, 04:55   #46
Dylan14
 
"Dylan"
Mar 2017

29 Posts

See attached file. Note: this is on a new session of Colab, so the processor is not the same as before. I have also attached the cpu info and cfg files.
Attached Files
File Type: txt cpuinfo.txt (2.3 KB, 54 views)
File Type: log selftest.log (64.4 KB, 49 views)
File Type: txt mlucas.cfg.txt (2.3 KB, 49 views)
Old 2019-10-16, 19:14   #47
ewmayer
2ω=0
 
Sep 2002
República de California

2·13·443 Posts

Thanks for the build & test data - I see this particular new instance supports avx-512, so you'll want to prepare a second build that invokes those inline-asm macros in the code:

gcc -c -O3 -DUSE_AVX512 -march=skylake-avx512 -DUSE_THREADS ../src/*.c >& build.log

...and use a different name for the resulting executable; you could call the 2 binaries mlucas_avx2 and mlucas_avx512, say. Running "grep avx512 /proc/cpuinfo" on whatever system you get during a particular session will tell you which binary to use. Rerun the self-tests on this new system to see what kind of speedup you get from using avx-512.
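
The same check can be made from code rather than with grep. A minimal sketch in C, assuming the contents of /proc/cpuinfo have already been read into a string: `cpuinfo_has_flag` and `pick_mlucas_binary` are hypothetical helper names (not part of Mlucas), and testing only the `avx512f` foundation flag is a simplifying assumption, since -march=skylake-avx512 also enables further AVX-512 subsets.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `flag` occurs in `text` as a whole whitespace-delimited token,
 * e.g. within the "flags" line of /proc/cpuinfo; return 0 otherwise. */
int cpuinfo_has_flag(const char *text, const char *flag)
{
    assert(text != NULL && flag != NULL);
    size_t n = strlen(flag);
    if (n == 0)
        return 0;
    for (const char *p = strstr(text, flag); p != NULL; p = strstr(p + 1, flag)) {
        /* Accept the hit only if it is not part of a longer flag name. */
        int starts_token = (p == text) || isspace((unsigned char)p[-1]);
        int ends_token = (p[n] == '\0') || isspace((unsigned char)p[n]);
        if (starts_token && ends_token)
            return 1;
    }
    return 0;
}

/* Choose between the two binary names suggested in the post.
 * Assumption: the presence of avx512f is enough to prefer the avx-512 build. */
const char *pick_mlucas_binary(const char *cpuinfo_text)
{
    return cpuinfo_has_flag(cpuinfo_text, "avx512f") ? "mlucas_avx512" : "mlucas_avx2";
}
```

Matching whole tokens is slightly stricter than the grep: a substring search for "avx512" succeeds on any AVX-512 subset flag, whereas this asks for one specific flag.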

(Wait - while working through your selftest.log data further down in this note, I came across these infoprints @7168K:

radix28_ditN_cy_dif1: No AVX-512 support; Skipping this leading radix.

So you did prepare and use an avx-512 build as per above compile flags for this set of runs? If so, that obviates the avx2-vs-avx512 parts of the commentary below.)

As to your avx2-build timings, I realized after posting my "seems slow" comment yesterday that I was thinking in terms of multicore running on hardware like my Haswell. For a single physical core running at 2 GHz, ~50 msec/iter at the current GIMPS wavefront (5120K) is not at all bad. For comparison, here is the mlucas.cfg file for all 4 physical cores (no hyperthreading on this CPU) of my 3.3 GHz Haswell. On a single core the runtimes would be perhaps ~3.5x as large, so at 5120K (say) we'd expect ~47 msec/iter, only ~10% faster than your 1-core/2-thread timings, and this is at 3.3 GHz vs your 2 GHz:
Code:
18.0
      2048  msec/iter =    5.25  ROE[avg,max] = [0.222878714, 0.312500000]  radices =  64 16 32 32  0  0  0  0  0  0
      2304  msec/iter =    5.85  ROE[avg,max] = [0.259770659, 0.375000000]  radices = 144 16 16 32  0  0  0  0  0  0
      2560  msec/iter =    6.28  ROE[avg,max] = [0.252363335, 0.312500000]  radices = 160 16 16 32  0  0  0  0  0  0
      2816  msec/iter =    7.44  ROE[avg,max] = [0.239182557, 0.312500000]  radices = 176 16 16 32  0  0  0  0  0  0
      3072  msec/iter =    8.35  ROE[avg,max] = [0.251998996, 0.312500000]  radices = 192 16 16 32  0  0  0  0  0  0
      3328  msec/iter =    9.02  ROE[avg,max] = [0.243424657, 0.312500000]  radices = 208 16 16 32  0  0  0  0  0  0
      3584  msec/iter =    9.25  ROE[avg,max] = [0.248507344, 0.312500000]  radices = 224 16 16 32  0  0  0  0  0  0
      3840  msec/iter =   10.17  ROE[avg,max] = [0.256763639, 0.343750000]  radices = 240 16 16 32  0  0  0  0  0  0
      4096  msec/iter =   10.63  ROE[avg,max] = [0.279075387, 0.343750000]  radices = 256 16 16 32  0  0  0  0  0  0
      4608  msec/iter =   12.21  ROE[avg,max] = [0.269211099, 0.343750000]  radices = 288 16 16 32  0  0  0  0  0  0
      5120  msec/iter =   13.48  ROE[avg,max] = [0.300527545, 0.375000000]  radices = 320 16 16 32  0  0  0  0  0  0
      5632  msec/iter =   15.42  ROE[avg,max] = [0.230105748, 0.281250000]  radices = 176 16 32 32  0  0  0  0  0  0
      6144  msec/iter =   17.51  ROE[avg,max] = [0.246608585, 0.312500000]  radices = 192 16 32 32  0  0  0  0  0  0
      6656  msec/iter =   18.60  ROE[avg,max] = [0.231292347, 0.312500000]  radices = 208 16 32 32  0  0  0  0  0  0
Further, using an avx-512 build on this type of instance should give a nice added speedup, perhaps as much as 1.6x. And if/when a Prime95/mprime build for these systems comes online, that should be faster still.
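
The single-core estimate quoted before the listing is simple arithmetic on one mlucas.cfg line. A minimal sketch in C, assuming the line layout shown in the listing above: `parse_cfg_line` and `single_core_estimate` are hypothetical helpers, and the 3.5x factor is the rough scaling guess from the post, not a measured value.

```c
#include <assert.h>
#include <stdio.h>

/* Extract the FFT length (in K) and msec/iter from one mlucas.cfg timing line.
 * Returns 1 on success, 0 if the line does not match the assumed layout
 * (for example the leading version line "18.0"). */
int parse_cfg_line(const char *line, int *fft_k, double *msec_per_iter)
{
    assert(line != NULL && fft_k != NULL && msec_per_iter != NULL);
    return sscanf(line, " %d msec/iter = %lf", fft_k, msec_per_iter) == 2;
}

/* Scale a multicore timing to one core using an assumed slowdown factor. */
double single_core_estimate(double multicore_msec, double slowdown)
{
    return multicore_msec * slowdown;
}
```

Fed the 5120K line from the listing with a slowdown of 3.5, this gives 13.48 x 3.5 = 47.18 msec/iter, which is the ~47 msec/iter figure used in the comparison.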

Looking more closely at your selftest.log and mlucas.cfg files, I see "Excessive level of roundoff error detected" messages for individual FFT radix sets at 2816K, 3328K, 5120K and 7168K, but in none of those cases did the skipped radix set(s) happen to be the fastest one(s) at the FFT length in question.
Old 2019-11-28, 02:09   #48
kracker
ἀβουλία
 
"Mr. Meeseeks"
Jan 2012
California, USA

5·433 Posts

Trying to compile under MSYS2/Windows, getting 'SIGHUP' undeclared errors.
Code:
../src/fermat_mod_square.c:1869:18: error: 'SIGHUP' undeclared (first use in this function)
../src/mers_mod_square.c:2382:18: error: 'SIGHUP' undeclared (first use in this function)
../src/Mlucas.c:182:21: error: 'SIGHUP' undeclared (first use in this function)
Old 2019-11-28, 02:53   #49
ewmayer
2ω=0
 
Sep 2002
República de California

2×13×443 Posts

I no longer have access to a Windows machine of any kind - perhaps SIGHUP has no proper analog in Windows? Anyhow, a quick workaround is to simply comment out any clauses giving such errors and recompile. E.g. in Mlucas.c:
Code:
void sig_handler(int signo)
{
	if (signo == SIGINT) {
		fprintf(stderr,"received SIGINT signal.\n");	sprintf(cbuf,"received SIGINT signal.\n");
	} else if(signo == SIGTERM) {
		fprintf(stderr,"received SIGTERM signal.\n");	sprintf(cbuf,"received SIGTERM signal.\n");
//	} else if(signo == SIGHUP) {
//		fprintf(stderr,"received SIGHUP signal.\n");	sprintf(cbuf,"received SIGHUP signal.\n");
	}
	// Toggle a global to allow desired code sections to detect signal-received and take appropriate action:
	MLUCAS_KEEP_RUNNING = 0;
}
...and similarly in the other 2 files which define signal handlers and are giving errors.
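
An alternative to commenting the clause out by hand is to let the preprocessor skip it wherever the platform's <signal.h> does not define SIGHUP. A minimal sketch in C, assuming a simplified handler: `signal_name` and `install_handlers` are hypothetical names, and `MLUCAS_KEEP_RUNNING` is redeclared here only so the example stands alone. This is not the Mlucas source, just one way to let the same file compile on both kinds of system.

```c
#include <signal.h>

/* Flag polled by the main loop; sig_atomic_t is the type the C standard
 * guarantees can be written safely from a signal handler. */
volatile sig_atomic_t MLUCAS_KEEP_RUNNING = 1;

/* Map a signal number to a printable name. The SIGHUP case is compiled
 * only where <signal.h> defines that macro. */
const char *signal_name(int signo)
{
    if (signo == SIGINT)  return "SIGINT";
    if (signo == SIGTERM) return "SIGTERM";
#ifdef SIGHUP
    if (signo == SIGHUP)  return "SIGHUP";
#endif
    return "unknown";
}

/* Simplified handler: only clears the global flag; any message printing
 * is left to the code that polls the flag. */
void sig_handler(int signo)
{
    (void)signo;
    MLUCAS_KEEP_RUNNING = 0;
}

/* Register the handler for every signal available on this platform.
 * Returns 0 on success, -1 if any registration fails. */
int install_handlers(void)
{
    if (signal(SIGINT, sig_handler) == SIG_ERR)  return -1;
    if (signal(SIGTERM, sig_handler) == SIG_ERR) return -1;
#ifdef SIGHUP
    if (signal(SIGHUP, sig_handler) == SIG_ERR)  return -1;
#endif
    return 0;
}
```

With the guard in place no source edits are needed per platform: the SIGHUP branch simply disappears from builds whose headers lack it.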

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.