mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   YAFU (https://www.mersenneforum.org/forumdisplay.php?f=96)
-   -   AVX-ECM (https://www.mersenneforum.org/showthread.php?t=25056)

EdH 2020-01-03 04:19

I had been using cat /proc/cpuinfo and lscpu, but really appreciate the egrep line.

I have discovered that if I try to get a new python3 notebook, it invariably has no avx512. However, if I "Connect" the "Welcome" page, I get the full list of avx512bw/cd/dq/f/vl. I hope to build a Colab AVX-ECM instance soon.

PhilF 2020-01-03 19:01

[QUOTE=EdH;534088]I had been using cat /proc/cpuinfo and lscpu, but really appreciate the egrep line.

I have discovered that if I try to get a new python3 notebook, it invariably has no avx512. However, if I "Connect" the "Welcome" page, I get the full list of avx512bw/cd/dq/f/vl. I hope to build a Colab AVX-ECM instance soon.[/QUOTE]

I think it is more random than that. I always connect through the Welcome page, but the CPU I get is hit-or-miss between Haswell, Broadwell, or Skylake. Only the Skylake has AVX-512. If I need a Skylake, right after I connect I use lscpu to check, then I reset the session until I get one. It usually takes only a few tries.
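
For anyone who would rather do that check from a notebook cell than read the lscpu output by eye, here is a minimal sketch. It assumes a Linux host that exposes /proc/cpuinfo with a "flags" line; the helper names `avx512_flags` and `has_avx512f` are made up for illustration and are not part of any tool mentioned in this thread.

```python
from pathlib import Path


def avx512_flags(cpuinfo_text):
    """Return the sorted avx512* feature flags found in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Each logical CPU has its own "flags" line; merge them all.
        if key.strip() == "flags":
            flags.update(f for f in value.split() if f.startswith("avx512"))
    return sorted(flags)


def has_avx512f(cpuinfo_text):
    """True if the AVX-512 Foundation flag is present."""
    return "avx512f" in avx512_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # assumption: Linux host
    if cpuinfo.exists():
        found = avx512_flags(cpuinfo.read_text())
        if found:
            print("AVX-512 flags:", " ".join(found))
        else:
            print("No AVX-512 flags found; reset the session and try again.")
```

If the list comes back empty, the session landed on a CPU without AVX-512 and a reset is needed, as described above.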

EdH 2020-01-03 19:47

[QUOTE=PhilF;534138]I think it is more random than that. I always connect through the Welcome page, but the CPU I get is hit-or-miss between Haswell, Broadwell, or Skylake. Only the Skylake has AVX-512. If I need a Skylake, right after I connect I use lscpu to check, then I reset the session until I get one. It usually takes only a few tries.[/QUOTE]
You seem to be quite right. I tried several without success (even from Welcome page) and then got one, but alas, I couldn't get avx-ecm to compile. GCC was 7.4.0 - changing GCC=7.4.0 in the Makefile didn't help. I'll play more later.

mathwiz 2020-01-03 19:51

[QUOTE=EdH;534140]You seem to be quite right. I tried several without success (even from Welcome page) and then got one, but alas, I couldn't get avx-ecm to compile. GCC was 7.4.0 - changing GCC=7.4.0 in the Makefile didn't help. I'll play more later.[/QUOTE]

I was able to get it to compile and run successfully on a Skylake machine by doing the following:

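[CODE]# Each "!" line runs in its own subshell, so cd must be chained with the command it applies to.
!sudo apt-get install -y libgmp-dev
!git clone https://github.com/bbuhrow/avx-ecm.git
!cd avx-ecm/ && ls
!cd avx-ecm/ && make -j 8 SKYLAKEX=1 COMPILER=gcc[/CODE]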

And running with, e.g.:

[CODE]!cd avx-ecm && ./avx-ecm "2^1277-1" 100 1000000 8[/CODE]

ATH 2020-01-03 20:16

I'm getting compile errors on an EC2 Ubuntu 18.04 instance with GCC 9.2.1.

Edit: ...and same error with GCC 7.4.0

[CODE]ubuntu@ip-172-31-1-23:/mnt-efs/z/avx-ecm$ gcc --version
gcc (Ubuntu 9.2.1-17ubuntu1~18.04.1) 9.2.1 20191102
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ubuntu@ip-172-31-1-23:/mnt-efs/z/avx-ecm$ make MAXBITS=416 COMPILER=gcc SKYLAKEX=1
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/presieve.o eratosthenes/presieve.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/count.o eratosthenes/count.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/offsets.o eratosthenes/offsets.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/primes.o eratosthenes/primes.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/roots.o eratosthenes/roots.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/linesieve.o eratosthenes/linesieve.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/soe.o eratosthenes/soe.c
eratosthenes/soe.c: In function ‘spSOE’:
eratosthenes/soe.c:273:43: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint32_t’ {aka ‘unsigned int’} [-Wformat=]
273 | printf("using %lu primes, max prime = %lu \n", sdata.pboundi, sieve_p[sdata.pboundi]); // sdata.pbound);
| ~~^ ~~~~~~~~~~~~~~~~~~~~~~
| | |
| long unsigned int uint32_t {aka unsigned int}
| %u
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/tiny.o eratosthenes/tiny.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/worker.o eratosthenes/worker.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/soe_util.o eratosthenes/soe_util.c
eratosthenes/soe_util.c: In function ‘check_input’:
eratosthenes/soe_util.c:140:17: warning: assignment to ‘__mpz_struct (*)[1]’ {aka ‘struct <anonymous> (*)[1]’} from incompatible pointer type ‘__mpz_struct *’ {aka ‘struct <anonymous> *’} [-Wincompatible-pointer-types]
140 | sdata->offset = offset;
| ^
eratosthenes/soe_util.c: In function ‘init_sieve’:
eratosthenes/soe_util.c:172:13: warning: implicit declaration of function ‘spGCD’ [-Wimplicit-function-declaration]
172 | if (spGCD(i, (uint64_t)prodN) == 1)
| ^~~~~
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/wrapper.o eratosthenes/wrapper.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o threadpool.o threadpool.c
gcc -fopenmp -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall -I. -I/projects/gmp-6.0.0a/install/include -c -o main.o main.c
<command-line>: error: expected identifier or ‘(’ before numeric constant
avx_ecm.h:107:10: note: in expansion of macro ‘MAXBITS’
107 | uint32_t MAXBITS;
| ^~~~~~~
main.c: In function ‘main’:
main.c:203:17: error: lvalue required as left operand of assignment
203 | MAXBITS = 208;
| ^
main.c:206:21: error: lvalue required as left operand of assignment
206 | MAXBITS += 208;
| ^~
main.c:211:17: error: lvalue required as left operand of assignment
211 | MAXBITS = 128;
| ^
main.c:214:21: error: lvalue required as left operand of assignment
214 | MAXBITS += 128;
| ^~
main.c:274:24: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘size_t’ {aka ‘long unsigned int’} [-Wformat=]
274 | printf("Input has %d bits, using %d threads (%d curves/thread)\n",
| ~^
| |
| int
| %ld
main.c:325:13: warning: unused variable ‘j’ [-Wunused-variable]
325 | int j;
| ^
main.c:145:9: warning: unused variable ‘nextptr’ [-Wunused-variable]
145 | char **nextptr;
| ^~~~~~~
main.c:144:14: warning: unused variable ‘j’ [-Wunused-variable]
144 | uint32_t i, j;
| ^
main.c:140:12: warning: unused variable ‘siglist’ [-Wunused-variable]
140 | uint32_t *siglist;
| ^~~~~~~
main.c:138:15: warning: unused variable ‘n’ [-Wunused-variable]
138 | bignum **f, *n;
| ^
main.c:138:11: warning: unused variable ‘f’ [-Wunused-variable]
138 | bignum **f, *n;
| ^
At top level:
main.c:50:12: warning: ‘debugctr’ defined but not used [-Wunused-variable]
50 | static int debugctr = 0;
| ^~~~~~~~
Makefile:174: recipe for target 'main.o' failed
make: *** [main.o] Error 1
ubuntu@ip-172-31-1-23:/mnt-efs/z/avx-ecm$
[/CODE]

bsquared 2020-01-03 20:45

[QUOTE=ATH;534146]I'm getting compile errors on an EC2 Ubuntu 18.04 instance with GCC 9.2.1.

Edit: ...and same error with GCC 7.4.0
[/QUOTE]

As of post #4, you don't need the MAXBITS=416 directive on the compile line. Just build with COMPILER=gcc and SKYLAKEX=1.

Sorry, I will update the readme.

mathwiz 2020-01-03 22:04

From my limited knowledge of Makefiles and gcc, I notice that the gcc options for SKYLAKEX=1 specify "-march=skylake-avx512" but not "-mtune=skylake-avx512". Is this deliberate?

bsquared 2020-01-03 22:24

[QUOTE=mathwiz;534168]From my limited knowledge of Makefiles and gcc, I notice that the gcc options for SKYLAKEX=1 specify "-march=skylake-avx512" but not "-mtune=skylake-avx512". Is this deliberate?[/QUOTE]

From [url]https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html[/url]:

[QUOTE]
[B]3.18.59 x86 Options[/B]
These ‘-m’ options are defined for the x86 family of computers.

-march=cpu-type
Generate instructions for the machine type cpu-type. In contrast to -mtune=cpu-type, which merely tunes the generated code for the specified cpu-type, -march=cpu-type allows GCC to generate code that may not run at all on processors other than the one indicated. Specifying -march=cpu-type implies -mtune=cpu-type.
[/QUOTE]

So according to the last sentence there, we don't need mtune if we've already put in march. I ran some experiments anyway and didn't see any speed difference when adding mtune.

I'll also mention that if you have access to Intel's compiler, it seems to do a much better job with this code. For the 624-bit size, for instance, I get 8 stage 1 curves at B1=1e6 in 8 seconds with gcc versus 7 seconds with icc.

Even thinking about building a statically-linked executable gives me a headache though...

ATH 2020-01-04 00:57

Thanks, it worked. I tested it in an EC2 instance with 1 core / 2 threads (Xeon Cascade Lake) on a 402-bit prime number:

[U]160 curves at B1=1,000,000:[/U]
GMP-ECM: [B]300.6 sec[/B]
AVX-ECM 1 thread: [B]179.5 sec[/B]
AVX-ECM 2 threads: [B]144.8 sec[/B]


I created a 402-bit composite number with a 30-digit (97-bit) factor and tested it 10 times with B1=1,000,000 (35-digit level) to see how fast the factor was found with GMP-ECM and AVX-ECM:

[CODE]
GMP-ECM      AVX-ECM
curve 23     curves 0-15
curve 34     curves 16-31
curve 48     curves 32-47
curve 81     curves 48-63 (found twice)
curve 89     curves 64-79
curve 129    curves 144-159
curve 135    curves 144-159
curve 168    curves 160-175 (found twice)
curve 195    curves 384-399
curve 264    curves 416-431[/CODE]

So, based on the speeds from the first test, GMP-ECM found the factor 10 times in 2191 seconds and AVX-ECM found the factor 10 (12) times in 1419 seconds.

I took the 12 sigma values for which AVX-ECM found the factor and tested them in GMP-ECM, and it found the factor as well with all 12 sigmas.
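
The 2191 and 1419 second totals can be reproduced from the table and the per-curve speeds of the first test. The sketch below rests on two assumptions that are not stated in the post but do match the quoted totals: GMP-ECM curve numbers are counted from 1 (a hit on curve 23 costs 23 curves), and AVX-ECM curve ranges are counted from 0 with the whole batch finishing before the factor is reported (a hit in curves 0-15 costs 16 curves). The 2-thread AVX-ECM timing is used for the per-curve speed.

```python
# Curve on which GMP-ECM found the factor in each of the 10 runs.
gmp_ecm_hits = [23, 34, 48, 81, 89, 129, 135, 168, 195, 264]
# Last curve index of the AVX-ECM batch that found the factor in each run.
avx_ecm_batch_ends = [15, 31, 47, 63, 79, 159, 159, 175, 399, 431]

# Per-curve cost taken from the 160-curve benchmark.
GMP_SEC_PER_CURVE = 300.6 / 160
AVX_SEC_PER_CURVE = 144.8 / 160  # assumption: the 2-thread timing applies

# Assumption: GMP-ECM curves are 1-based, AVX-ECM batch ranges are 0-based.
gmp_curves = sum(gmp_ecm_hits)
avx_curves = sum(end + 1 for end in avx_ecm_batch_ends)

gmp_total = gmp_curves * GMP_SEC_PER_CURVE
avx_total = avx_curves * AVX_SEC_PER_CURVE

print(f"GMP-ECM: {gmp_curves} curves, about {gmp_total:.0f} s")
print(f"AVX-ECM: {avx_curves} curves, about {avx_total:.0f} s")
```

AVX-ECM ends up running more curves in total, because every hit is charged for a full batch, but the lower cost per curve more than makes up for it.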

EdH 2020-01-04 03:17

[QUOTE=mathwiz;534142]I was able to get it to compile and run successfully on a Skylake machine by doing the following:

[CODE]!sudo apt-get install -y libgmp-dev
!git clone https://github.com/bbuhrow/avx-ecm.git
!cd avx-ecm/ && ls
!cd avx-ecm/ && make -j 8 SKYLAKEX=1 COMPILER=gcc[/CODE]

And running with, e.g.:

[CODE]!cd avx-ecm && ./avx-ecm "2^1277-1" 100 1000000 8[/CODE][/QUOTE]
Thanks! I got a Colab session to work. I'll play a little more and then make a "How I . . ." for my blog section.

bsquared 2020-01-04 03:37

[QUOTE=ATH;534182]Thanks, it worked. I tested it in an EC2 instance with 1 core / 2 threads (Xeon Cascade Lake) on a 402-bit prime number:

[U]160 curves at B1=1,000,000:[/U]
GMP-ECM: [B]300.6 sec[/B]
AVX-ECM 1 thread: [B]179.5 sec[/B]
AVX-ECM 2 threads: [B]144.8 sec[/B]


I created a 402-bit composite number with a 30-digit (97-bit) factor and tested it 10 times with B1=1,000,000 (35-digit level) to see how fast the factor was found with GMP-ECM and AVX-ECM:

[CODE]
GMP-ECM      AVX-ECM
curve 23     curves 0-15
curve 34     curves 16-31
curve 48     curves 32-47
curve 81     curves 48-63 (found twice)
curve 89     curves 64-79
curve 129    curves 144-159
curve 135    curves 144-159
curve 168    curves 160-175 (found twice)
curve 195    curves 384-399
curve 264    curves 416-431[/CODE]

So, based on the speeds from the first test, GMP-ECM found the factor 10 times in 2191 seconds and AVX-ECM found the factor 10 (12) times in 1419 seconds.

I took the 12 sigma values for which AVX-ECM found the factor and tested them in GMP-ECM, and it found the factor as well with all 12 sigmas.[/QUOTE]

Fantastic, thank you for those tests!

Decent speedup from the hyperthread - I haven't had a chance to test that yet so thanks again.


All times are UTC.