mersenneforum.org  

Old 2020-01-03, 04:19   #23
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

2²·3·5·59 Posts

I had been using cat /proc/cpuinfo and lscpu, but really appreciate the egrep line.

I have discovered that if I try to get a new python3 notebook, it invariably has no avx512. However, if I "Connect" the "Welcome" page, I get the full list of avx512bw/cd/dq/f/vl. I hope to build a Colab AVX-ECM instance soon.
Old 2020-01-03, 19:01   #24
PhilF
 
Feb 2005
Colorado

577 Posts

Quote:
Originally Posted by EdH
I had been using cat /proc/cpuinfo and lscpu, but really appreciate the egrep line.

I have discovered that if I try to get a new python3 notebook, it invariably has no avx512. However, if I "Connect" the "Welcome" page, I get the full list of avx512bw/cd/dq/f/vl. I hope to build a Colab AVX-ECM instance soon.
I think it is more random than that. I always connect through the Welcome page, but the CPU I get is hit-or-miss between Haswell, Broadwell, or Skylake. Only the Skylake has AVX-512. If I need a Skylake, right after I connect I use lscpu to check, then I reset the session until I get one. It usually takes only a few tries.
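
The lscpu check can also be done from a notebook cell. Below is a minimal Python sketch, assuming a Linux runtime that exposes /proc/cpuinfo; the helper names `avx512_flags` and `has_full_avx512` and the `WANTED` set are made up for illustration and are not part of avx-ecm or Colab.

```python
from pathlib import Path

# AVX-512 subsets mentioned earlier in the thread (avx512bw/cd/dq/f/vl).
WANTED = {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}


def avx512_flags(cpuinfo_text):
    """Return the sorted AVX-512 flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            return sorted(f for f in value.split() if f.startswith("avx512"))
    return []


def has_full_avx512(cpuinfo_text):
    """True when every flag in WANTED appears in the cpuinfo text."""
    return WANTED.issubset(avx512_flags(cpuinfo_text))


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # Fall back to empty text on systems without /proc/cpuinfo.
    text = path.read_text() if path.exists() else ""
    found = avx512_flags(text)
    print("AVX-512 flags:", " ".join(found) if found else "none")
    print("All five subsets present:", has_full_avx512(text))
```

If the second line prints False, that is the cue to reset the session and try again.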
Old 2020-01-03, 19:47   #25
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

2²×3×5×59 Posts

Quote:
Originally Posted by PhilF
I think it is more random than that. I always connect through the Welcome page, but the CPU I get is hit-or-miss between Haswell, Broadwell, or Skylake. Only the Skylake has AVX-512. If I need a Skylake, right after I connect I use lscpu to check, then I reset the session until I get one. It usually takes only a few tries.
You seem to be quite right. I tried several without success (even from Welcome page) and then got one, but alas, I couldn't get avx-ecm to compile. GCC was 7.4.0 - changing GCC=7.4.0 in the Makefile didn't help. I'll play more later.
Old 2020-01-03, 19:51   #26
mathwiz
 
Mar 2019

10001111₂ Posts

Quote:
Originally Posted by EdH
You seem to be quite right. I tried several without success (even from Welcome page) and then got one, but alas, I couldn't get avx-ecm to compile. GCC was 7.4.0 - changing GCC=7.4.0 in the Makefile didn't help. I'll play more later.
I was able to get it to compile and run successfully on a Skylake machine by doing the following:

Code:
!sudo apt-get install libgmp-dev
!git clone https://github.com/bbuhrow/avx-ecm.git
!cd avx-ecm/ && ls
# each ! line runs in its own subshell, so the cd has to be repeated for make
!cd avx-ecm/ && make -j 8 SKYLAKEX=1 COMPILER=gcc
And running with, e.g.:

Code:
!cd avx-ecm && ./avx-ecm "2^1277-1" 100 1000000 8
Old 2020-01-03, 20:16   #27
ATH
Einyen
 
Dec 2003
Denmark

3·17·59 Posts

I'm getting compiling errors on an EC2 Ubuntu 18.04 instance with GCC 9.2.1

Edit: ...and same error with GCC 7.4.0

Code:
ubuntu@ip-172-31-1-23:/mnt-efs/z/avx-ecm$ gcc --version
gcc (Ubuntu 9.2.1-17ubuntu1~18.04.1) 9.2.1 20191102
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ubuntu@ip-172-31-1-23:/mnt-efs/z/avx-ecm$ make MAXBITS=416 COMPILER=gcc SKYLAKEX=1
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/presieve.o eratosthenes/presieve.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/count.o eratosthenes/count.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/offsets.o eratosthenes/offsets.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/primes.o eratosthenes/primes.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/roots.o eratosthenes/roots.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/linesieve.o eratosthenes/linesieve.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/soe.o eratosthenes/soe.c
eratosthenes/soe.c: In function ‘spSOE’:
eratosthenes/soe.c:273:43: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint32_t’ {aka ‘unsigned int’} [-Wformat=]
  273 |   printf("using %lu primes, max prime = %lu  \n", sdata.pboundi, sieve_p[sdata.pboundi]); // sdata.pbound);
      |                                         ~~^                      ~~~~~~~~~~~~~~~~~~~~~~
      |                                           |                             |
      |                                           long unsigned int             uint32_t {aka unsigned int}
      |                                         %u
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/tiny.o eratosthenes/tiny.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/worker.o eratosthenes/worker.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/soe_util.o eratosthenes/soe_util.c
eratosthenes/soe_util.c: In function ‘check_input’:
eratosthenes/soe_util.c:140:17: warning: assignment to ‘__mpz_struct (*)[1]’ {aka ‘struct <anonymous> (*)[1]’} from incompatible pointer type ‘__mpz_struct *’ {aka ‘struct <anonymous> *’} [-Wincompatible-pointer-types]
  140 |   sdata->offset = offset;
      |                 ^
eratosthenes/soe_util.c: In function ‘init_sieve’:
eratosthenes/soe_util.c:172:13: warning: implicit declaration of function ‘spGCD’ [-Wimplicit-function-declaration]
  172 |         if (spGCD(i, (uint64_t)prodN) == 1)
      |             ^~~~~
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o eratosthenes/wrapper.o eratosthenes/wrapper.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o threadpool.o threadpool.c
gcc -fopenmp  -DMAXBITS=416 -g -O3 -march=skylake-avx512 -DSKYLAKEX -Wall  -I. -I/projects/gmp-6.0.0a/install/include -c -o main.o main.c
<command-line>: error: expected identifier or ‘(’ before numeric constant
avx_ecm.h:107:10: note: in expansion of macro ‘MAXBITS’
  107 | uint32_t MAXBITS;
      |          ^~~~~~~
main.c: In function ‘main’:
main.c:203:17: error: lvalue required as left operand of assignment
  203 |         MAXBITS = 208;
      |                 ^
main.c:206:21: error: lvalue required as left operand of assignment
  206 |             MAXBITS += 208;
      |                     ^~
main.c:211:17: error: lvalue required as left operand of assignment
  211 |         MAXBITS = 128;
      |                 ^
main.c:214:21: error: lvalue required as left operand of assignment
  214 |             MAXBITS += 128;
      |                     ^~
main.c:274:24: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘size_t’ {aka ‘long unsigned int’} [-Wformat=]
  274 |     printf("Input has %d bits, using %d threads (%d curves/thread)\n",
      |                       ~^
      |                        |
      |                        int
      |                       %ld
main.c:325:13: warning: unused variable ‘j’ [-Wunused-variable]
  325 |         int j;
      |             ^
main.c:145:9: warning: unused variable ‘nextptr’ [-Wunused-variable]
  145 |  char **nextptr;
      |         ^~~~~~~
main.c:144:14: warning: unused variable ‘j’ [-Wunused-variable]
  144 |  uint32_t i, j;
      |              ^
main.c:140:12: warning: unused variable ‘siglist’ [-Wunused-variable]
  140 |  uint32_t *siglist;
      |            ^~~~~~~
main.c:138:15: warning: unused variable ‘n’ [-Wunused-variable]
  138 |  bignum **f, *n;
      |               ^
main.c:138:11: warning: unused variable ‘f’ [-Wunused-variable]
  138 |  bignum **f, *n;
      |           ^
At top level:
main.c:50:12: warning: ‘debugctr’ defined but not used [-Wunused-variable]
   50 | static int debugctr = 0;
      |            ^~~~~~~~
Makefile:174: recipe for target 'main.o' failed
make: *** [main.o] Error 1
ubuntu@ip-172-31-1-23:/mnt-efs/z/avx-ecm$

Last fiddled with by ATH on 2020-01-03 at 20:30
Old 2020-01-03, 20:45   #28
bsquared
 
"Ben"
Feb 2007

6441₈ Posts

Quote:
Originally Posted by ATH
I'm getting compiling errors on an EC2 Ubuntu 18.04 instance with GCC 9.2.1

Edit: ...and same error with GCC 7.4.0
As of post #4, you don't need the MAXBITS=416 directive on the compile line. Just build with COMPILER=gcc and SKYLAKEX=1.

Sorry, I will update the readme.
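
For anyone wondering why MAXBITS=416 breaks the build: the make line turns it into -DMAXBITS=416, and the compiler output above shows that MAXBITS is now an ordinary variable in avx_ecm.h, so the macro rewrites the variable name itself. The Python sketch below is only a toy textual stand-in for the C preprocessor (it ignores strings, comments and function-like macros), and `expand_object_macro` is a hypothetical helper name.

```python
import re


def expand_object_macro(source, name, value):
    """Replace whole-token occurrences of `name` with `value`, roughly as -Dname=value would."""
    return re.sub(rf"\b{re.escape(name)}\b", value, source)


# Lines quoted in the compiler output (avx_ecm.h:107 and main.c:203).
declaration = "uint32_t MAXBITS;"
assignment = "MAXBITS = 208;"

print(expand_object_macro(declaration, "MAXBITS", "416"))  # uint32_t 416;
print(expand_object_macro(assignment, "MAXBITS", "416"))   # 416 = 208;
```

The first result is what triggers "expected identifier or '(' before numeric constant", and the second is the "lvalue required as left operand of assignment" error.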
Old 2020-01-03, 22:04   #29
mathwiz
 
Mar 2019

11·13 Posts

From my limited knowledge of Makefiles and gcc, I notice that the gcc options for SKYLAKEX=1 specify "-march=skylake-avx512" but not "-mtune=skylake-avx512". Is this deliberate?
Old 2020-01-03, 22:24   #30
bsquared
 
"Ben"
Feb 2007

3,361 Posts

Quote:
Originally Posted by mathwiz
From my limited knowledge of Makefiles and gcc, I notice that the gcc options for SKYLAKEX=1 specify "-march=skylake-avx512" but not "-mtune=skylake-avx512". Is this deliberate?
From https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html:

Quote:
3.18.59 x86 Options
These ‘-m’ options are defined for the x86 family of computers.

-march=cpu-type
Generate instructions for the machine type cpu-type. In contrast to -mtune=cpu-type, which merely tunes the generated code for the specified cpu-type, -march=cpu-type allows GCC to generate code that may not run at all on processors other than the one indicated. Specifying -march=cpu-type implies -mtune=cpu-type.
So according to the last sentence there, we don't need mtune if we've already put in march. I ran some experiments anyway and didn't see any speed difference when adding mtune.

I'll also mention that if you have access to Intel's compiler, it seems to do a much better job with this code. For the 624-bit size, for instance, I get 8 stage 1 curves at B1=1e6 in 8 seconds with gcc versus 7 seconds with icc.

Even thinking about building a statically-linked executable gives me a headache though...
Old 2020-01-04, 00:57   #31
ATH
Einyen
 
Dec 2003
Denmark

3×17×59 Posts

Thanks, it worked. I tested it in an EC2 instance with 1 core / 2 threads (Xeon Cascadelake) on a 402 bit prime number:

160 curves at B1=1,000,000:
GMP-ECM: 300.6 sec
AVX-ECM 1 thread: 179.5 sec
AVX-ECM 2 threads: 144.8 sec
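
For the ratios, a small Python sketch that reuses only the three timings above; the dictionary and variable names are made up for illustration.

```python
# Wall-clock times reported above for 160 curves at B1=1,000,000.
timings = {
    "GMP-ECM": 300.6,
    "AVX-ECM 1 thread": 179.5,
    "AVX-ECM 2 threads": 144.8,
}
CURVES = 160

baseline = timings["GMP-ECM"]
for name, seconds in timings.items():
    per_curve = seconds / CURVES   # average seconds per curve
    speedup = baseline / seconds   # relative to GMP-ECM
    print(f"{name:18s} {per_curve:.3f} s/curve  {speedup:.2f}x vs GMP-ECM")

# Gain from running the second hyperthread on the same core.
hyperthread_gain = timings["AVX-ECM 1 thread"] / timings["AVX-ECM 2 threads"]
print(f"second hyperthread: {hyperthread_gain:.2f}x")
```

That works out to roughly 1.67x for one thread and 2.08x for two threads over GMP-ECM, with about 1.24x coming from the second hyperthread.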


I created a 402 bit composite number with a 30 digit (97 bit) factor and tested it 10 times with B1=1,000,000 (35 digit level) to see how fast the factor was found with GMP-ECM and AVX-ECM:

Code:
GMP-ECM		AVX-ECM
curve 23	curves 0-15
curve 34	curves 16-31
curve 48	curves 32-47
curve 81	curves 48-63 (found twice)
curve 89	curves 64-79
curve 129	curves 144-159
curve 135	curves 144-159
curve 168	curves 160-175 (found twice)
curve 195	curves 384-399
curve 264	curves 416-431
So based on the speed from the first test, GMP-ECM found the factor 10 times in 2191 seconds and AVX-ECM found the factor 10 (12) times in 1419 seconds.

I took the 12 sigma values for which AVX-ECM found the factor and tested them in GMP-ECM, and it found the factor with all 12 sigmas as well.
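
The two totals can be reproduced from the table and the 160-curve timings. The Python sketch below rests on two assumptions that are not stated explicitly in the post: each GMP-ECM run is charged for as many curves as its curve number, and each AVX-ECM run is charged for every curve up to the end of the 16-curve batch in which the factor appeared, at the 2-thread speed.

```python
# Per-curve cost derived from the 160-curve timings in the first test.
GMP_SECONDS_PER_CURVE = 300.6 / 160   # GMP-ECM
AVX_SECONDS_PER_CURVE = 144.8 / 160   # AVX-ECM, 2 threads

# Curve at which each of the 10 GMP-ECM runs found the factor.
gmp_hits = [23, 34, 48, 81, 89, 129, 135, 168, 195, 264]
# Last curve index of the AVX-ECM batch that contained each hit.
avx_batch_ends = [15, 31, 47, 63, 79, 159, 159, 175, 399, 431]

gmp_curves = sum(gmp_hits)
# A batch has to finish before the factor is reported, so count all of it.
avx_curves = sum(end + 1 for end in avx_batch_ends)

gmp_total = gmp_curves * GMP_SECONDS_PER_CURVE
avx_total = avx_curves * AVX_SECONDS_PER_CURVE
print(f"GMP-ECM: {gmp_curves} curves, {gmp_total:.0f} s")
print(f"AVX-ECM: {avx_curves} curves, {avx_total:.0f} s")
```

Under those assumptions the sketch gives 1166 curves and 2191 s for GMP-ECM against 1568 curves and 1419 s for AVX-ECM, matching the figures quoted.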

Last fiddled with by ATH on 2020-01-04 at 01:06
Old 2020-01-04, 03:17   #32
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

2²×3×5×59 Posts

Quote:
Originally Posted by mathwiz
I was able to get it to compile and run successfully on a Skylake machine by doing the following:

Code:
!sudo apt-get install libgmp-dev
!git clone https://github.com/bbuhrow/avx-ecm.git
!cd avx-ecm/ && ls
# each ! line runs in its own subshell, so the cd has to be repeated for make
!cd avx-ecm/ && make -j 8 SKYLAKEX=1 COMPILER=gcc
And running with, e.g.:

Code:
!cd avx-ecm && ./avx-ecm "2^1277-1" 100 1000000 8
Thanks! I got a Colab session to work. I'll play a little more and then make a "How I . . ." for my blog section.
Old 2020-01-04, 03:37   #33
bsquared
 
"Ben"
Feb 2007

6441₈ Posts

Quote:
Originally Posted by ATH
Thanks, it worked. I tested it in an EC2 instance with 1 core / 2 threads (Xeon Cascadelake) on a 402 bit prime number:

160 curves at B1=1,000,000:
GMP-ECM: 300.6 sec
AVX-ECM 1 thread: 179.5 sec
AVX-ECM 2 threads: 144.8 sec


I created a 402 bit composite number with a 30 digit (97 bit) factor and tested it 10 times with B1=1,000,000 (35 digit level) to see how fast the factor was found with GMP-ECM and AVX-ECM:

Code:
GMP-ECM		AVX-ECM
curve 23	curves 0-15
curve 34	curves 16-31
curve 48	curves 32-47
curve 81	curves 48-63 (found twice)
curve 89	curves 64-79
curve 129	curves 144-159
curve 135	curves 144-159
curve 168	curves 160-175 (found twice)
curve 195	curves 384-399
curve 264	curves 416-431
So based on the speed from the first test, GMP-ECM found the factor 10 times in 2191 seconds and AVX-ECM found the factor 10 (12) times in 1419 seconds.

I took the 12 sigma values for which AVX-ECM found the factor and tested them in GMP-ECM, and it found the factor with all 12 sigmas as well.
Fantastic, thank you for those tests!

Decent speedup from the hyperthread - I haven't had a chance to test that yet so thanks again.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.