mersenneforum.org  

2020-11-13, 18:55   #34
bsquared

Quote:
Originally Posted by James Heinrich
I was up and running yesterday, until I decided to try compiling the sievers and trying the wip version for AVX2. I am back up and running, my pet number benchmark came back at 1423s, a slight 4% difference from the 1368s I got yesterday, but in the wrong direction... wip+AVX2 is now running slower than base without AVX2.

I was very surprised to verify this result; apparently it's been a while since I've compared builds. I had thought the AVX2 version was faster. I ran benchmarks with different build options and with both the gcc and icc compilers to confirm.

The input is a random 91-digit (300-bit) RSA number factored with SIQS on a Xeon 5122 (very similar to your CPU). Timings are for 16 threads and cover only the sieving portion.

gcc 7.3
SSE41: 129 sec
AVX2: 135 sec
AVX2-BMI2: 135 sec
AVX512: 96 sec

And the same with icc compiler:
SSE41: broken!
AVX2: 135 sec
AVX2-BMI2: 135 sec
AVX512: 92 sec
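
For a quick comparison, the timings above can be turned into relative speedups. This is only a minimal Python sketch: the `speedup` and `report` helpers are names made up for illustration, and the numbers are simply the ones listed above (the broken icc SSE41 build is marked as `None`).

```python
# Sieving times in seconds, copied from the two lists above.
# None marks the build that was reported as broken.
timings = {
    "gcc 7.3": {"SSE41": 129, "AVX2": 135, "AVX2-BMI2": 135, "AVX512": 96},
    "icc": {"SSE41": None, "AVX2": 135, "AVX2-BMI2": 135, "AVX512": 92},
}


def speedup(baseline_sec, other_sec):
    """Return how many times faster other_sec is than baseline_sec."""
    return baseline_sec / other_sec


def report(timings, baseline="AVX2"):
    """Build one text line per (compiler, build) pair, relative to baseline."""
    lines = []
    for compiler, builds in timings.items():
        base = builds[baseline]
        for build, sec in builds.items():
            if sec is None:
                lines.append(f"{compiler:8s} {build:10s} broken")
            else:
                ratio = speedup(base, sec)
                lines.append(
                    f"{compiler:8s} {build:10s} {sec:4d} s  {ratio:.2f}x vs {baseline}"
                )
    return lines


if __name__ == "__main__":
    print("\n".join(report(timings)))
```

With these numbers the AVX512 builds come out roughly 1.4x faster than the AVX2 builds, while the gcc AVX2 build is a few percent slower than the gcc SSE41 build.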

By the way, the Xeon 5120 CPU should not only have AVX2, but also AVX512. That should get you a substantial speed improvement with SIQS (as above). Add SKYLAKEX=1 to the WIP version build line to enable it for that cpu.

Regarding the .ini file error spewing, you can ignore that, and I will add it to the list to fix.

Finally regarding the SIQS segfaults in trunk, it is high time I merged wip back into trunk, which should take care of that.

Apologies for all of the confusion and thanks for all of the testing!

2020-11-13, 19:03   #35
bsquared

Quote:
Originally Posted by James Heinrich
I was up and running yesterday, until I decided to try compiling the sievers and trying the wip version for AVX2. I am back up and running, my pet number benchmark came back at 1423s, a slight 4% difference from the 1368s I got yesterday, but in the wrong direction... wip+AVX2 is now running slower than base without AVX2.

I went back and re-read the thread, and I see that your pet number is being factored by NFS. So the fact that it was slightly slower with wip+AVX2 actually had nothing to do with build options, because those mostly only impact SIQS, which isn't being used. A 4% variation is probably to be expected; for instance, the NFS polynomial used will likely be different from run to run and will slightly change the run time.

Still, I would be curious to know, again if you are willing, if you are able to build with SKYLAKEX and how that changes your SIQS times. The number I used in the benchmarks above was:
Code:
1173409788347755181387080556399719318596373877059851463520389102364922325971774873949704379

2020-11-13, 19:06   #36
EdH

Thanks Ben.

I occasionally get a Colab instance with AVX512 in the flags. Is there something in particular I should look for other than that, to try adding SKYLAKE=1, or possibly AVX512=1?

2020-11-13, 19:12   #37
bsquared

Quote:
Originally Posted by EdH
Thanks Ben.
I occasionally get a Colab instance with AVX512 in the flags. Is there something in particular I should look for other than that, to try adding SKYLAKE=1, or possibly AVX512=1?

If you see AVX512F and AVX512BW in the flags, then you can try adding SKYLAKEX=1 to the build (note the X).

I should also make a version that just uses AVX512F since the extra BW isn't on all cpus and doesn't add much extra anyway. There are so many cpu variations out there now; I need a better method of managing all this...
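
The flag check described in this post could be scripted along the following lines. This is a hedged sketch in Python, not anything shipped with YAFU: `cpu_flags` and `skylakex_ok` are hypothetical helper names, and it assumes a Linux system where `/proc/cpuinfo` has a `flags` line (the same parser also accepts the `Flags:` line printed by lscpu).

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in cpuinfo/lscpu-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "flags":
            return set(value.split())
    return set()


def skylakex_ok(flags):
    """True if both avx512f and avx512bw are advertised (case-insensitive)."""
    lowered = {flag.lower() for flag in flags}
    return {"avx512f", "avx512bw"} <= lowered


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        # Not on Linux, or the file is unreadable: treat as "no flags known".
        text = ""
    if skylakex_ok(cpu_flags(text)):
        print("avx512f and avx512bw present: SKYLAKEX=1 is worth trying")
    else:
        print("avx512f/avx512bw not both present: skip SKYLAKEX=1")
```

Checking the flags that the running system actually reports, rather than the CPU model name, matters on virtual machines, where the hypervisor may hide some instruction sets.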

2020-11-13, 20:14   #38
James Heinrich

Quote:
Originally Posted by bsquared
Add SKYLAKEX=1 to the WIP version build line to enable it for that cpu.

Trying:
make NFS=1 USE_SSE41=1 USE_AVX2=1 SKYLAKEX=1
output the usual stuff for a minute or so, and then ended with:
Quote:
cc -g -DUSE_BMI2 -DUSE_AVX2 -DUSE_AVX512F -DUSE_AVX512BW -march=skylake-avx512 -DUSE_AVX2 -DUSE_SSE41 -mavx2 -DUSE_SSE41 -m64 -msse4.1 -DUSE_NFS -O2 -fomit-frame-pointer -Wall -I. -Iinclude -Itop/aprcl -Itop/ -I../msieve/zlib -I../../gmp_install/gmp-6.2.0/include -I../gmp-ecm/include/ factor/qs/msieve/lanczos.o factor/qs/msieve/lanczos_matmul0.o factor/qs/msieve/lanczos_matmul1.o factor/qs/msieve/lanczos_matmul2.o factor/qs/msieve/lanczos_pre.o factor/qs/msieve/sqrt.o factor/qs/msieve/savefile.o factor/qs/msieve/gf2.o top/driver.o top/utils.o top/stack.o top/calc.o top/test.o top/aprcl/mpz_aprcl.o factor/factor_common.o factor/rho.o factor/squfof.o factor/trialdiv.o factor/tune.o factor/qs/filter.o factor/qs/tdiv.o factor/qs/tdiv_small.o factor/qs/tdiv_large.o factor/qs/large_sieve.o factor/qs/new_poly.o factor/qs/siqs_test.o factor/tinyqs/tinySIQS.o factor/qs/siqs_aux.o factor/qs/smallmpqs.o factor/qs/SIQS.o factor/qs/med_sieve_32k.o factor/qs/poly_roots_32k.o factor/gmp-ecm/ecm.o factor/gmp-ecm/pp1.o factor/gmp-ecm/pm1.o factor/gmp-ecm/tinyecm.o factor/gmp-ecm/microecm.o factor/nfs/nfs.o arith/arith0.o arith/arith1.o arith/arith2.o arith/arith3.o arith/monty.o top/eratosthenes/presieve.o top/eratosthenes/count.o top/eratosthenes/offsets.o top/eratosthenes/primes.o top/eratosthenes/roots.o top/eratosthenes/linesieve.o top/eratosthenes/soe.o top/eratosthenes/tiny.o top/eratosthenes/worker.o top/eratosthenes/soe_util.o top/eratosthenes/wrapper.o top/threadpool.o top/queue.o factor/prime_sieve.o factor/batch_factor.o factor/qs/cofactorize_siqs.o factor/avx-ecm/avxecm.o factor/avx-ecm/avx_ecm_main.o factor/avx-ecm/vec_common.o factor/avx-ecm/vecarith.o factor/avx-ecm/vecarith52.o factor/qs/tdiv_med_32k_avx2.o factor/qs/update_poly_roots_32k_avx2.o factor/qs/med_sieve_32k_avx2.o factor/qs/tdiv_resieve_32k_avx2.o factor/qs/update_poly_roots_32k_sse4.1.o factor/qs/med_sieve_32k_sse4.1.o factor/qs/tdiv_scan_knl.o factor/qs/update_poly_roots_32k_knl.o 
factor/qs/update_poly_roots_32k.o factor/qs/tdiv_med_32k.o factor/qs/tdiv_resieve_32k.o factor/nfs/nfs_sieving.o factor/nfs/nfs_poly.o factor/nfs/nfs_postproc.o factor/nfs/nfs_filemanip.o factor/nfs/nfs_threading.o factor/nfs/snfs.o -o yafu -L../../gmp_install/gmp-6.2.0/lib/ -L../gmp-ecm/lib/ -L../msieve -lmsieve -lecm /sppdg/scratch/buhrow/projects/gmp_install/gmp-6.2.0/lib/libgmp.a -lpthread -lm -ldl
cc: error: /sppdg/scratch/buhrow/projects/gmp_install/gmp-6.2.0/lib/libgmp.a: No such file or directory
make: *** [Makefile:398: all] Error 1
Hardcoded paths? Or did I do something wrong?

2020-11-13, 20:25   #39
bsquared

Quote:
Originally Posted by James Heinrich
Trying:
make NFS=1 USE_SSE41=1 USE_AVX2=1 SKYLAKEX=1
output the usual stuff for a minute or so, and then ended with:
Hardcoded paths? Or did I do something wrong?

Yes, hardcoded paths. You can replace the makefile line containing the path shown in the error message with
LIBS += -lecm -lgmp

That line should be inside an ifeq ($(SKYLAKEX),1) block around line 178 in the makefile.

2020-11-13, 20:48   #40
James Heinrich

I changed the path in Makefile, and it seems to have compiled without errors, but when I run yafu I just get:
Quote:
./yafu
Illegal instruction (core dumped)

2020-11-13, 20:50   #41
bsquared

Quote:
Originally Posted by James Heinrich
I changed the path in Makefile, and it seems to have compiled without errors, but when I run yafu I just get:

Can you run "lscpu" and report the flags section?

2020-11-13, 20:59   #42
James Heinrich

Quote:
Originally Posted by bsquared
Can you run "lscpu" and report the flags section?

I don't see any AVX512 listed?:
Quote:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 6
On-line CPU(s) list: 0-5
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 6
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz
Stepping: 0
CPU MHz: 2194.843
BogoMIPS: 4389.68
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 19712K
NUMA node0 CPU(s): 0-5
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid rdseed adx smap xsaveopt arat md_clear flush_l1d arch_capabilities

2020-11-13, 21:05   #43
bsquared

Yep, seems to be missing. The VM must be masking it... I've run into that with VirtualBox before, not sure about other VMs.

Sorry to run you through all of that! Looks like your original sse41 build is probably the best one.

But it wasn't all a waste - you are at least getting more experience with building finicky software on Linux.
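
The "Illegal instruction" crash is consistent with a binary built for instruction sets that the (virtual) CPU does not advertise. As a rough Python sketch, one could compare the -DUSE_* defines from a build line against the flags lscpu reports. The `DEFINE_TO_FLAG` table and the `missing_features` helper are assumptions made for illustration and are not part of YAFU; the sample strings are shortened excerpts of the build line and flags posted in this thread.

```python
# Assumed mapping from -DUSE_* defines to the flag names printed by lscpu.
# This table is illustrative only; it is not taken from the YAFU makefile.
DEFINE_TO_FLAG = {
    "USE_SSE41": "sse4_1",
    "USE_AVX2": "avx2",
    "USE_BMI2": "bmi2",
    "USE_AVX512F": "avx512f",
    "USE_AVX512BW": "avx512bw",
}


def missing_features(compile_line, lscpu_flags):
    """Return the CPU flags that the build line needs but the CPU lacks."""
    defines = {
        token[2:] for token in compile_line.split() if token.startswith("-DUSE_")
    }
    have = set(lscpu_flags.split())
    needed = {DEFINE_TO_FLAG[d] for d in defines if d in DEFINE_TO_FLAG}
    return sorted(needed - have)


if __name__ == "__main__":
    # Shortened excerpts from the posts above.
    build = "cc -g -DUSE_BMI2 -DUSE_AVX2 -DUSE_AVX512F -DUSE_AVX512BW -DUSE_SSE41 -DUSE_NFS"
    flags = "sse4_1 sse4_2 avx f16c hypervisor bmi1 avx2 bmi2"
    print("missing:", missing_features(build, flags))
    print("running under a hypervisor:", "hypervisor" in flags.split())
```

A non-empty result before running the binary would point to the same conclusion as above: fall back to a build without the missing instruction sets.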

2020-11-14, 00:59   #44
EdH

Probably not helpful, but I have had VirtualBox hide 64-bit capability until I found I had to set something in the BIOS. I think it was some sort of hardware VM setting. Could the missing AVX512 be something of that sort?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.