#1 |
"Ed Hall"
Dec 2009
Adirondack Mtns
3×7×263 Posts |
I seem to be having troubles with YAFU on a Colab Instance if I try to use USE_AVX2=1 as an option. I'm getting quite regular failures of the following sort:
Code:
./yafu 83627958813331634770105456990581223975460530782647023599500689759334189187309703
fac: factoring 83627958813331634770105456990581223975460530782647023599500689759334189187309703
fac: using pretesting plan: normal
fac: no tune info: using qs/gnfs crossover of 93 digits
div: primes less than 10000
rho: x^2 + 3, starting 200 iterations on C80
rho: x^2 + 2, starting 200 iterations on C80
rho: x^2 + 1, starting 200 iterations on C80
pm1: starting B1 = 150K, B2 = gmp-ecm default on C80
ecm: 30/30 curves on C80, B1=2K, B2=gmp-ecm default
ecm: 74/74 curves on C80, B1=11K, B2=gmp-ecm default
ecm: 188/188 curves on C80, B1=50K, B2=gmp-ecm default, ETA: 0 sec
starting SIQS on c80: 83627958813331634770105456990581223975460530782647023599500689759334189187309703
==== sieving in progress ( 2 threads): 48096 relations needed ====
==== Press ctrl-c to abort and save state ====
The CPU is:
Code:
Intel(R) Xeon(R) CPU @ 2.00GHz
If I compile with USE_SSE41=1 and not USE_AVX2=1, I only see a failure very rarely. I am not including msieve or NFS at all. Any help appreciated.
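Since the failures track the USE_AVX2=1 build option, one quick diagnostic is to check which instruction-set flags the Colab instance's kernel actually reports. The sketch below is only an illustration, not part of YAFU: the helper names `parse_cpu_flags` and `report` are made up here, and it assumes a Linux host that exposes `/proc/cpuinfo`. A reported `avx2` flag does not by itself explain or rule out the crash.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(flags, wanted=("sse4_1", "avx2", "bmi2")):
    """Map each wanted flag name (Linux spelling) to whether it is present."""
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        # Not a Linux host, or /proc is unavailable: nothing to report.
        text = ""
    for name, present in report(parse_cpu_flags(text)).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

Running this in a Colab cell (or saving it to a file and running it with `python3`) prints one line per flag for the current instance.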
#2 |
"Ed Hall"
Dec 2009
Adirondack Mtns
1593₁₆ Posts |
Here is a pretty reproducible run with more details:
Command used:
Code:
./yafu "siqs(83627958813331634770105456990581223975460530782647023599500689759334189187309703)" -v -v -v
This returned immediately:
Code:
11/10/19 19:04:36 v1.34.5 @ c39f9954850d, System/Build Info:
Using GMP-ECM 7.0.5-dev, Powered by GMP 6.1.2
detected Intel(R) Xeon(R) CPU @ 2.00GHz
detected L1 = 32768 bytes, L2 = 40370176 bytes, CL = 64 bytes
measured cpu frequency ~= 42.000000
using 1 random witnesses for Rabin-Miller PRP checks
===============================================================
======= Welcome to YAFU (Yet Another Factoring Utility) =======
======= bbuhrow@gmail.com =======
======= Type help at any time, or quit to quit =======
===============================================================
cached 78498 primes. pmax = 999983
>> starting SIQS on c80: 83627958813331634770105456990581223975460530782647023599500689759334189187309703
static memory usage:
initial cycle hashtable: 16777216 bytes
initial cycle table: 160000 bytes
factor base: 960640 bytes
allocated 1784 bytes for roots
allocated 0 bytes for lower mod prime
allocated 458752 bytes for sieve lines
time to compute linear sieve roots = 0.00
starting root computation over 446 to 446
starting root computation over 446 to 446
time to compute bucket sieve roots = 0.00
allocated 1784 bytes for offsets for 446 sieving primes
allocated 1784 bytes for offsets for 446 sieving primes
finding requested range 0 to 10000000
sieving range 0 to 11010048
using 446 primes, max prime = 3162
using 2 residue classes
lines have 229376 bytes and 1835008 flags
lines broken into = 7 blocks of size 32768
blocks contain 262144 flags and cover 1572864 primes
using 465328 bytes for sieving storage
thread 0 finding primes from byte offset 0 to 114688
thread 1 finding primes from byte offset 114688 to 229376
allocating temporary space for 443347 primes between 0 and 5505024
allocating temporary space for 405442 primes between 5505024 and 11010048
computing: 85%adding 380909 primes found in thread 0
adding 283466 primes founfb bounds
small: 1024
SPV: 33
10bit: 96
11bit: 152
12bit: 272
13bit: 504
32k div 3: 664
14bit: 944
15bit: 1768
med: 2528
large: 16624
all: 48032
start primes
SPV: 241
10bit: 1087
11bit: 2027
12bit: 4157
13bit: 8221
32k div 3: 11059
14bit: 16417
15bit: 32789
med: 49393
large: 392981
memory usage during sieving:
curr_poly structure: 131152 bytes
relation buffer: 1310720 bytes
factor bases: 1698816 bytes
update data: 624416 bytes
sieve: 32768 bytes
bucket data: 1376963 bytes
memory usage during sieving:
curr_poly structure: 131152 bytes
relation buffer: 1310720 bytes
factor bases: 1698816 bytes
update data: 624416 bytes
sieve: 32768 bytes
bucket data: 1376963 bytes
==== sieve params ====
n = 81 digits, 269 bits
factor base: 48032 primes (max prime = 1241407)
single large prime cutoff: 117933665 (95 * pmax)
double large prime range from 41 to 49 bits
double large prime range from 1541091339649 to 338024385079292
allocating 7 large prime slices of factor base
buckets hold 2048 elements
using AVX2 enabled 32k sieve core
sieve interval: 12 blocks of size 32768
polynomial A has ~ 10 factors
using multiplier of 7
using SPV correction of 20 bits, starting at offset 33
trial factoring cutoff at 88 bits
==== sieving in progress ( 2 threads): 48096 relations needed ====
==== Press ctrl-c to abort and save state ====
Code:
11/10/19 19:04:36 v1.34.5 @ c39f9954850d, starting SIQS on c80: 83627958813331634770105456990581223975460530782647023599500689759334189187309703
11/10/19 19:04:36 v1.34.5 @ c39f9954850d, random seeds: 2503899283, 1201291079
Code:
. . .
==== sieving in progress (1 thread): 48096 relations needed ====
==== Press ctrl-c to abort and save state ====
Segmentation fault (core dumped)
Last fiddled with by EdH on 2019-11-10 at 19:24
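To put a number on "pretty reproducible", the same command can be run several times while counting how often the process dies with a segmentation fault. This is a rough sketch rather than anything YAFU provides: `count_crashes` is a hypothetical helper, and it relies on the POSIX convention that `subprocess` reports a process killed by signal N with a return code of -N.

```python
import signal
import subprocess


def count_crashes(cmd, runs=5, timeout=None):
    """Run cmd `runs` times and return (segfaults, completed_runs).

    Runs that exceed `timeout` seconds are skipped and not counted as completed.
    """
    crashes = 0
    completed = 0
    for _ in range(runs):
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue
        completed += 1
        # On POSIX, a negative return code means "terminated by that signal".
        if proc.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes, completed


# Example call, using the command from this post (assumes ./yafu exists
# in the current directory):
# n = "83627958813331634770105456990581223975460530782647023599500689759334189187309703"
# print(count_crashes(["./yafu", f"siqs({n})", "-v", "-v", "-v"], runs=10))
```

Comparing the counts for a USE_AVX2=1 build against a USE_SSE41=1 build would give a like-for-like failure rate for the two configurations.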
#3 |
"Ben"
Feb 2007
2²·941 Posts |
Do you get the same error if you run with the /branches/wip/ version of yafu instead of trunk with AVX2?
#4 |
"Ed Hall"
Dec 2009
Adirondack Mtns
3×7×263 Posts |
Quote:
Code:
In function `_trail_zcnt64':
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
factor/squfof.o: In function `_lead_zcnt64':
/content/yafu/include/arith.h:110: undefined reference to `_BitScanReverse64'
arith/arith3.o: In function `_trail_zcnt64':
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
top/eratosthenes/primes.o: In function `_trail_zcnt64':
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
collect2: error: ld returned 1 exit status
Makefile:359: recipe for target 'all' failed
make: *** [all] Error 1
#5 |
"Ed Hall"
Dec 2009
Adirondack Mtns
3×7×263 Posts |
With my limited knowledge, I haven't been able to get past the above error(s).
GCC is version 7.4.0. I commented out the "CC = gcc-7.3.0" line in the Makefile, because it was aborting the compile.
#6 |
"Ed Hall"
Dec 2009
Adirondack Mtns
3×7×263 Posts |
I tried going back a couple of revisions, but still had no luck building with AVX2; only the SSE41 build works.
Code:
top/eratosthenes/primes.c:354:11: note: called from here
| _pdep_u64(x2, 0xaaaaaaaaaaaaaaaa);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:83:0,
from include/soe.h:27,
from top/eratosthenes/primes.c:15:
/usr/lib/gcc/x86_64-linux-gnu/7/include/bmi2intrin.h:69:1: error: inlining failed in call to always_inline ‘_pdep_u64’: target specific option mismatch
_pdep_u64 (unsigned long long __X, unsigned long long __Y)
^~~~~~~~~
top/eratosthenes/primes.c:353:12: note: called from here
return _pdep_u64(x1, 0x5555555555555555)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<builtin>: recipe for target 'top/eratosthenes/primes.o' failed
make: *** [top/eratosthenes/primes.o] Error 1