mersenneforum.org  

Old 2019-11-10, 04:10   #1
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

111000110010₂ Posts
AVX2 Troubles with Colab Instance

I seem to be having trouble with YAFU on a Colab instance when I compile with USE_AVX2=1. I get fairly regular failures of the following sort:
Code:
./yafu 83627958813331634770105456990581223975460530782647023599500689759334189187309703

fac: factoring 83627958813331634770105456990581223975460530782647023599500689759334189187309703
fac: using pretesting plan: normal
fac: no tune info: using qs/gnfs crossover of 93 digits
div: primes less than 10000
rho: x^2 + 3, starting 200 iterations on C80 
rho: x^2 + 2, starting 200 iterations on C80 
rho: x^2 + 1, starting 200 iterations on C80 
pm1: starting B1 = 150K, B2 = gmp-ecm default on C80
ecm: 30/30 curves on C80, B1=2K, B2=gmp-ecm default
ecm: 74/74 curves on C80, B1=11K, B2=gmp-ecm default
ecm: 188/188 curves on C80, B1=50K, B2=gmp-ecm default, ETA: 0 sec 

starting SIQS on c80: 83627958813331634770105456990581223975460530782647023599500689759334189187309703

==== sieving in progress ( 2 threads):   48096 relations needed ====
====            Press ctrl-c to abort and save state            ====
and then it returns to the prompt without reporting any factors.

The CPU is:
Code:
 Intel(R) Xeon(R) CPU @ 2.00GHz
and I'm using the trunk branch.

If I compile with USE_SSE41=1 and not USE_AVX2=1, I only see a failure very rarely. I am not including msieve or NFS at all.

Any help is appreciated.
Old 2019-11-10, 19:16   #2
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

2×23×79 Posts

Here is a pretty reproducible run with more details:

Command used:

Code:
./yafu "siqs(83627958813331634770105456990581223975460530782647023599500689759334189187309703)" -v -v -v

This returned immediately:
Code:
11/10/19 19:04:36 v1.34.5 @ c39f9954850d, System/Build Info: 
Using GMP-ECM 7.0.5-dev, Powered by GMP 6.1.2
detected            Intel(R) Xeon(R) CPU @ 2.00GHz
detected L1 = 32768 bytes, L2 = 40370176 bytes, CL = 64 bytes
measured cpu frequency ~= 42.000000
using 1 random witnesses for Rabin-Miller PRP checks

===============================================================
======= Welcome to YAFU (Yet Another Factoring Utility) =======
=======             bbuhrow@gmail.com                   =======
=======     Type help at any time, or quit to quit      =======
===============================================================
cached 78498 primes. pmax = 999983


>> 
starting SIQS on c80: 83627958813331634770105456990581223975460530782647023599500689759334189187309703
static memory usage:
    initial cycle hashtable: 16777216 bytes
    initial cycle table: 160000 bytes
    factor base: 960640 bytes
allocated 1784 bytes for roots
allocated 0 bytes for lower mod prime
allocated 458752 bytes for sieve lines
time to compute linear sieve roots = 0.00
starting root computation over 446 to 446
starting root computation over 446 to 446
time to compute bucket sieve roots = 0.00
allocated 1784 bytes for offsets for 446 sieving primes 
allocated 1784 bytes for offsets for 446 sieving primes 
finding requested range 0 to 10000000
sieving range 0 to 11010048
using 446 primes, max prime = 3162  
using 2 residue classes
lines have 229376 bytes and 1835008 flags
lines broken into = 7 blocks of size 32768
blocks contain 262144 flags and cover 1572864 primes
using 465328 bytes for sieving storage
thread 0 finding primes from byte offset 0 to 114688
thread 1 finding primes from byte offset 114688 to 229376
allocating temporary space for 443347 primes between 0 and 5505024
allocating temporary space for 405442 primes between 5505024 and 11010048
computing: 85%adding 380909 primes found in thread 0
adding 283466 primes founfb bounds
    small: 1024
    SPV: 33
    10bit: 96
    11bit: 152
    12bit: 272
    13bit: 504
    32k div 3: 664
    14bit: 944
    15bit: 1768
    med: 2528
    large: 16624
    all: 48032
start primes
    SPV: 241
    10bit: 1087
    11bit: 2027
    12bit: 4157
    13bit: 8221
    32k div 3: 11059
    14bit: 16417
    15bit: 32789
    med: 49393
    large: 392981
memory usage during sieving:
    curr_poly structure: 131152 bytes
    relation buffer: 1310720 bytes
    factor bases: 1698816 bytes
    update data: 624416 bytes
    sieve: 32768 bytes
    bucket data: 1376963 bytes
memory usage during sieving:
    curr_poly structure: 131152 bytes
    relation buffer: 1310720 bytes
    factor bases: 1698816 bytes
    update data: 624416 bytes
    sieve: 32768 bytes
    bucket data: 1376963 bytes

==== sieve params ====
n = 81 digits, 269 bits
factor base: 48032 primes (max prime = 1241407)
single large prime cutoff: 117933665 (95 * pmax)
double large prime range from 41 to 49 bits
double large prime range from 1541091339649 to 338024385079292
allocating 7 large prime slices of factor base
buckets hold 2048 elements
using AVX2 enabled 32k sieve core
sieve interval: 12 blocks of size 32768
polynomial A has ~ 10 factors
using multiplier of 7
using SPV correction of 20 bits, starting at offset 33
trial factoring cutoff at 88 bits

==== sieving in progress ( 2 threads):   48096 relations needed ====
====            Press ctrl-c to abort and save state            ====
This is factor.log:
Code:
11/10/19 19:04:36 v1.34.5 @ c39f9954850d, starting SIQS on c80: 83627958813331634770105456990581223975460530782647023599500689759334189187309703 
11/10/19 19:04:36 v1.34.5 @ c39f9954850d, random seeds: 2503899283, 1201291079 
EDIT: I tried this on a home machine:
Code:
. . .
==== sieving in progress (1 thread):   48096 relations needed ====
====           Press ctrl-c to abort and save state           ====
Segmentation fault (core dumped)
AVX2 doesn't need GCC 7, does it?

Last fiddled with by EdH on 2019-11-10 at 19:24
Old 2019-11-11, 17:21   #3
bsquared
 
"Ben"
Feb 2007

3,371 Posts

Do you get the same error if you run with the /branches/wip/ version of yafu instead of trunk with AVX2?
Old 2019-11-11, 18:13   #4
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

2×23×79 Posts

Quote:
Originally Posted by bsquared
Do you get the same error if you run with the /branches/wip/ version of yafu instead of trunk with AVX2?
I can't get it to compile, and I have to run at the moment. I'll play with it more later:
Code:
 In function `_trail_zcnt64':
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
factor/squfof.o: In function `_lead_zcnt64':
/content/yafu/include/arith.h:110: undefined reference to `_BitScanReverse64'
arith/arith3.o: In function `_trail_zcnt64':
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
top/eratosthenes/primes.o: In function `_trail_zcnt64':
/content/yafu/include/arith.h:102: undefined reference to `_BitScanForward64'
collect2: error: ld returned 1 exit status
Makefile:359: recipe for target 'all' failed
make: *** [all] Error 1
Old 2019-11-11, 22:44   #5
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

7062₈ Posts

With my limited knowledge I haven't been able to get past the above error(s).

GCC is version 7.4.0. I commented out "CC = gcc-7.3.0" in the Makefile, which was aborting the compile.
Old 2019-11-12, 01:49   #6
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

2·23·79 Posts

I tried going back a couple of revisions, but still no luck with AVX2, only SSE41.
Code:
top/eratosthenes/primes.c:354:11: note: called from here
         | _pdep_u64(x2, 0xaaaaaaaaaaaaaaaa);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:83:0,
                 from include/soe.h:27,
                 from top/eratosthenes/primes.c:15:
/usr/lib/gcc/x86_64-linux-gnu/7/include/bmi2intrin.h:69:1: error: inlining failed in call to always_inline ‘_pdep_u64’: target specific option mismatch
 _pdep_u64 (unsigned long long __X, unsigned long long __Y)
 ^~~~~~~~~
top/eratosthenes/primes.c:353:12: note: called from here
     return _pdep_u64(x1, 0x5555555555555555) 
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<builtin>: recipe for target 'top/eratosthenes/primes.o' failed
make: *** [top/eratosthenes/primes.o] Error 1
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.