mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)

 ryanp 2022-08-24 17:19

[QUOTE=rogue;609937]Not certain why 1.6.3 would behave differently. Can you reduce -g and try again?[/QUOTE]

I don't really understand how to set these parameters. If I explicitly set [C]-G 1[/C] (according to "-h", the default of [C]-G[/C] is 0):

[CODE]$ ./srsieve2cl -g 64 -G 1 -P 1e14 -o "ferm81_3M_15M.txt" -s "81*2^n+1" -n 3e6 -N 15e6
srsieve2cl v1.6.3, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
(b2) Removed 3000000 algebraic factors for 81*2^n+1 of the form (3^2)*2^(n/2)-3*2^((n+2)/4))+1 when n%4=2
Sieving with generic logic for p >= 3
GPU primes per worker is 1769472
Sieve started: 3 < p < 1e14 with 9000001 terms (3000000 < n < 15000000, k*2^n+1) (expecting 8693280 factors)
Sieving with single sequence c=1 logic for p >= 257
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 2 sequence into 384 base 2^720 sequences.
Legendre summary: Approximately 2 B needed for Legendre tables
1 total sequences
1 are eligible for Legendre tables
0 are not eligible for Legendre tables
1 have Legendre tables in memory
0 cannot have Legendre tables in memory
0 have Legendre tables loaded from files
1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
295200 bytes used for congruent qs and ladders

OpenCL Error: Out of resources
argument: factorCount[/CODE]

If I use [C]-W 1[/C] instead, it runs, but it's extremely slow compared to the CPU srsieve2 (even with a high [C]-g[/C]), and I see 0% GPU utilization according to [C]nvidia-smi[/C]:

[CODE]$ ./srsieve2cl -g 5120 -W 1 -P 1e14 -o "ferm81_3M_15M.txt" -s "81*2^n+1" -n 3e6 -N 15e6
srsieve2cl v1.6.3, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
(b2) Removed 3000000 algebraic factors for 81*2^n+1 of the form (3^2)*2^(n/2)-3*2^((n+2)/4))+1 when n%4=2
Sieving with generic logic for p >= 3
Sieve started: 3 < p < 1e14 with 9000001 terms (3000000 < n < 15000000, k*2^n+1) (expecting 8693280 factors)
Sieving with single sequence c=1 logic for p >= 257
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 2 sequence into 384 base 2^720 sequences.
Legendre summary: Approximately 2 B needed for Legendre tables
1 total sequences
1 are eligible for Legendre tables
0 are not eligible for Legendre tables
1 have Legendre tables in memory
0 cannot have Legendre tables in memory
0 have Legendre tables loaded from files
1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
295200 bytes used for congruent qs and ladders
Decreasing worksize to 4000 since each chunk needs more than 5 seconds to test
Increasing worksize to 16000 since each chunk is tested in less than a second
Increasing worksize to 64000 since each chunk is tested in less than a second
p=1020457127, 856.8K p/sec, 7729676 factors found at 5.619K f/sec (last 1 min)
p=2253211963, 905.7K p/sec, 7776218 factors found at 2.643K f/sec (last 2 min), 0.0% done. ETC 2022-10-26 09:18[/CODE]

And if I pass neither [C]-W[/C] nor [C]-G[/C], it segfaults.

[CODE]$ ./srsieve2cl -g 5120 -P 1e14 -o "ferm81_3M_15M.txt" -s "81*2^n+1" -n 3e6 -N 15e6
srsieve2cl v1.6.3, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
(b2) Removed 3000000 algebraic factors for 81*2^n+1 of the form (3^2)*2^(n/2)-3*2^((n+2)/4))+1 when n%4=2
Sieving with generic logic for p >= 3
Creating CPU worker to use until p >= 1000000
GPU primes per worker is 141557760
Sieve started: 3 < p < 1e14 with 9000001 terms (3000000 < n < 15000000, k*2^n+1) (expecting 8693280 factors)
Sieving with single sequence c=1 logic for p >= 257
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 2 sequence into 384 base 2^720 sequences.
Legendre summary: Approximately 2 B needed for Legendre tables
1 total sequences
1 are eligible for Legendre tables
0 are not eligible for Legendre tables
1 have Legendre tables in memory
0 cannot have Legendre tables in memory
0 have Legendre tables loaded from files
1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
295200 bytes used for congruent qs and ladders
Creating CPU worker to use until p >= 1000000
Segmentation fault (core dumped)[/CODE]

 pepi37 2022-08-24 19:59

Rogue, I am not asking you, I am begging you. I have read this thread many times, and maybe I did not understand it.

It would be very nice if you would [U][I][B]write a detailed tutorial[/B][/I][/U] for the mtsieve package. In my opinion it makes little sense to continue developing the program if the few of us who use the mtsieve package cannot use it as intended. I have never gained any speed from the cl versions of the sieves, so I always stay on the CPU. Again, I very much respect your work on the mtsieve package, but without a detailed tutorial it is very hard to get good results. The number of switch combinations for the sieve programs can be huge, and I can try for many hours without finding the combination I need to "enable" the GPU at full speed.
Thanks, and keep doing this great work. This message is not a criticism in any sense; I am just asking that you write some kind of tutorial explaining how the switches relate to each other.
Best regards

 Dylan14 2022-08-24 23:44

In the latest revision of the code (r201), gcc fails when working on the Sophie Germain sieve with the following:

[CODE]g++ -Isieve -m64 -Wall -DUSE_X86 -std=c++11 -O3 -c -o sophie_germain/SophieGermainWorker.o sophie_germain/SophieGermainWorker.cpp
In file included from sophie_germain/SophieGermainWorker.cpp:12:
sophie_germain/../core/MpArithVector.h:20:11: error: ‘size_t’ has not been declared
20 | template <size_t N>
| ^~~~~~
sophie_germain/../core/MpArithVector.h:24:21: error: ‘N’ was not declared in this scope
24 | uint64_t _r[N];
| ^
sophie_germain/../core/MpArithVector.h:27:36: error: ‘size_t’ does not name a type
27 | uint64_t operator [](const size_t i) const { return _r[i]; }
| ^~~~~~
sophie_germain/../core/MpArithVector.h:1:1: note: ‘size_t’ is defined in header ‘<cstddef>’; did you forget to ‘#include <cstddef>’?
+++ |+#include <cstddef>
1 | /* MpArithVector.h -- (C) Mark Rodenkirch, December 2020
sophie_germain/../core/MpArithVector.h:28:38: error: ‘size_t’ does not name a type
28 | uint64_t & operator [](const size_t i) { return _r[i]; }
| ^~~~~~
sophie_germain/../core/MpArithVector.h:28:38: note: ‘size_t’ is defined in header ‘<cstddef>’; did you forget to ‘#include <cstddef>’?
sophie_germain/../core/MpArithVector.h:32:11: error: ‘size_t’ has not been declared
32 | template <size_t N>
| ^~~~~~
sophie_germain/../core/MpArithVector.h:36:21: error: ‘N’ was not declared in this scope
36 | uint64_t _p[N], _q[N];
| ^
sophie_germain/../core/MpArithVector.h:36:28: error: ‘N’ was not declared in this scope
36 | uint64_t _p[N], _q[N];
| ^
sophie_germain/../core/MpArithVector.h:37:21: error: ‘N’ was not declared in this scope
37 | MpResVector<N> _one; // 2^64 mod p
| ^
sophie_germain/../core/MpArithVector.h:37:22: error: template argument 1 is invalid
37 | MpResVector<N> _one; // 2^64 mod p
| ^
sophie_germain/../core/MpArithVector.h:38:21: error: ‘N’ was not declared in this scope
38 | MpResVector<N> _r2; // (2^64)^2 mod p
| ^
sophie_germain/../core/MpArithVector.h:38:22: error: template argument 1 is invalid
38 | MpResVector<N> _r2; // (2^64)^2 mod p
| ^
sophie_germain/../core/MpArithVector.h:77:28: error: ‘N’ was not declared in this scope
77 | static MpResVector<N> zero()
| ^
sophie_germain/../core/MpArithVector.h:77:29: error: template argument 1 is invalid
77 | static MpResVector<N> zero()
| ^
sophie_germain/../core/MpArithVector.h:84:21: error: ‘N’ was not declared in this scope
84 | MpResVector<N> one() const { return _one; } // Montgomery form of 1
| ^
sophie_germain/../core/MpArithVector.h:84:22: error: template argument 1 is invalid
84 | MpResVector<N> one() const { return _one; } // Montgomery form of 1
| ^
sophie_germain/../core/MpArithVector.h:86:20: error: ‘size_t’ has not been declared
86 | uint64_t p(size_t k) const { return _p[k]; }
| ^~~~~~
sophie_germain/../core/MpArithVector.h:88:61: error: ‘N’ was not declared in this scope
88 | static bool at_least_one_is_equal(const MpResVector<N> & a, const MpResVector<N> & b)
| ^
sophie_germain/../core/MpArithVector.h:88:62: error: template argument 1 is invalid
88 | static bool at_least_one_is_equal(const MpResVector<N> & a, const MpResVector<N> & b)
| ^
sophie_germain/../core/MpArithVector.h:88:87: error: ‘N’ was not declared in this scope
88 | l at_least_one_is_equal(const MpResVector<N> & a, const MpResVector<N> & b)
| ^

sophie_germain/../core/MpArithVector.h:88:88: error: template argument 1 is invalid
88 | l at_least_one_is_equal(const MpResVector<N> & a, const MpResVector<N> & b)
| ^

sophie_germain/../core/MpArithVector.h:95:21: error: ‘N’ was not declared in this scope
95 | MpResVector<N> add(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:95:22: error: template argument 1 is invalid
95 | MpResVector<N> add(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:95:46: error: ‘N’ was not declared in this scope
95 | MpResVector<N> add(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:95:47: error: template argument 1 is invalid
95 | MpResVector<N> add(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:95:72: error: ‘N’ was not declared in this scope
95 | MpResVector<N> add(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^

sophie_germain/../core/MpArithVector.h:95:73: error: template argument 1 is invalid
95 | MpResVector<N> add(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^

sophie_germain/../core/MpArithVector.h:106:21: error: ‘N’ was not declared in this scope
106 | MpResVector<N> sub(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:106:22: error: template argument 1 is invalid
106 | MpResVector<N> sub(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:106:46: error: ‘N’ was not declared in this scope
106 | MpResVector<N> sub(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:106:47: error: template argument 1 is invalid
106 | MpResVector<N> sub(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:106:72: error: ‘N’ was not declared in this scope
106 | MpResVector<N> sub(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^

sophie_germain/../core/MpArithVector.h:106:73: error: template argument 1 is invalid
106 | MpResVector<N> sub(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^

sophie_germain/../core/MpArithVector.h:117:58: error: ‘N’ was not declared in this scope
117 | uint64_t mul(const uint64_t a, const MpResVector<N> & b, size_t k) const
| ^
sophie_germain/../core/MpArithVector.h:117:59: error: template argument 1 is invalid
117 | uint64_t mul(const uint64_t a, const MpResVector<N> & b, size_t k) const
| ^
sophie_germain/../core/MpArithVector.h:117:66: error: ‘size_t’ has not been declared
117 | uint64_t mul(const uint64_t a, const MpResVector<N> & b, size_t k) const
| ^~~~~~

sophie_germain/../core/MpArithVector.h:122:35: error: ‘N’ was not declared in this scope
122 | uint64_t mul(const MpResVector<N> & a, const MpResVector<N> & b, size_t k) const
| ^
sophie_germain/../core/MpArithVector.h:122:36: error: template argument 1 is invalid
122 | uint64_t mul(const MpResVector<N> & a, const MpResVector<N> & b, size_t k) const
| ^
sophie_germain/../core/MpArithVector.h:122:61: error: ‘N’ was not declared in this scope
122 | uint64_t mul(const MpResVector<N> & a, const MpResVector<N> & b, size_t k) const
| ^
sophie_germain/../core/MpArithVector.h:122:62: error: template argument 1 is invalid
122 | uint64_t mul(const MpResVector<N> & a, const MpResVector<N> & b, size_t k) const
| ^
sophie_germain/../core/MpArithVector.h:122:69: error: ‘size_t’ has not been declared
122 | int64_t mul(const MpResVector<N> & a, const MpResVector<N> & b, size_t k) const
| ^~~~~~

sophie_germain/../core/MpArithVector.h:127:21: error: ‘N’ was not declared in this scope
127 | MpResVector<N> mul(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:127:22: error: template argument 1 is invalid
127 | MpResVector<N> mul(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:127:46: error: ‘N’ was not declared in this scope
127 | MpResVector<N> mul(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:127:47: error: template argument 1 is invalid
127 | MpResVector<N> mul(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^
sophie_germain/../core/MpArithVector.h:127:72: error: ‘N’ was not declared in this scope
127 | MpResVector<N> mul(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^

sophie_germain/../core/MpArithVector.h:127:73: error: template argument 1 is invalid
127 | MpResVector<N> mul(const MpResVector<N> & a, const MpResVector<N> & b) const
| ^

sophie_germain/../core/MpArithVector.h:137:21: error: ‘N’ was not declared in this scope
137 | MpResVector<N> pow(const MpResVector<N> & a, size_t exp) const
| ^
sophie_germain/../core/MpArithVector.h:137:22: error: template argument 1 is invalid
137 | MpResVector<N> pow(const MpResVector<N> & a, size_t exp) const
| ^
sophie_germain/../core/MpArithVector.h:137:46: error: ‘N’ was not declared in this scope
137 | MpResVector<N> pow(const MpResVector<N> & a, size_t exp) const
| ^
sophie_germain/../core/MpArithVector.h:137:47: error: template argument 1 is invalid
137 | MpResVector<N> pow(const MpResVector<N> & a, size_t exp) const
| ^
sophie_germain/../core/MpArithVector.h:137:54: error: ‘size_t’ has not been declared
137 | MpResVector<N> pow(const MpResVector<N> & a, size_t exp) const
| ^~~~~~
sophie_germain/../core/MpArithVector.h:159:21: error: ‘N’ was not declared in this scope
159 | MpResVector<N> nToRes(const uint64_t *n) const
| ^
sophie_germain/../core/MpArithVector.h:159:22: error: template argument 1 is invalid
159 | MpResVector<N> nToRes(const uint64_t *n) const
| ^
sophie_germain/../core/MpArithVector.h:171:21: error: ‘N’ was not declared in this scope
171 | MpResVector<N> nToRes(uint64_t n) const
| ^
sophie_germain/../core/MpArithVector.h:171:22: error: template argument 1 is invalid
171 | MpResVector<N> nToRes(uint64_t n) const
| ^
sophie_germain/../core/MpArithVector.h:183:21: error: ‘N’ was not declared in this scope
183 | MpResVector<N> nToRes(uint32_t *n) const
| ^
sophie_germain/../core/MpArithVector.h:183:22: error: template argument 1 is invalid
183 | MpResVector<N> nToRes(uint32_t *n) const
| ^
sophie_germain/../core/MpArithVector.h:195:21: error: ‘N’ was not declared in this scope
195 | MpResVector<N> resToN(const MpResVector<N> & a) const
| ^
sophie_germain/../core/MpArithVector.h:195:22: error: template argument 1 is invalid
195 | MpResVector<N> resToN(const MpResVector<N> & a) const
| ^
sophie_germain/../core/MpArithVector.h:195:49: error: ‘N’ was not declared in this scope
195 | MpResVector<N> resToN(const MpResVector<N> & a) const
| ^
sophie_germain/../core/MpArithVector.h:195:50: error: template argument 1 is invalid
195 | MpResVector<N> resToN(const MpResVector<N> & a) const
| ^
sophie_germain/SophieGermainWorker.cpp: In member function ‘void SophieGermainWorker::TestMegaPrimeChunkSmall()’:
sophie_germain/SophieGermainWorker.cpp:73:21: error: invalid conversion from ‘uint64_t*’ {aka ‘long unsigned int*’} to ‘MpArithVec’ {aka ‘int’} [-fpermissive]
73 | MpArithVec mp(ps);
| ^~
| |
| uint64_t* {aka long unsigned int*}
sophie_germain/SophieGermainWorker.cpp:75:29: error: request for member ‘nToRes’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
75 | MpResVec resInvs = mp.nToRes(invs);
| ^~~~~~
sophie_germain/SophieGermainWorker.cpp:76:25: error: request for member ‘pow’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
76 | MpResVec res = mp.pow(resInvs, ii_N);
| ^~~
sophie_germain/SophieGermainWorker.cpp:77:27: error: request for member ‘resToN’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
77 | MpResVec resKs = mp.resToN(res);
| ^~~~~~
sophie_germain/SophieGermainWorker.cpp:79:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
79 | if (resKs[0] <= il_MaxK) RemoveTermsSmallPrime(resKs[0], true, ps[0]);
| ^
sophie_germain/SophieGermainWorker.cpp:79:59: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
79 | if (resKs[0] <= il_MaxK) RemoveTermsSmallPrime(resKs[0], true, ps[0]);
| ^
sophie_germain/SophieGermainWorker.cpp:80:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
80 | if (resKs[1] <= il_MaxK) RemoveTermsSmallPrime(resKs[1], true, ps[1]);
| ^
sophie_germain/SophieGermainWorker.cpp:80:59: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
80 | if (resKs[1] <= il_MaxK) RemoveTermsSmallPrime(resKs[1], true, ps[1]);
| ^
sophie_germain/SophieGermainWorker.cpp:81:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
81 | if (resKs[2] <= il_MaxK) RemoveTermsSmallPrime(resKs[2], true, ps[2]);
| ^
sophie_germain/SophieGermainWorker.cpp:81:59: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
81 | if (resKs[2] <= il_MaxK) RemoveTermsSmallPrime(resKs[2], true, ps[2]);
| ^
sophie_germain/SophieGermainWorker.cpp:82:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
82 | if (resKs[3] <= il_MaxK) RemoveTermsSmallPrime(resKs[3], true, ps[3]);
| ^
sophie_germain/SophieGermainWorker.cpp:82:59: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
82 | if (resKs[3] <= il_MaxK) RemoveTermsSmallPrime(resKs[3], true, ps[3]);
| ^
sophie_germain/SophieGermainWorker.cpp:86:19: error: request for member ‘mul’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
86 | res = mp.mul(res, resInvs);
| ^~~
sophie_germain/SophieGermainWorker.cpp:93:22: error: request for member ‘mul’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
93 | res = mp.mul(res, resInvs);
| ^~~
sophie_germain/SophieGermainWorker.cpp:103:22: error: request for member ‘mul’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
103 | res = mp.mul(res, mp.nToRes(invs));
| ^~~
sophie_germain/SophieGermainWorker.cpp:103:34: error: request for member ‘nToRes’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
103 | res = mp.mul(res, mp.nToRes(invs));
| ^~~~~~
sophie_germain/SophieGermainWorker.cpp:107:18: error: request for member ‘resToN’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
107 | resKs = mp.resToN(res);
| ^~~~~~
sophie_germain/SophieGermainWorker.cpp:109:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
109 | if (resKs[0] <= il_MaxK) RemoveTermsSmallPrime(resKs[0], false, ps[0]);
| ^
sophie_germain/SophieGermainWorker.cpp:109:59: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
109 | if (resKs[0] <= il_MaxK) RemoveTermsSmallPrime(resKs[0], false, ps[0]);
| ^
sophie_germain/SophieGermainWorker.cpp:110:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
110 | if (resKs[1] <= il_MaxK) RemoveTermsSmallPrime(resKs[1], false, ps[1]);
| ^
sophie_germain/SophieGermainWorker.cpp:110:59: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
110 | if (resKs[1] <= il_MaxK) RemoveTermsSmallPrime(resKs[1], false, ps[1]);
| ^
sophie_germain/SophieGermainWorker.cpp:111:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
111 | if (resKs[2] <= il_MaxK) RemoveTermsSmallPrime(resKs[2], false, ps[2]);
| ^
sophie_germain/SophieGermainWorker.cpp:111:59: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
111 | if (resKs[2] <= il_MaxK) RemoveTermsSmallPrime(resKs[2], false, ps[2]);
| ^
sophie_germain/SophieGermainWorker.cpp:112:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
112 | if (resKs[3] <= il_MaxK) RemoveTermsSmallPrime(resKs[3], false, ps[3]);
| ^
sophie_germain/SophieGermainWorker.cpp:112:59: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
112 | if (resKs[3] <= il_MaxK) RemoveTermsSmallPrime(resKs[3], false, ps[3]);
| ^
sophie_germain/SophieGermainWorker.cpp: In member function ‘void SophieGermainWorker::TestMegaPrimeChunkLarge()’:
sophie_germain/SophieGermainWorker.cpp:155:21: error: invalid conversion from ‘uint64_t*’ {aka ‘long unsigned int*’} to ‘MpArithVec’ {aka ‘int’} [-fpermissive]
155 | MpArithVec mp(ps);
| ^~
| |
| uint64_t* {aka long unsigned int*}
sophie_germain/SophieGermainWorker.cpp:157:29: error: request for member ‘nToRes’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
157 | MpResVec resInvs = mp.nToRes(invs);
| ^~~~~~
sophie_germain/SophieGermainWorker.cpp:158:25: error: request for member ‘pow’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
158 | MpResVec res = mp.pow(resInvs, ii_N);
| ^~~
sophie_germain/SophieGermainWorker.cpp:159:27: error: request for member ‘resToN’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
159 | MpResVec resKs = mp.resToN(res);
| ^~~~~~
sophie_germain/SophieGermainWorker.cpp:161:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
161 | if (resKs[0] >= il_MinK && resKs[0] <= il_MaxK) RemoveTermsLargePrime(resKs[0], true, ps[0]);
| ^
sophie_germain/SophieGermainWorker.cpp:161:39: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
161 | if (resKs[0] >= il_MinK && resKs[0] <= il_MaxK) RemoveTermsLargePrime(resKs[0], true, ps[0]);
| ^
sophie_germain/SophieGermainWorker.cpp:161:82: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
161 | ] >= il_MinK && resKs[0] <= il_MaxK) RemoveTermsLargePrime(resKs[0], true, ps[0]);
| ^

sophie_germain/SophieGermainWorker.cpp:162:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
162 | if (resKs[1] >= il_MinK && resKs[1] <= il_MaxK) RemoveTermsLargePrime(resKs[1], true, ps[1]);
| ^
sophie_germain/SophieGermainWorker.cpp:162:39: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
162 | if (resKs[1] >= il_MinK && resKs[1] <= il_MaxK) RemoveTermsLargePrime(resKs[1], true, ps[1]);
| ^
sophie_germain/SophieGermainWorker.cpp:162:82: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
162 | ] >= il_MinK && resKs[1] <= il_MaxK) RemoveTermsLargePrime(resKs[1], true, ps[1]);
| ^

sophie_germain/SophieGermainWorker.cpp:163:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
163 | if (resKs[2] >= il_MinK && resKs[2] <= il_MaxK) RemoveTermsLargePrime(resKs[2], true, ps[2]);
| ^
sophie_germain/SophieGermainWorker.cpp:163:39: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
163 | if (resKs[2] >= il_MinK && resKs[2] <= il_MaxK) RemoveTermsLargePrime(resKs[2], true, ps[2]);
| ^
sophie_germain/SophieGermainWorker.cpp:163:82: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
163 | ] >= il_MinK && resKs[2] <= il_MaxK) RemoveTermsLargePrime(resKs[2], true, ps[2]);
| ^

sophie_germain/SophieGermainWorker.cpp:164:16: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
164 | if (resKs[3] >= il_MinK && resKs[3] <= il_MaxK) RemoveTermsLargePrime(resKs[3], true, ps[3]);
| ^
sophie_germain/SophieGermainWorker.cpp:164:39: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
164 | if (resKs[3] >= il_MinK && resKs[3] <= il_MaxK) RemoveTermsLargePrime(resKs[3], true, ps[3]);
| ^
sophie_germain/SophieGermainWorker.cpp:164:82: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
164 | ] >= il_MinK && resKs[3] <= il_MaxK) RemoveTermsLargePrime(resKs[3], true, ps[3]);
| ^

sophie_germain/SophieGermainWorker.cpp:168:19: error: request for member ‘mul’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
168 | res = mp.mul(res, resInvs);
| ^~~
sophie_germain/SophieGermainWorker.cpp:176:19: error: request for member ‘mul’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
176 | res = mp.mul(res, mp.nToRes(invs));
| ^~~
sophie_germain/SophieGermainWorker.cpp:176:31: error: request for member ‘nToRes’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
176 | res = mp.mul(res, mp.nToRes(invs));
| ^~~~~~
sophie_germain/SophieGermainWorker.cpp:179:18: error: request for member ‘resToN’ in ‘mp’, which is of non-class type ‘MpArithVec’ {aka ‘int’}
179 | resKs = mp.resToN(res);
| ^~~~~~
sophie_germain/SophieGermainWorker.cpp:181:34: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
181 | RemoveTermsLargePrime(resKs[0], false, ps[0]);
| ^
sophie_germain/SophieGermainWorker.cpp:182:34: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
182 | RemoveTermsLargePrime(resKs[1], false, ps[1]);
| ^
sophie_germain/SophieGermainWorker.cpp:183:34: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
183 | RemoveTermsLargePrime(resKs[2], false, ps[2]);
| ^
sophie_germain/SophieGermainWorker.cpp:184:34: error: invalid types ‘MpResVec {aka int}[int]’ for array subscript
184 | RemoveTermsLargePrime(resKs[3], false, ps[3]);
| ^
make: *** [makefile:217: sophie_germain/SophieGermainWorker.o] Error 1[/CODE]

I believe adding #include <cstddef> to core/MpArithVector.h should solve most, if not all, of the errors here, unless size_t is defined elsewhere.

 rogue 2022-08-25 12:25

ryanp,

When you specify -W1, it will not use a GPU worker. -W is for CPU workers and, when used with a GPU-enabled program, will prevent it from using the GPU. My recommendation, when you are using a GPU-enabled program, is to not specify -G at all until you can successfully run without it.

I will look into the segfault with -g5120. I have run into a bug with srsieve2cl. I need to look into it.

I suggest running without -g to see if it can run successfully.

Dylan14, on which platform is that occurring? As you said it might be as simple as adding a missing #include.

pepi37, I understand your desire for a tutorial. I do have a [URL="https://www.mersenneforum.org/rogue/mtsieve.html"]webpage[/URL], but it is horribly out of date. It is on my to-do list to update that page.

For most of the sieves I would think that the basic parameters are fairly obvious. In most cases you don't even need to specify values as they will default. My recommendation is to use the defaults until you are comfortable enough to use the other parameters to see if you can get better performance.

 Dylan14 2022-08-25 13:07

The error with compilation is occurring on Linux (more specifically, Arch Linux). And indeed adding the #include <cstddef> in core/MpArithVector.h fixed the issue.
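
For anyone hitting the same build failure, here is a minimal sketch of the fix. It is not the actual contents of core/MpArithVector.h; the class name ResVectorSketch and the helper sketch_indexes_correctly are made up for illustration. The point is only that a header using size_t as a template parameter should include <cstddef> itself rather than rely on another header pulling it in, which presumably stopped happening with the newer GCC shipped on Arch.

```cpp
#include <cassert>   // assert, used only by the small self-check below
#include <cstddef>   // size_t: the include that was missing
#include <cstdint>   // uint64_t

// Hypothetical stand-in for a fixed-width residue vector.
// With <cstddef> included, "template <size_t N>" and the size_t
// index parameters compile under -std=c++11.
template <size_t N>
class ResVectorSketch {
private:
   uint64_t _r[N];

public:
   uint64_t   operator [](const size_t i) const { return _r[i]; }
   uint64_t & operator [](const size_t i)       { return _r[i]; }

   static constexpr size_t size() { return N; }
};

// Writes every element, then reads one back (illustrative helper).
inline bool sketch_indexes_correctly() {
   ResVectorSketch<4> v;
   for (size_t i = 0; i < v.size(); i++)
      v[i] = 10 * i;
   return v[3] == 30;
}
```

Removing the <cstddef> line from this sketch may or may not reproduce the error, depending on what <cstdint> happens to include on a given compiler version, which is why the explicit include is the safer choice.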

 ryanp 2022-08-25 13:39

[QUOTE=rogue;612060]I will look into the segfault with -g5120. I have run into a bug with srsieve2cl. I need to look into it.

I suggest running without -g to see if it can run successfully.[/QUOTE]

Without any of [C]-G[/C], [C]-g[/C] or [C]-W[/C]:

[CODE]$ ./srsieve2cl -P 1e14 -o "ferm81_3M_15M.txt" -s "81*2^n+1" -n 3e6 -N 15e6
srsieve2cl v1.6.3, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
(b2) Removed 3000000 algebraic factors for 81*2^n+1 of the form (3^2)*2^(n/2)-3*2^((n+2)/4))+1 when n%4=2
Sieving with generic logic for p >= 3
Creating CPU worker to use until p >= 1000000
GPU primes per worker is 221184
Sieve started: 3 < p < 1e14 with 9000001 terms (3000000 < n < 15000000, k*2^n+1) (expecting 8693280 factors)
Sieving with single sequence c=1 logic for p >= 257
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 2 sequence into 384 base 2^720 sequences.
Legendre summary: Approximately 2 B needed for Legendre tables
1 total sequences
1 are eligible for Legendre tables
0 are not eligible for Legendre tables
1 have Legendre tables in memory
0 cannot have Legendre tables in memory
0 have Legendre tables loaded from files
1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
295200 bytes used for congruent qs and ladders
Creating CPU worker to use until p >= 1000000
double free or corruption (!prev)
Aborted (core dumped)[/CODE]

Here's a backtrace with gdb from a debug run:

[CODE]double free or corruption (!prev)

__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007ffff6cb97f1 in __GI_abort () at abort.c:79
#2 0x00007ffff6d02837 in __libc_message (action=action@entry=do_abort,
fmt=fmt@entry=0x7ffff6e2fa7b "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3 0x00007ffff6d098ba in malloc_printerr (
str=str@entry=0x7ffff6e317a8 "double free or corruption (!prev)")
at malloc.c:5342
#4 0x00007ffff6d10e5c in _int_free (have_lock=0, p=0x55555675f070,
av=0x7ffff7064c40 <main_arena>) at malloc.c:4311
#5 __GI___libc_free (mem=0x55555675f080) at malloc.c:3134
#6 0x00005555555653d8 in xfree (memoryPtr=0x55555675f100) at core/main.cpp:285
#7 0x00005555555638b8 in Worker::~Worker (this=0x555555829170,
__in_chrg=<optimized out>) at core/Worker.cpp:80
#8 0x000055555559b968 in AbstractWorker::~AbstractWorker (
this=0x555555829170, __in_chrg=<optimized out>)
at sierpinski_riesel/AbstractWorker.h:24
#9 0x00005555555a5ece in CisOneWithOneSequenceGpuWorker::~CisOneWithOneSequenceGpuWorker (this=0x555555829170, __in_chrg=<optimized out>)
at sierpinski_riesel/CisOneWithOneSequenceGpuWorker.h:26
#10 0x00005555555a5eea in CisOneWithOneSequenceGpuWorker::~CisOneWithOneSequenceGpuWorker (this=0x555555829170, __in_chrg=<optimized out>)
at sierpinski_riesel/CisOneWithOneSequenceGpuWorker.h:26
#11 0x000055555555e38a in App::DeleteWorkers (this=0x5555557e6970)
at core/App.cpp:678
#12 0x000055555555e209 in App::PauseSievingAndRebuild (this=0x5555557e6970)
at core/App.cpp:643
#13 0x000055555555dd39 in App::Sieve (this=0x5555557e6970) at core/App.cpp:499
#14 0x000055555555d9f8 in App::Run (this=0x5555557e6970) at core/App.cpp:422
#15 0x0000555555564bef in main (argc=11, argv=0x7fffffffe3f8)
at core/main.cpp:91[/CODE]
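
The backtrace shows the abort happening in xfree, called from Worker::~Worker while the GPU worker is being destroyed. The actual cause is not known from the trace alone (glibc reports the same message for heap overruns as for genuine double frees), so the following is only a generic sketch of one common pattern: a derived class and its base class both releasing the same buffer. All names here (BaseWorkerSketch, GpuWorkerSketch, safe_free) are hypothetical and are not taken from mtsieve. The sketch shows a free helper that nulls the pointer, so a second release becomes a no-op.

```cpp
#include <cassert>   // assert, for checking the demo result
#include <cstddef>   // size_t
#include <cstdint>   // uint64_t
#include <cstdlib>   // std::malloc, std::free

// Hypothetical helper: free the block and null the caller's pointer,
// so that a later release attempt can see it is already gone.
template <typename T>
void safe_free(T *&ptr) {
   std::free(ptr);
   ptr = nullptr;
}

static int g_release_count = 0;   // counts real frees, for the demo only

class BaseWorkerSketch {
public:
   explicit BaseWorkerSketch(size_t n)
      : buf_(static_cast<uint64_t *>(std::malloc(n * sizeof(uint64_t)))) {}
   virtual ~BaseWorkerSketch() { release(); }

protected:
   // Idempotent: only the first call actually frees the buffer.
   void release() {
      if (buf_ != nullptr) {
         ++g_release_count;
         safe_free(buf_);
      }
   }

   uint64_t *buf_;
};

class GpuWorkerSketch : public BaseWorkerSketch {
public:
   using BaseWorkerSketch::BaseWorkerSketch;
   // The derived destructor releases the buffer too. Without the null
   // check in release(), the base destructor would free it a second time.
   ~GpuWorkerSketch() override { release(); }
};

// Creates and destroys one worker; returns how many real frees happened.
inline int demo_release_count() {
   g_release_count = 0;
   {
      GpuWorkerSketch w(16);
   }
   return g_release_count;
}
```

Calling demo_release_count() destroys one worker and reports a single real free, even though both destructors call release().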

[QUOTE]pepi37, I understand your desire for a tutorial. I do have a [URL="https://www.mersenneforum.org/rogue/mtsieve.html"]webpage[/URL], but it is horribly out of date. It is on my to-do list to update that page.

For most of the sieves I would think that the basic parameters are fairly obvious. In most cases you don't even need to specify values as they will default. My recommendation is to use the defaults until you are comfortable enough to use the other parameters to see if you can get better performance.[/QUOTE]

I would agree with pepi's general sentiment for the *cl sieves. The tuning of [C]-G[/C], [C]-g[/C], [C]-W[/C] and/or [C]-K[/C] is not obvious, at least to me.

I don't think a lengthy tutorial is needed, but maybe just a several-paragraph doc with "how to tune flags for the common cases"? e.g.: one GPU + one sequence being sieved. It also wasn't clear that [C]-W[/C] turns off GPU sieving altogether (I'm not sure why the *cl programs would even support this flag).

 rogue 2022-08-25 14:58

The double-free is what I am seeing as well. Whatever the cause is, it is not obvious at this time.

By default you get only one type of worker or the other. You have to use -W and -G together to get both kinds of workers. Without -W, -G defaults to 1. I know that is a little confusing. My typical use of the GPU-enabled programs is to not set either -W or -G. I will use -W for the CPU-only programs, but I could see the use of both if one has a lot of CPU cores and the GPU isn't that much faster than a single CPU core.
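
Since the interaction between -W and -G has caused confusion in this thread, here is a small sketch of the rule as described in the post above. It is a model of that description only, not code from mtsieve: the function resolve_workers, the struct WorkerCounts and the use of -1 for "flag not given" are all assumptions made for illustration. It also ignores the temporary CPU worker that the logs show being created for small p.

```cpp
#include <cassert>   // assert, for checking the examples

struct WorkerCounts {
   int cpu;   // number of CPU workers
   int gpu;   // number of GPU workers
};

// Models the described behaviour of a GPU-enabled sieve.
// A value of -1 means the flag was not given on the command line.
inline WorkerCounts resolve_workers(int w_flag, int g_flag) {
   const bool has_w = (w_flag >= 0);
   const bool has_g = (g_flag >= 0);

   if (!has_w && !has_g) return {0, 1};        // neither flag: one GPU worker
   if (has_w && !has_g)  return {w_flag, 0};   // -W alone: CPU only, GPU unused
   if (!has_w && has_g)  return {0, g_flag};   // -G alone: GPU workers only
   return {w_flag, g_flag};                    // both flags: mixed workers
}
```

Under this model, -W 1 on its own gives one CPU worker and no GPU worker, which matches the 0% GPU utilization reported earlier, while -G1 -W16 gives one GPU worker plus sixteen CPU workers.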

 rogue 2022-08-25 16:43

I tracked down the problem. I have committed a change to CisOneWithOneSequenceGpuWorker.

 ryanp 2022-08-25 22:09

[QUOTE=rogue;612071]I tracked down the problem. I have committed a change to CisOneWithOneSequenceGpuWorker.[/QUOTE]

It finally runs. But I can't seem to tune [C]-g[/C] to be any more efficient than 16 regular CPU workers, even when running [C]srsieve2cl[/C] on an NVIDIA A100... which seems quite odd.

 rogue 2022-08-26 00:29

[QUOTE=ryanp;612078]It finally runs. But I can't seem to tune [C]-g[/C] to be any more efficient than 16 regular CPU workers, even when running [C]srsieve2cl[/C] on an NVIDIA A100... which seems quite odd.[/QUOTE]

Not necessarily. You can use -G2 to try two GPU workers or you can mix CPU and GPU workers, e.g. -G1 -W16.

What is the comparative speed of one GPU worker to one CPU worker?

 rogue 2022-08-29 14:25