mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   mtsieve (https://www.mersenneforum.org/showthread.php?t=23042)

rogue 2022-05-17 14:13

I tried switching to the cygwin g++ compiler, but it won't link. It appears to be a bug in primesieve or the linker or compiler used by cygwin. So I am now trying to use the llvm-mingw compiler. It appears that I can debug with lldb or gdb for programs built on it.

This will require a bit of testing. It seems to crash when resetting the rounding mode for SSE, but only one sieve uses SSE, so I will probably just modify that sieve to use different logic and thus remove SSE assembler completely from the code. It is possible that something else I changed between releases triggered this crash, but it is odd that only srsieve2 seems to be affected by it. Without use of gdb on msys2 builds, I cannot diagnose the root cause of the crash.

ryanp 2022-05-23 13:30

Found what seems to be a bug in the latest SVN code. I have not had time to recompile or test with [C]gdb[/C].

[CODE]$ ./srsieve2 -P 1e14 -W 64 -s "39*2^n+1" -o ferm39_10M_20M_sv1e14.txt -n 10e6 -N 20e6
srsieve2 v1.6.2, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with generic logic for p >= 3
Sieve started: 3 < p < 1e14 with 10000001 terms (10000000 < n < 20000000, k*2^n+1) (expecting 9659200 factors)
Sieving with single sequence c=1 logic for p >= 257
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 2 sequence into 408 base 2^720 sequences.
Legendre summary: Approximately 40 B needed for Legendre tables
1 total sequences
1 are eligible for Legendre tables
0 are not eligible for Legendre tables
1 have Legendre tables in memory
0 cannot have Legendre tables in memory
0 have Legendre tables loaded from files
1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
311200 bytes used for congruent qs and ladders
Unable to lock mutex thread_6_worker. Exiting.[/CODE]

Interestingly, this only seems to happen for [C]39*2^n+1[/C] and not [C]37*2^n+1[/C] or [C]41*2^n+1[/C].

rogue 2022-05-23 15:00

[QUOTE=ryanp;606332]Found what seems to be a bug in the latest SVN code. I have not had time to recompile or test with [C]gdb[/C].

[CODE]$ ./srsieve2 -P 1e14 -W 64 -s "39*2^n+1" -o ferm39_10M_20M_sv1e14.txt -n 10e6 -N 20e6
srsieve2 v1.6.2, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with generic logic for p >= 3
Sieve started: 3 < p < 1e14 with 10000001 terms (10000000 < n < 20000000, k*2^n+1) (expecting 9659200 factors)
Sieving with single sequence c=1 logic for p >= 257
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 2 sequence into 408 base 2^720 sequences.
Legendre summary: Approximately 40 B needed for Legendre tables
1 total sequences
1 are eligible for Legendre tables
0 are not eligible for Legendre tables
1 have Legendre tables in memory
0 cannot have Legendre tables in memory
0 have Legendre tables loaded from files
1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
311200 bytes used for congruent qs and ladders
Unable to lock mutex thread_6_worker. Exiting.[/CODE]

Interestingly, this only seems to happen for [C]39*2^n+1[/C] and not [C]37*2^n+1[/C] or [C]41*2^n+1[/C].[/QUOTE]

srsieve2 doesn't seem to perform nicely for large numbers of threads. I think it would require a rethinking of the framework to do that. You are probably off running multiple instances with a smaller number of threads, each with its own range. You can also bump -w to change the number of primes per worker thread. That might alleviate some of the thread contention. I do not have a CPU with more than 8 cores to run a test like this on.

I do not recall if sr1sieve is faster (compared to srsieve2). I know it is slower than srsieve2cl.

ryanp 2022-05-23 17:16

[QUOTE=rogue;606337]srsieve2 doesn't seem to perform nicely for large numbers of threads. I think it would require a rethinking of the framework to do that. You are probably off running multiple instances with a smaller number of threads, each with its own range. You can also bump -w to change the number of primes per worker thread. That might alleviate some of the thread contention. I do not have a CPU with more than 8 cores to run a test like this on.

I do not recall if sr1sieve is faster (compared to srsieve2). I know it is slower than srsieve2cl.[/QUOTE]

I've been able to repro this on multiple machines. Observations:

* it fails consistently and only with [C]-s "39*2^n+1"[/C]
* it fails with even -W 4, though this produces:

[CODE]518400 bytes used for congruent q and ladder indices
311200 bytes used for congruent qs and ladders
corrupted size vs. prev_size
Aborted[/CODE]

instead of:

[CODE]518400 bytes used for congruent q and ladder indices
311200 bytes used for congruent qs and ladders
srsieve2: ../nptl/pthread_mutex_lock.c:81: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Aborted[/CODE]

Here's the result of debugging with [C]gdb[/C]:

[CODE][New Thread 0x7ffff6ef9640 (LWP 206359)]
[New Thread 0x7ffff5ef7640 (LWP 206360)]
corrupted size vs. prev_size

Thread 1 "srsieve2" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1 0x00007ffff7a4a546 in __GI_abort () at abort.c:79
#2 0x00007ffff7aa1eb8 in __libc_message (action=action@entry=do_abort,
fmt=fmt@entry=0x7ffff7bbfa78 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007ffff7aa991a in malloc_printerr (
str=str@entry=0x7ffff7bbd714 "corrupted size vs. prev_size")
at malloc.c:5628
#4 0x00007ffff7aaa816 in unlink_chunk (p=p@entry=0x5555555f9e00,
av=<optimized out>) at malloc.c:1608
#5 0x00007ffff7aad24a in _int_malloc (
av=av@entry=0x7ffff7bf6ba0 <main_arena>, bytes=bytes@entry=112)
at malloc.c:4263
#6 0x00007ffff7aae4b1 in __GI___libc_malloc (bytes=112) at malloc.c:3237
#7 0x00007ffff7e0e6bc in operator new(unsigned long) ()
from /lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x000055555556f308 in __gnu_cxx::new_allocator<primesieve::SievingPrime>::allocate (this=0x7fffffffd5f0, __n=14)
at /usr/include/c++/11/ext/new_allocator.h:127
#9 0x000055555556efb2 in std::allocator_traits<std::allocator<primesieve::SievingPrime> >::allocate (__a=..., __n=14)
at /usr/include/c++/11/bits/alloc_traits.h:464
#10 0x000055555556ea3c in std::_Vector_base<primesieve::SievingPrime, std::allocator<primesieve::SievingPrime> >::_M_allocate (this=0x7fffffffd5f0, __n=14)
at /usr/include/c++/11/bits/stl_vector.h:346
#11 0x000055555556e6c4 in std::vector<primesieve::SievingPrime, std::allocator<primesieve::SievingPrime> >::reserve (this=0x7fffffffd5f0, __n=14)
at /usr/include/c++/11/bits/vector.tcc:78
#12 0x000055555556ba1c in primesieve::EratSmall::init (this=0x7fffffffd5d0,
stop=1021020, l1CacheSize=17017, maxPrime=17) at sieve/EratSmall.cpp:57
#13 0x000055555556f9d4 in primesieve::PreSieve::initBuffer (
this=0x55555583efd8, maxPrime=17, primeProduct=510510)
at sieve/PreSieve.cpp:86
#14 0x000055555556f8dc in primesieve::PreSieve::init (this=0x55555583efd8,
start=11924379, stop=23704475) at sieve/PreSieve.cpp:68
#15 0x0000555555564079 in primesieve::Erat::init (this=0x55555583ec70,
start=11924379, stop=23704475, sieveSize=512, preSieve=...)
at sieve/Erat.cpp:79
#16 0x000055555557793b in primesieve::PrimeGenerator::initErat (
this=0x55555583ec70) at sieve/PrimeGenerator.cpp:159
#17 0x00005555555778ad in primesieve::PrimeGenerator::init (
this=0x55555583ec70,
primes=std::vector of length 256, capacity 256 = {...},
size=0x7fffffffd8f8) at sieve/PrimeGenerator.cpp:147
#18 0x0000555555577bab in primesieve::PrimeGenerator::sieveSegment (
this=0x55555583ec70,
primes=std::vector of length 256, capacity 256 = {...},
size=0x7fffffffd8f8) at sieve/PrimeGenerator.cpp:232
#19 0x0000555555577d56 in primesieve::PrimeGenerator::fill (
this=0x55555583ec70,
primes=std::vector of length 256, capacity 256 = {...},
size=0x7fffffffd8f8) at sieve/PrimeGenerator.cpp:291
#20 0x00005555555850d5 in primesieve::iterator::generate_next_primes (
this=0x7fffffffd8f0) at sieve/iterator.cpp:67
#21 0x0000555555560110 in primesieve::iterator::next_prime (
this=0x7fffffffd8f0) at sieve/primesieve/iterator.hpp:69
#22 0x0000555555560632 in primesieve::store_n_primes<std::vector<unsigned long, std::allocator<unsigned long> > > (n=216804, start=1257,
primes=std::vector of length 783196, capacity 1000000 = {...})
at sieve/primesieve/StorePrimes.hpp:87
#23 0x000055555556028c in primesieve::generate_n_primes<unsigned long> (
n=1000000, start=1258, primes=0x5555556fb390)
at core/../sieve/primesieve.hpp:62
#24 0x0000555555561fff in Worker::ProcessNextPrimeChunk (this=0x5555556fb380,
startFrom=1257, maxPrimeForChunk=1257) at core/Worker.cpp:155
#25 0x000055555555cbe1 in App::Sieve (this=0x5555555d98e0) at core/App.cpp:450
#26 0x000055555555ca3b in App::Run (this=0x5555555d98e0) at core/App.cpp:405
#27 0x0000555555563288 in main (argc=13, argv=0x7fffffffdba8)
at core/main.cpp:87[/CODE]

pepi37 2022-05-23 18:15

What is range of n?


[QUOTE=ryanp;606344]I've been able to repro this on multiple machines. Observations:

* it fails consistently and only with [C]-s "39*2^n+1"[/C]
* it fails with even -W 4, though this produces:

[

rogue 2022-05-23 19:29

When I have some time I will play around with this using the latest build as it uses a different compiler. I have also upgraded to the latest primesieve code. One of those changes might fix this issue.

pepi37 2022-05-23 21:58

fix for problem (temporary)
Make initial sieve with srsieve

use srsieve2 version 1.5.3
working on 6 core CPU without problem



srsieve2 -P 1446900638801[COLOR=Red][B] -W 6[/B][/COLOR] -w5e6 -i a.txt -o b.txt -O fact39.txt
srsieve2 v1.5.3, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with generic logic for p >= 100000000
Split 1 base 2 sequence into 204 base 2^360 sequences.
712822 bytes used for congruence tables
522 bytes used for Legendre tables
Sieve started: 1e8 < p < 1446900638801 with 1353444 terms (11924391 < n < 23704473, k*2^n+1) (expecting 463052 factors)
p=731561923, 679.7K p/sec, 142804 factors found at 476.8 f/sec (last 1 min), 0.0% done. ETC 2022-05-25 14:45

pepi37 2022-05-23 22:00

[QUOTE=rogue;606337]srsieve2 doesn't seem to perform nicely for large numbers of threads. I think it would require a rethinking of the framework to do that. You are probably off running multiple instances with a smaller number of threads, each with its own range. You can also bump -w to change the number of primes per worker thread. That might alleviate some of the thread contention. I do not have a CPU with more than 8 cores to run a test like this on.

I do not recall if sr1sieve is faster (compared to srsieve2). I know it is slower than srsieve2cl.[/QUOTE]




I noticed this on 12 core Ryzen ( HT is off) When I set 12 cores I only have 785% CPU utilization ( Linux)

dannyridel 2022-05-25 23:37

sgsieve returning CC 2nd kind numbers in sieve file
 
Hi rogue,

I used the sgsieve program (part of mtsieve) in order to sieve for my SG search. This is a sample command that I used:
sgsieve -P 10e13 --workers=5 -w1e7 -k 3 -K 2500000000 -n 143106 --outputterms=143105.txt -f NEWPGEN

However, when I opened my sieve file, I see the first line (for the example above):
100000000000067:C:0:2:5

According to the LLR documentation, the C indicates that the sieve file wants LLR to search "CC 2nd kind len 2", which isn't what I asked the sieve to produce. However, the numbers do come out to match the sg criteria (both the original prime and the to-be sg prime are in +1 form). Could you explain why the sieve sieves for +1 candidates instead of the vast majority of -1 primes seen on t5k?

Thank you,
Daniel

rogue 2022-05-26 12:14

[QUOTE=dannyridel;606511]Hi rogue,

I used the sgsieve program (part of mtsieve) in order to sieve for my SG search. This is a sample command that I used:
sgsieve -P 10e13 --workers=5 -w1e7 -k 3 -K 2500000000 -n 143106 --outputterms=143105.txt -f NEWPGEN

However, when I opened my sieve file, I see the first line (for the example above):
100000000000067:C:0:2:5

According to the LLR documentation, the C indicates that the sieve file wants LLR to search "CC 2nd kind len 2", which isn't what I asked the sieve to produce. However, the numbers do come out to match the sg criteria (both the original prime and the to-be sg prime are in +1 form). Could you explain why the sieve sieves for +1 candidates instead of the vast majority of -1 primes seen on t5k?

Thank you,
Daniel[/QUOTE]

Sophie Germain: A prime p is said to be a Sophie Germain prime if both p and 2p+1 are prime.
Cunningham Chain of the first kind of length 2: A prime p is said to be a Sophie Germain prime if both p and 2p+1 are prime.
Cunningham Chain of the second kind of length 2: A prime p is said to be a Sophie Germain prime if both p and 2p-1 are prime.

You can see that the first two are the same. A "C" in newpgen formst corresponds to that. You can run newgpen yourself and discover that.

newpgen can also sieve for Sophie Germain, but it uses p of the form k*2^n-1. sgsieve uses with p of the form k*2^n+1, since the main code originated from gfndsieve.

I can look into changing sgsieve to support either +1 or -1 terms for p.

dannyridel 2022-05-27 04:27

So in essence, both types of Cunningham Chains, if they result in prime pairs, can count as SG pairs? Eg. SG pairs can be either (p,2p-1) or (p,2p+1)?

Also, doesn't a C in newpgen correspond to CC 2nd kind? And S corresponds to first kind/SG?


All times are UTC. The time now is 03:56.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2023, Jelsoft Enterprises Ltd.