I tried switching to the cygwin g++ compiler, but it won't link. It appears to be a bug in primesieve or in the linker or compiler used by cygwin. So I am now trying the llvm-mingw compiler. It appears that I can debug programs built with it using lldb or gdb.
This will require a bit of testing. srsieve2 seems to crash when resetting the rounding mode for SSE, but only one sieve uses SSE, so I will probably just modify that sieve to use different logic and thus remove the SSE assembler completely from the code. It is possible that something else I changed between releases triggered this crash, but it is odd that only srsieve2 seems to be affected by it. Without the use of gdb on msys2 builds, I cannot diagnose the root cause of the crash.
Found what seems to be a bug in the latest SVN code. I have not had time to recompile or test with [C]gdb[/C].
[CODE]$ ./srsieve2 -P 1e14 -W 64 -s "39*2^n+1" -o ferm39_10M_20M_sv1e14.txt -n 10e6 -N 20e6
srsieve2 v1.6.2, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with generic logic for p >= 3
Sieve started: 3 < p < 1e14 with 10000001 terms (10000000 < n < 20000000, k*2^n+1) (expecting 9659200 factors)
Sieving with single sequence c=1 logic for p >= 257
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 2 sequence into 408 base 2^720 sequences.
Legendre summary:  Approximately 40 B needed for Legendre tables
   1 total sequences
      1 are eligible for Legendre tables
      0 are not eligible for Legendre tables
      1 have Legendre tables in memory
      0 cannot have Legendre tables in memory
      0 have Legendre tables loaded from files
      1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
311200 bytes used for congruent qs and ladders
Unable to lock mutex thread_6_worker.  Exiting.[/CODE]

Interestingly, this only seems to happen for [C]39*2^n+1[/C] and not [C]37*2^n+1[/C] or [C]41*2^n+1[/C].
[QUOTE=ryanp;606332]Found what seems to be a bug in the latest SVN code. I have not had time to recompile or test with [C]gdb[/C].

[CODE]$ ./srsieve2 -P 1e14 -W 64 -s "39*2^n+1" -o ferm39_10M_20M_sv1e14.txt -n 10e6 -N 20e6
srsieve2 v1.6.2, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with generic logic for p >= 3
Sieve started: 3 < p < 1e14 with 10000001 terms (10000000 < n < 20000000, k*2^n+1) (expecting 9659200 factors)
Sieving with single sequence c=1 logic for p >= 257
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 2 sequence into 408 base 2^720 sequences.
Legendre summary:  Approximately 40 B needed for Legendre tables
   1 total sequences
      1 are eligible for Legendre tables
      0 are not eligible for Legendre tables
      1 have Legendre tables in memory
      0 cannot have Legendre tables in memory
      0 have Legendre tables loaded from files
      1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
311200 bytes used for congruent qs and ladders
Unable to lock mutex thread_6_worker.  Exiting.[/CODE]

Interestingly, this only seems to happen for [C]39*2^n+1[/C] and not [C]37*2^n+1[/C] or [C]41*2^n+1[/C].[/QUOTE]

srsieve2 doesn't seem to perform nicely with large numbers of threads. I think it would require a rethinking of the framework to fix that. You are probably better off running multiple instances with a smaller number of threads, each with its own range. You can also bump -w to change the number of primes per worker thread; that might alleviate some of the thread contention. I do not have a CPU with more than 8 cores to run a test like this on.

I do not recall if sr1sieve is faster (compared to srsieve2). I know it is slower than srsieve2cl.
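As an illustration of the multiple-instances approach, the range split might be scripted along these lines (a sketch only: the thread counts, -w value, and output names are made up, and the commands are echoed rather than run):

```shell
# Split the 3..1e14 prime range from the post above across 4 independent
# srsieve2 instances of 4 threads each, instead of one instance with -W 64.
PMIN=3
PMAX=100000000000000   # 1e14
N=4
STEP=$(( (PMAX - PMIN) / N ))
for i in 0 1 2 3; do
  lo=$(( PMIN + i * STEP ))
  hi=$(( lo + STEP ))
  [ "$i" -eq 3 ] && hi=$PMAX    # last instance absorbs the remainder
  echo "./srsieve2 -p $lo -P $hi -W 4 -w 5e6 -s \"39*2^n+1\" -n 10e6 -N 20e6 -o part$i.txt"
done
```

Each instance then contends only among its own 4 worker threads, and the factor files can be merged afterwards.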
[QUOTE=rogue;606337]srsieve2 doesn't seem to perform nicely with large numbers of threads. I think it would require a rethinking of the framework to fix that. You are probably better off running multiple instances with a smaller number of threads, each with its own range. You can also bump -w to change the number of primes per worker thread. That might alleviate some of the thread contention. I do not have a CPU with more than 8 cores to run a test like this on.
I do not recall if sr1sieve is faster (compared to srsieve2). I know it is slower than srsieve2cl.[/QUOTE]

I've been able to repro this on multiple machines. Observations:

* it fails consistently and only with [C]-s "39*2^n+1"[/C]
* it fails even with -W 4, though this produces:

[CODE]518400 bytes used for congruent q and ladder indices
311200 bytes used for congruent qs and ladders
corrupted size vs. prev_size
Aborted[/CODE]

instead of:

[CODE]518400 bytes used for congruent q and ladder indices
311200 bytes used for congruent qs and ladders
srsieve2: ../nptl/pthread_mutex_lock.c:81: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Aborted[/CODE]

Here's the result of debugging with [C]gdb[/C]:

[CODE][New Thread 0x7ffff6ef9640 (LWP 206359)]
[New Thread 0x7ffff5ef7640 (LWP 206360)]
corrupted size vs. prev_size

Thread 1 "srsieve2" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x00007ffff7a4a546 in __GI_abort () at abort.c:79
#2  0x00007ffff7aa1eb8 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7bbfa78 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007ffff7aa991a in malloc_printerr (str=str@entry=0x7ffff7bbd714 "corrupted size vs. prev_size") at malloc.c:5628
#4  0x00007ffff7aaa816 in unlink_chunk (p=p@entry=0x5555555f9e00, av=<optimized out>) at malloc.c:1608
#5  0x00007ffff7aad24a in _int_malloc (av=av@entry=0x7ffff7bf6ba0 <main_arena>, bytes=bytes@entry=112) at malloc.c:4263
#6  0x00007ffff7aae4b1 in __GI___libc_malloc (bytes=112) at malloc.c:3237
#7  0x00007ffff7e0e6bc in operator new(unsigned long) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x000055555556f308 in __gnu_cxx::new_allocator<primesieve::SievingPrime>::allocate (this=0x7fffffffd5f0, __n=14) at /usr/include/c++/11/ext/new_allocator.h:127
#9  0x000055555556efb2 in std::allocator_traits<std::allocator<primesieve::SievingPrime> >::allocate (__a=..., __n=14) at /usr/include/c++/11/bits/alloc_traits.h:464
#10 0x000055555556ea3c in std::_Vector_base<primesieve::SievingPrime, std::allocator<primesieve::SievingPrime> >::_M_allocate (this=0x7fffffffd5f0, __n=14) at /usr/include/c++/11/bits/stl_vector.h:346
#11 0x000055555556e6c4 in std::vector<primesieve::SievingPrime, std::allocator<primesieve::SievingPrime> >::reserve (this=0x7fffffffd5f0, __n=14) at /usr/include/c++/11/bits/vector.tcc:78
#12 0x000055555556ba1c in primesieve::EratSmall::init (this=0x7fffffffd5d0, stop=1021020, l1CacheSize=17017, maxPrime=17) at sieve/EratSmall.cpp:57
#13 0x000055555556f9d4 in primesieve::PreSieve::initBuffer (this=0x55555583efd8, maxPrime=17, primeProduct=510510) at sieve/PreSieve.cpp:86
#14 0x000055555556f8dc in primesieve::PreSieve::init (this=0x55555583efd8, start=11924379, stop=23704475) at sieve/PreSieve.cpp:68
#15 0x0000555555564079 in primesieve::Erat::init (this=0x55555583ec70, start=11924379, stop=23704475, sieveSize=512, preSieve=...) at sieve/Erat.cpp:79
#16 0x000055555557793b in primesieve::PrimeGenerator::initErat (this=0x55555583ec70) at sieve/PrimeGenerator.cpp:159
#17 0x00005555555778ad in primesieve::PrimeGenerator::init (this=0x55555583ec70, primes=std::vector of length 256, capacity 256 = {...}, size=0x7fffffffd8f8) at sieve/PrimeGenerator.cpp:147
#18 0x0000555555577bab in primesieve::PrimeGenerator::sieveSegment (this=0x55555583ec70, primes=std::vector of length 256, capacity 256 = {...}, size=0x7fffffffd8f8) at sieve/PrimeGenerator.cpp:232
#19 0x0000555555577d56 in primesieve::PrimeGenerator::fill (this=0x55555583ec70, primes=std::vector of length 256, capacity 256 = {...}, size=0x7fffffffd8f8) at sieve/PrimeGenerator.cpp:291
#20 0x00005555555850d5 in primesieve::iterator::generate_next_primes (this=0x7fffffffd8f0) at sieve/iterator.cpp:67
#21 0x0000555555560110 in primesieve::iterator::next_prime (this=0x7fffffffd8f0) at sieve/primesieve/iterator.hpp:69
#22 0x0000555555560632 in primesieve::store_n_primes<std::vector<unsigned long, std::allocator<unsigned long> > > (n=216804, start=1257, primes=std::vector of length 783196, capacity 1000000 = {...}) at sieve/primesieve/StorePrimes.hpp:87
#23 0x000055555556028c in primesieve::generate_n_primes<unsigned long> (n=1000000, start=1258, primes=0x5555556fb390) at core/../sieve/primesieve.hpp:62
#24 0x0000555555561fff in Worker::ProcessNextPrimeChunk (this=0x5555556fb380, startFrom=1257, maxPrimeForChunk=1257) at core/Worker.cpp:155
#25 0x000055555555cbe1 in App::Sieve (this=0x5555555d98e0) at core/App.cpp:450
#26 0x000055555555ca3b in App::Run (this=0x5555555d98e0) at core/App.cpp:405
#27 0x0000555555563288 in main (argc=13, argv=0x7fffffffdba8) at core/main.cpp:87[/CODE]
What is the range of n?
[QUOTE=ryanp;606344]I've been able to repro this on multiple machines. Observations:
* it fails consistently and only with [C]-s "39*2^n+1"[/C]
* it fails even with -W 4, though this produces: ...[/QUOTE]
When I have some time I will play around with this using the latest build as it uses a different compiler. I have also upgraded to the latest primesieve code. One of those changes might fix this issue.
fix for problem (temporary)
Make the initial sieve with srsieve, then use srsieve2 version 1.5.3; it is working on a 6-core CPU without problems:

[CODE]srsieve2 -P 1446900638801[COLOR=Red][B] -W 6[/B][/COLOR] -w5e6 -i a.txt -o b.txt -O fact39.txt
srsieve2 v1.5.3, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with generic logic for p >= 100000000
Split 1 base 2 sequence into 204 base 2^360 sequences.
712822 bytes used for congruence tables
522 bytes used for Legendre tables
Sieve started: 1e8 < p < 1446900638801 with 1353444 terms (11924391 < n < 23704473, k*2^n+1) (expecting 463052 factors)
p=731561923, 679.7K p/sec, 142804 factors found at 476.8 f/sec (last 1 min), 0.0% done.  ETC 2022-05-25 14:45[/CODE]
[QUOTE=rogue;606337]srsieve2 doesn't seem to perform nicely with large numbers of threads. I think it would require a rethinking of the framework to fix that. You are probably better off running multiple instances with a smaller number of threads, each with its own range. You can also bump -w to change the number of primes per worker thread. That might alleviate some of the thread contention. I do not have a CPU with more than 8 cores to run a test like this on.

I do not recall if sr1sieve is faster (compared to srsieve2). I know it is slower than srsieve2cl.[/QUOTE]

I noticed this on a 12-core Ryzen (HT is off): when I set 12 cores, I only get about 785% CPU utilization (Linux).
sgsieve returning CC 2nd kind numbers in sieve file
Hi rogue,
I used the sgsieve program (part of mtsieve) to sieve for my SG search. This is a sample command that I used:

[CODE]sgsieve -P 10e13 --workers=5 -w1e7 -k 3 -K 2500000000 -n 143106 --outputterms=143105.txt -f NEWPGEN[/CODE]

However, when I opened my sieve file, the first line (for the example above) is:

[CODE]100000000000067:C:0:2:5[/CODE]

According to the LLR documentation, the C indicates that the sieve file wants LLR to search "CC 2nd kind len 2", which isn't what I asked the sieve to produce. However, the numbers do come out matching the SG criteria (both the original prime and the to-be SG prime are in +1 form). Could you explain why the sieve sieves for +1 candidates instead of the -1 form seen in the vast majority of primes on t5k?

Thank you,
Daniel
[QUOTE=dannyridel;606511]Hi rogue,

I used the sgsieve program (part of mtsieve) to sieve for my SG search. This is a sample command that I used:

[CODE]sgsieve -P 10e13 --workers=5 -w1e7 -k 3 -K 2500000000 -n 143106 --outputterms=143105.txt -f NEWPGEN[/CODE]

However, when I opened my sieve file, the first line (for the example above) is:

[CODE]100000000000067:C:0:2:5[/CODE]

According to the LLR documentation, the C indicates that the sieve file wants LLR to search "CC 2nd kind len 2", which isn't what I asked the sieve to produce. However, the numbers do come out matching the SG criteria (both the original prime and the to-be SG prime are in +1 form). Could you explain why the sieve sieves for +1 candidates instead of the -1 form seen in the vast majority of primes on t5k?

Thank you,
Daniel[/QUOTE]

Sophie Germain: a prime p is said to be a Sophie Germain prime if both p and 2p+1 are prime.
Cunningham chain of the first kind of length 2: both p and 2p+1 are prime.
Cunningham chain of the second kind of length 2: both p and 2p-1 are prime.

You can see that the first two are the same. A "C" in newpgen format corresponds to that. You can run newpgen yourself and discover that.

newpgen can also sieve for Sophie Germain, but it uses p of the form k*2^n-1. sgsieve uses p of the form k*2^n+1, since the main code originated from gfndsieve. I can look into changing sgsieve to support either +1 or -1 terms for p.
So in essence, both types of Cunningham chains, if they result in prime pairs, can count as SG pairs? E.g. SG pairs can be either (p, 2p-1) or (p, 2p+1)?

Also, doesn't a C in newpgen correspond to the CC 2nd kind, and an S to the first kind/SG?