mersenneforum.org  

2023-03-21, 09:32   #1068
gd_barnes
"Gary"
May 2007
Overland Park, KS
2²×3²×349 Posts

Quote:
Originally Posted by kar_bon View Post
Look at the "SVN" index in the "sierpinski_riesel" folder: main source is SierpinskiRieselApp.cpp.
Great! Thanks, Karsten.

2023-03-21, 13:06   #1069
rogue
"Mark"
Apr 2003
Between here and the
3³·277 Posts

To download all of the code, you can use svn checkout.

srsieve2/srsieve2cl is by far the most complex code built upon the framework. The "Generic" classes have the srsieve functionality, the "CisOneWithMultipleSequences" classes have the sr2sieve functionality, and the "CisOneWithOneSequence" classes have the sr1sieve functionality. All GPU code is in the .gpu files. At build time these are run through a converter to create the .h files, which are needed by the GPU worker classes. The .gpu files use OpenCL C, which is easily understood if you know C.
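
To picture the .gpu to .h conversion step, here is a minimal Python sketch of the general idea: wrap the OpenCL C source text in a C string constant so it can be compiled into the executable. This is an illustration only and not the converter mtsieve actually uses; the function name gpu_source_to_header, the variable name demo_kernel, and the sample kernel are all hypothetical.

Code:
```python
def gpu_source_to_header(source: str, variable_name: str) -> str:
    """Wrap OpenCL C source text in a C header defining one string constant.

    Hypothetical helper for illustration; not the actual mtsieve converter.
    """
    rows = [f"const char *{variable_name} ="]
    source_lines = source.splitlines()
    if not source_lines:
        # An empty input still needs a valid (empty) C string literal.
        rows.append('""')
    for line in source_lines:
        # Escape backslashes first, then double quotes, so the literal stays valid C.
        escaped = line.replace("\\", "\\\\").replace('"', '\\"')
        # Each source line becomes one adjacent string literal ending in "\n".
        rows.append(f'"{escaped}\\n"')
    rows.append(";")
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    # Made-up kernel text standing in for the contents of a .gpu file.
    kernel = (
        "__kernel void demo(__global ulong *out)\n"
        "{\n"
        "   out[get_global_id(0)] = 0;\n"
        "}\n"
    )
    print(gpu_source_to_header(kernel, "demo_kernel"))
```

A header produced this way can be included by host code and handed to the OpenCL runtime as program source, which is presumably why the GPU worker classes need the generated .h files.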

As I have stated before, sr1sieve and sr2sieve are likely faster than srsieve2, but srsieve2cl is likely faster than sr1sieve and sr2sieve. The only reason to use srsieve2 on Windows is if you want to take advantage of multi-threading or if you cannot use sr1sieve/sr2sieve. I have no intention of changing srsieve2 to compete directly with sr1sieve/sr2sieve. That would require a lot of ASM code, and I have avoided such code to ensure portability to other CPU architectures, such as ARM. Some less-used sieves still have ASM. Unless asked, I will probably not update those for ARM support. Some sieves support AVX (which uses ASM), but they also have a non-AVX code path.

In short, srsieve2 is not meant as a replacement for sr1sieve/sr2sieve. I was focused on srsieve2cl. At some point I will write the GPU equivalent code for sr2sieve. Fortunately, the Generic code in srsieve2cl is fast enough to replace sr2sieve, so it hasn't been too high on my priority list.

I would be happy to answer any questions.

2023-03-21, 15:48   #1070
storm5510
Random Account
Aug 2009
Oceanus Procellarum
3027₁₀ Posts

fbncsieve fatal error:

Code:
D:\sieve>fbncsieve -P5e12 -i1897-3.abcd -o1897-4.abcd
fbncsieve v1.5, a program to find factors of k*b^n+c numbers for fixed b, n, and c and variable k
Sieve started: 1000000000039 < p < 5e12 with 30071 terms (1000010 < k < 1999980, k*18970509^3+1) (expecting 1655 factors)
Increasing worksize to 1600000 since each chunk is tested in less than a second
Increasing worksize to 200000000 since each chunk is tested in less than a second
Fatal Error:  1302598*18970509^3+1 mod 1014558378077 = 758131303968
The original series was "k*18970509^3+1" with -k 1e6 and -K 2e6. I had no problem running -P up to 1e12. After the run above failed, I dropped -P to 2e12 and got the same error.

I checked to make sure I had the latest build. Unless something has changed in the past day, it appears I do.
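
The "Fatal Error" line has the form k*b^n+c mod p = r, which suggests the program re-checks each candidate factor and stops when the remainder is not zero. For anyone who wants to repeat that check independently, here is a minimal Python sketch using modular exponentiation. The function names residue and is_factor are made up for this example, and the code is not taken from fbncsieve.

Code:
```python
def residue(k: int, b: int, n: int, c: int, p: int) -> int:
    """Return (k*b^n + c) mod p without building the full-size number."""
    # pow(b, n, p) performs modular exponentiation, so intermediates stay small.
    return ((k % p) * pow(b, n, p) + c) % p


def is_factor(k: int, b: int, n: int, c: int, p: int) -> bool:
    """True when p divides k*b^n + c."""
    return residue(k, b, n, c, p) == 0


if __name__ == "__main__":
    # Values copied from the "Fatal Error" line in the log above.
    k, b, n, c, p = 1302598, 18970509, 3, 1, 1014558378077
    r = residue(k, b, n, c, p)
    print(f"{k}*{b}^{n}+{c} mod {p} = {r}")
    print("p divides the term" if r == 0 else "p does not divide the term")
```

A nonzero remainder means p is not a factor of that term, so the sieve would have removed a candidate it should have kept.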

2023-03-21, 16:30   #1071
rogue
"Mark"
Apr 2003
Between here and the
3³×277 Posts

Please send me your ABCD file. I will take a look.

2023-03-21, 18:42   #1072
storm5510
Random Account
Aug 2009
Oceanus Procellarum
3×1,009 Posts

Quote:
Originally Posted by rogue View Post
Please send me your ABCD file. I will take a look.
Attached.
Attached Files
File Type: zip 1e12.zip (29.5 KB, 19 views)

2023-03-21, 20:16   #1073
rogue
"Mark"
Apr 2003
Between here and the
3³·277 Posts

I can fix this, but I lost some speed in the process. I don't know if I can restore the speed without reintroducing this issue. I need to look at it further.

2023-03-22, 02:01   #1074
gd_barnes
"Gary"
May 2007
Overland Park, KS
11000100010100₂ Posts

Thanks for the info, Mark. It's interesting to see all of the complex code. As a former mainframe programmer, I know barely enough to be dangerous in C/C++, having had single classes of Pascal and C+ in college. Mainly I'd like to dabble a little bit with creating my own builds. I don't know enough to change general logic, but it would be interesting to tweak the cosmetics of the output in srsieve2.

What all would be involved in creating my own executable?

It's interesting that you bring up srsieve2 not being able to compete with sr2sieve/sr1sieve as far as overall throughput on multi-core machines. I've generally found that to be true but I have found a major exception: CRUS Sierp base 66. I'm getting much more overall throughput with srsieve2 vs. multiple instances of sr2sieve with the -x switch on 3 different machines: Intel 8-core/8-thread, Intel 8-core/16-thread, and AMD 16-core/32-thread. Perhaps srsieve2 is faster when you have to use the -x switch in sr2sieve due to the many large k-values. But based on your explanations in various places, I don't know why.

Eventually I want to fiddle with running srsieve2cl. I don't know anything about GPUs, but I believe my Ryzen 3950X has one that would do quite well with this.

2023-03-22, 12:37   #1075
rogue
"Mark"
Apr 2003
Between here and the
3³·277 Posts

To build on Windows, I use clang 14.0.0 (from the LLVM project on GitHub). To build the GPU executables, you will also need Perl. With those installed, you just need to run "make" or "make <program>" from the command line in the directory containing the makefile.

I have seen similar results with sr2sieve -x vs. srsieve2. In other words, some conjectures sieve faster with sr2sieve -x, but others sieve faster with srsieve2. I have not investigated why. As you stated, it likely has something to do with large k, but the cause isn't obvious from looking at either sr2sieve or srsieve2, since they have very different implementations.

FYI all command line output is generated with calls to WriteToConsole(). Many of these are in App.cpp. You will find most (but not all) of the rest in the xxApp.cpp class specific to the sieve. https://www.mersenneforum.org/rogue/mtsieve.html has more detail on the framework including descriptions of the framework classes and methods. I would be happy to answer any questions.

2023-03-22, 14:26   #1076
rogue
"Mark"
Apr 2003
Between here and the
3³×277 Posts

Quote:
Originally Posted by rogue View Post
I can fix this, but I lost some speed in the process. I don't know if I can restore the speed without reintroducing this issue. I need to look at it further.
I found the issue. For larger bases it requires different logic. twinsieve is also impacted by this, but I think I can use the faster logic for ccsieve for some forms.

2023-03-24, 16:17   #1077
rogue
"Mark"
Apr 2003
Between here and the
1110100110111₂ Posts

I have posted mtsieve 2.4.5 at sourceforge. Here is a list of changes:

Code:
framework:
   Replace vsprintf with vsnprintf.
   
srsieve2/srsieve2cl: version 1.6.9
   Fix an issue that occurs when logging factors and using multiple threads.

gcwsieve/gcwsievecl: version 1.5.1
   Log terms of GFN or Mersenne forms as they are removed.

fbncsieve: version 1.6
   Implement different logic (which is 5x slower) for larger bases to avoid invalid factors.
   Only verify first factor for the first k for each prime.
   Reduce memory usage for odd bases since we only track even k.
   Reduce memory usage for base 2 since we only track odd k.
   Output primes to a separate file.

twincsieve: version 1.6
   Implement different logic (which is 5x slower) for larger bases to avoid invalid
   factors.  This only applies to b^n forms.
   Add support to sieve for factorial/primorial twins.
   Reduce memory usage for odd bases since we only track even k.
   Reduce memory usage for base 2 since we only track odd k.
   Only verify first factor for the first k for each prime.
   
ccsieve: version 1.2
   Implement different logic (which is up to 2x faster) for b^n forms.
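
The "only track even k" entries rest on a parity argument: when b is odd and c is +1 or -1, any odd k makes k*b^n odd, so k*b^n+c is even and is not a prime candidate. Only even k need a slot in the sieve's candidate map, which roughly halves its size. Below is a minimal Python sketch of that idea, assuming c = ±1 as in the k*18970509^3+1 series discussed earlier. The class name EvenKBitmap and its methods are hypothetical and do not reflect the actual mtsieve data structures.

Code:
```python
class EvenKBitmap:
    """Bitmap of remaining candidates that stores one bit per even k.

    Hypothetical illustration for odd b and c = +1 or -1; not mtsieve code.
    """

    def __init__(self, k_min: int, k_max: int) -> None:
        # Shrink the range inward so both ends are even.
        self.first = k_min + (k_min % 2)
        self.last = k_max - (k_max % 2)
        self.count = max(0, (self.last - self.first) // 2 + 1)
        self.bits = bytearray((self.count + 7) // 8)
        # Every even k starts out as a candidate.
        for i in range(self.count):
            self.bits[i >> 3] |= 1 << (i & 7)

    def _index(self, k: int) -> int:
        if k % 2 != 0 or not (self.first <= k <= self.last):
            raise ValueError(f"k={k} is odd or outside the tracked range")
        # Consecutive even k values map to consecutive bit positions.
        return (k - self.first) // 2

    def contains(self, k: int) -> bool:
        i = self._index(k)
        return bool(self.bits[i >> 3] & (1 << (i & 7)))

    def remove(self, k: int) -> None:
        # Clear the bit for k, e.g. after a factor of its term has been found.
        i = self._index(k)
        self.bits[i >> 3] &= ~(1 << (i & 7)) & 0xFF

    def remaining(self) -> int:
        return sum(bin(byte).count("1") for byte in self.bits)


if __name__ == "__main__":
    bitmap = EvenKBitmap(1_000_000, 2_000_000)
    all_k_bytes = (2_000_000 - 1_000_000 + 1 + 7) // 8
    print(f"even-k bitmap: {len(bitmap.bits)} bytes, all-k bitmap: {all_k_bytes} bytes")
    bitmap.remove(1302598)
    print(bitmap.remaining(), "candidates remain")
```

The base 2 entry would use the same layout with odd k in place of even k.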

2023-03-25, 17:29   #1078
storm5510
Random Account
Aug 2009
Oceanus Procellarum
BD3₁₆ Posts

Code:
...287 factors found at 234 sec per factor (last 163 min)...
From srsieve2: What is this time keeping method? It is certainly not real-time like a clock, other than the elapsed time at the end.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.