mersenneforum.org > Great Internet Mersenne Prime Search > Software

Old 2022-12-04, 13:24   #804
Jean Penné
 
May 2004
FRANCE

1001111011₂ Posts
Does the static binary work for you?

Quote:
Originally Posted by pepi37
Nope, just for Linux, and as far as I know it is not fast enough (in my case)

Nevertheless, I need to know if the static binary of llrCUDA works for you, even if it is not fast enough...

Thank you in advance,

Jean

Old 2022-12-04, 15:14   #805
pepi37

Dec 2011
After 1.58M nines:)

2⁴·3·37 Posts

Quote:
Originally Posted by Jean Penné
Nevertheless, I need to know if the static binary of llrCUDA works for you, even if it is not fast enough...

Thank you in advance,

Jean

Yes, it works :) But one instance "eats one CPU core".
Using a trick with libsleep, I can reduce that to 50% of one CPU core. The speed is the same as a Ryzen 7 3700X per core: both need around 17 minutes to test a 535,000-digit candidate.

Quote:
root@OMICRON:~/LLR# ./sllrCUDA -d -q"4569*2^1778899+1"
Starting Proth prime test of 4569*2^1778899+1
Using complex irrational base DWT, FFT length = 262144, a = 5
^Ceration: 160000 / 1778910 [8.49%], ms/iter: 0.596, ETA: 00:16:04
Caught signal. Terminating.
Stopping Proth prime test of 4569*2^1778899+1 at iteration 164342 [9.23%]


root@OMICRON:~/LLR# LD_PRELOAD="/usr/local/lib/libsleep.so" ./sllrCUDA -d -q"4569*2^1778899+1"
libsleep: Sleep time: 50usec
Resuming Proth prime test of 4569*2^1778899+1 at bit 164343 [9.23%]
Using complex irrational base DWT, FFT length = 262144, a = 5
^Ceration: 310000 / 1778910 [16.93%], ms/iter: 0.593, ETA: 00:14:30
Caught signal. Terminating.
Stopping Proth prime test of 4569*2^1778899+1 at iteration 317616 [17.85%]
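
As a sanity check on the figures in this post, the following Python sketch recomputes the digit count of 4569*2^1778899+1 and the ETA from the first log excerpt. The function names are illustrative and not part of llrCUDA; the only inputs are the numbers shown in the log (1778910 iterations, 0.596 ms/iter, 160000 iterations done), and the ETA assumes the per-iteration time stays constant.

```python
import math

def decimal_digits(k: int, n: int) -> int:
    """Decimal digits of k*2^n + 1.

    For n >= 1 the +1 cannot add a digit, because k*2^n is even and
    therefore never of the form 99...9.
    """
    return math.floor(n * math.log10(2) + math.log10(k)) + 1

def remaining_seconds(total_iters: int, done_iters: int, ms_per_iter: float) -> float:
    """Time left, assuming the per-iteration cost stays constant."""
    return (total_iters - done_iters) * ms_per_iter / 1000.0

def hms(seconds: float) -> str:
    """Format seconds as HH:MM:SS, truncating fractions of a second."""
    s = int(seconds)
    return f"{s // 3600:02d}:{(s % 3600) // 60:02d}:{s % 60:02d}"

if __name__ == "__main__":
    # Digit count of the candidate (roughly 535,000 digits).
    print(decimal_digits(4569, 1778899))
    # ETA at iteration 160000, matching the 00:16:04 shown in the log.
    print(hms(remaining_seconds(1778910, 160000, 0.596)))
    # Whole test from iteration 0: 00:17:40, i.e. "around 17 minutes".
    print(hms(remaining_seconds(1778910, 0, 0.596)))
```

Logarithms are used for the digit count because converting a number of this size to a decimal string is slow and is blocked by default in recent Python versions.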

Old 2022-12-04, 20:51   #806
Citrix

Jun 2003

65F₁₆ Posts

I am getting the following error. What settings do I need to change?

Code:
srsieve2cl.exe -i sr_2.abcd -W4  -p 10000000000000 -P 11000000000000  -Ofactors.txt -osr_2_new.abcd  -G12 -M100000 -l1000

srsieve2cl v1.6.5, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with multi-sequence c=1 logic for p >= 10000000000000
BASE_MULTIPLE = 2, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Assertion failed: m <= HASH_MAX_ELTS, file sierpinski_riesel/AbstractSequenceHelper.cpp, line 272

Old 2022-12-04, 21:59   #807
rogue

"Mark"
Apr 2003
Between here and the

7,481 Posts

Quote:
Originally Posted by Citrix
I am getting the following error. What settings do I need to change?

Code:
srsieve2cl.exe -i sr_2.abcd -W4  -p 10000000000000 -P 11000000000000  -Ofactors.txt -osr_2_new.abcd  -G12 -M100000 -l1000

srsieve2cl v1.6.5, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with multi-sequence c=1 logic for p >= 10000000000000
BASE_MULTIPLE = 2, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Assertion failed: m <= HASH_MAX_ELTS, file sierpinski_riesel/AbstractSequenceHelper.cpp, line 272

I ran into this within the past week, so I have a solution for it. I posted an experimental build on SourceForge that should address it.

Old 2022-12-04, 22:05   #808
Citrix

Jun 2003

65F₁₆ Posts

With the new build I get:

Code:
srsieve2cl.exe -i sr_2.abcd -W2  -p 10000000000000 -P 11000000000000  -Ofactors.txt -osr_2_new.abcd   -M1000 -l10000  -w1000 -G12
srsieve2cl v1.6.5, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with multi-sequence c=1 logic for p >= 10000000000000
BASE_MULTIPLE = 2, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 204 base 2 sequences into 9182 base 2^720 sequences.
Legendre summary:  Approximately 4752 B needed for Legendre tables
       204 total sequences
       204 are eligible for Legendre tables
         0 are not eligible for Legendre tables
       204 have Legendre tables in memory
         0 cannot have Legendre tables in memory
         0 have Legendre tables loaded from files
       204 required building of the Legendre tables
17625600 bytes used for congruent subseq indices
1360000 bytes used for congruent subseqs
Fatal Error:  Must use generic worker if using GPU with multiple sequences by specifying -l0

With the generic code:
Code:
srsieve2cl.exe -i sr_2.abcd -W2  -p 10000000000000 -P 11000000000000  -Ofactors.txt -osr_2_new.abcd   -M1000   -w1000 -G6
srsieve2cl v1.6.5, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Must use generic sieving logic because -l was not specified for mutiple sequences
Sieving with generic logic for p >= 10000000000000
Split 204 base 2 sequences into 20555 base 2^2880 sequences.
bestQ = 2880 yields bs = 6077, gs = 1, sieveLow = 868, sieveRange = 6077
bestQ = 2880 yields bs = 6077, gs = 1, sieveLow = 868, sieveRange = 6077
GPU primes per worker is 57344
Sieve started: 1e13 < p < 11e12 with 134418 terms (2500875 < n < 20000000, k*2^n-1) (expecting 427 factors)
Increasing worksize to 16000 since each chunk is tested in less than a second

OpenCL Error: Out of host memory
       in call to clEnqueueNDRangeOpenCLKernel
       kernelName: generic_kernel  globalworksize 57344  localworksize 256

Last fiddled with by Citrix on 2022-12-04 at 22:08

Old 2022-12-05, 03:39   #809
rogue

"Mark"
Apr 2003
Between here and the

7481₁₀ Posts

Quote:
Originally Posted by Citrix
With the generic code:
Code:
srsieve2cl.exe -i sr_2.abcd -W2  -p 10000000000000 -P 11000000000000  -Ofactors.txt -osr_2_new.abcd   -M1000   -w1000 -G6
srsieve2cl v1.6.5, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Must use generic sieving logic because -l was not specified for mutiple sequences
Sieving with generic logic for p >= 10000000000000
Split 204 base 2 sequences into 20555 base 2^2880 sequences.
bestQ = 2880 yields bs = 6077, gs = 1, sieveLow = 868, sieveRange = 6077
bestQ = 2880 yields bs = 6077, gs = 1, sieveLow = 868, sieveRange = 6077
GPU primes per worker is 57344
Sieve started: 1e13 < p < 11e12 with 134418 terms (2500875 < n < 20000000, k*2^n-1) (expecting 427 factors)
Increasing worksize to 16000 since each chunk is tested in less than a second

OpenCL Error: Out of host memory
       in call to clEnqueueNDRangeOpenCLKernel
       kernelName: generic_kernel  globalworksize 57344  localworksize 256

You should use -g to increase GPU primes per worker as opposed to the number of GPU threads. The framework, at this time, does not support one executable running concurrently on multiple GPUs.

Using -G impacts GPU memory usage, but with that many subsequences I suggest that you use -b (a value less than 1.0) to reduce the size of the hash table that the GPU will use. You might also want to use -K to split the sequences across multiple chunks. This will require some trial and error on your part. There is no way (that I am aware of) to compute the memory required for a kernel so the code cannot "auto-tune" these parameters.

You cannot use -l > 0 with the GPU when you have multiple sequences. srsieve2cl does not support it at this time.

I also do not recommend mixing -W and -G. The factor rate calculation does not work correctly when using both CPU and GPU workers.

You can use -p10e12 -P11e12 if that is easier to read.
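
The trial and error described above could be organized as a small sweep. The Python sketch below only builds candidate command lines; it does not run them. The file names come from Citrix's command, while "-G1", the particular -b and -K values, and the assumption that -K takes an integer chunk count are illustrative guesses rather than documented srsieve2cl behavior. -W and -l are left out, following the advice in this post.

```python
from itertools import product

# Arguments taken from Citrix's command line, minus -W and -l, using the
# shorter -p10e12 -P11e12 spelling. "-G1" is an assumption (one GPU worker).
BASE_ARGS = [
    "srsieve2cl.exe", "-i", "sr_2.abcd",
    "-p10e12", "-P11e12",
    "-Ofactors.txt", "-osr_2_new.abcd",
    "-G1",
]

def candidate_commands(b_values, k_values):
    """Yield one argument list per (-b, -K) combination to try."""
    for b, k in product(b_values, k_values):
        yield BASE_ARGS + [f"-b{b}", f"-K{k}"]

if __name__ == "__main__":
    # 10e12 and 11e12 are the same bounds as the long form in the original command.
    assert int(float("10e12")) == 10000000000000
    assert int(float("11e12")) == 11000000000000
    # Hypothetical values: -b below 1.0 as suggested, -K as a chunk count.
    for cmd in candidate_commands([0.75, 0.5, 0.25], [1, 2, 4]):
        print(" ".join(cmd))
```

Each printed line can then be tried by hand, stopping at the first combination that no longer fails with the out-of-memory error.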

Old 2022-12-05, 18:44   #810
storm5510
Random Account

Aug 2009
Oceanus Procellarum

2²·757 Posts

@rogue

Q.: Does srsieve2cl generate an exit code when it finishes? Running small sieves from a batch sometimes would fail because I had the -M set too low. It was at 3,500. Now, it is 10,000. It varied based on what the k value was. Some k's caused problems and others did not. All used the same values for -n, -N, and -P.

Old 2022-12-05, 19:24   #811
rogue

"Mark"
Apr 2003
Between here and the

7,481 Posts

Quote:
Originally Posted by storm5510
@rogue

Q.: Does srsieve2cl generate an exit code when it finishes? Running small sieves from a batch sometimes would fail because I had the -M set too low. It was at 3,500. Now, it is 10,000. It varied based on what the k value was. Some k's caused problems and others did not. All used the same values for -n, -N, and -P.

For normal completion it will output the number of terms written to the output file and the time it took to run.

SEGFAULTs will just give you the command prompt without any of that. If that happens let me know.

Last fiddled with by rogue on 2022-12-05 at 19:25

Old 2022-12-05, 23:41   #812
storm5510
Random Account

Aug 2009
Oceanus Procellarum

2²·757 Posts

Quote:
Originally Posted by rogue
For normal completion it will output the number of terms written to the output file and the time it took to run.

SEGFAULTs will just give you the command prompt without any of that. If that happens let me know.

Forgive me, but I didn't specify it correctly. An error code?

For a normal program run and exit, an error code of zero is expected. If there is an error, a non-zero code is returned.

Quote:
Originally Posted by Jean Penné
Nevertheless, I need to know if the static binary of llrCUDA works for you, even if it is not fast enough...

Thank you in advance,

Jean

Off-topic: I am running it as a test. According to nvidia-smi, it is using about 30% of the GPU's capacity. I am testing "1955*2^n+1"; the k is my birth year. The n values are around 102K at present. Despite not being all that fast, it is quite stable in my case: Ubuntu 20.04.4 LTS with a GTX 1080. The iteration time holds steady at 0.14 seconds. The overall time is increasing gradually.

Old 2022-12-06, 00:09   #813
rogue

"Mark"
Apr 2003
Between here and the

1D39₁₆ Posts

Quote:
Originally Posted by storm5510
Forgive me, but I didn't specify it correctly. An error code?

For a normal program run and exit, an error code of zero is expected. If there is an error, a non-zero code is returned.

It will be zero upon successful completion. A FatalError (caught and output to the console) is -1. I'm not certain what assert() will exit with.

I do not understand why you care. The error code is not output to the console.
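
For batch use of the kind storm5510 describes, the calling script can read the exit status even though it is not printed. The Python sketch below is one hedged way to do that; the labels and function names are illustrative. It relies on the fact that an exit(-1) is typically reported as 255 on POSIX systems and as 4294967295 on Windows. Any other non-zero value, including a negative return code (which Python's subprocess uses on POSIX for a process killed by a signal, as with a failed assert() or a segfault), is simply labelled "abnormal".

```python
import subprocess
import sys

def classify_exit(returncode: int) -> str:
    """Map a process return code to a coarse outcome label."""
    if returncode == 0:
        return "ok"
    # exit(-1) typically surfaces as 255 (POSIX) or 0xFFFFFFFF (Windows).
    if returncode in (255, 0xFFFFFFFF):
        return "fatal-error"
    # Everything else: signals, assert failures, other error codes.
    return "abnormal"

def run_and_classify(cmd) -> str:
    """Run cmd (a list of arguments) and classify how it ended."""
    return classify_exit(subprocess.run(cmd).returncode)

if __name__ == "__main__":
    # Stand-in child process; in a real batch this would be the sieve command.
    demo = [sys.executable, "-c", "import sys; sys.exit(-1)"]
    print(run_and_classify(demo))
```

A batch driver could rerun a sieve with a larger -M whenever the label is not "ok", instead of relying on the console output.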

Old 2022-12-06, 03:34   #814
Citrix

Jun 2003

7·233 Posts

@Rogue

I can get the program to work but it is extremely slow without the Legendre tables.

A couple of other questions/thoughts:

1. I get the following error with the CPU code as well (srsieve2). Can you release a fix?
Code:
Assertion failed: m <= HASH_MAX_ELTS, file sierpinski_riesel/AbstractSequenceHelper.cpp, line 272
2. For BASE_MULTIPLE there is a limit of 60. Can this be increased to 256 or higher?

3. Possible bug: the GPU code seems to crash if the n range is large (~15M), and it seems to produce false factors if the n range is large and LIMIT_BASE is huge.

4. For what type of sequences is it best to use the GPU, and for which should one stick to the CPU?

Thanks

Last fiddled with by Citrix on 2022-12-06 at 04:00 Reason: Sp

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
