mersenneforum.org mtsieve

2022-12-02, 14:44   #793
rogue

"Mark"
Apr 2003
Between here and the

2×3,527 Posts

Quote:
 Originally Posted by storm5510 After doing more "digging" on my external drive, I found the batch files I had written for srsieve and sr1sieve. srsieve runs to a point then sr1sieve takes over after srfile does a conversion. This may take some time. Many thanks!
There is no reason to use srsieve anymore. Use srsieve2, even if you don't have a GPU. Without a GPU, sieve to 1e6 with srsieve2, then switch to sr1sieve/sr2sieve. The default output format from srsieve2 (ABCD) can be read by the current versions of sr1sieve/sr2sieve.
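That handoff might look like the following sketch. The srsieve2 flag spellings match those used elsewhere in this thread; the sr1sieve options and the file name are assumptions to be checked against each program's help output:

```shell
# Sieve 101*2^n+1 for n in [1e3, 15e6] up to P = 1e6 with srsieve2,
# writing the surviving terms in its default ABCD format.
srsieve2 -n 1e3 -N 15e6 -P 1e6 -s "101*2^n+1" -o t101.abcd

# Continue the same sieve with sr1sieve, which is faster on CPU for a
# single sequence, picking up from the ABCD file where srsieve2 stopped.
sr1sieve -i t101.abcd -P 1e9 -o t101.abcd
```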

2022-12-02, 17:43   #794
storm5510
Random Account

Aug 2009
Not U. + S.A.

101000110111₂ Posts

Quote:
 Originally Posted by rogue There is no reason to use srsieve anymore. Use srsieve2, even if you don't have a GPU. Without a GPU you sieve to 1e6 with srsieve2, then switch to sr1sieve/sr2sieve. The default output format from srsieve2 (ABCD) can be read by the current versions of sr1sieve/sr2sieve.
I did. srsieve2cl. The GPU utilization never dropped below 80%. Consider the following:

Code:
srsieve2cl -n 1e3 -N 15e6 -P 5e9 -M 3500 -s "101*2^n+1"
This is me experimenting with the switches. What I ended up with was 831,020 remaining terms. This is, by far, too many to be practical for any LLR run. The largest n in the output file never appears to exceed the value of -N. The command quoted above took five minutes to run. I believe I need to greatly increase the value of -P. I will try this again with -P at 100e9 and see what is left over.

Note: I used 101 in the sequence because I knew it was a prime number, just not Mersenne. I was receiving GPU messages until -M was at 3,500.
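As a rough guide to how much a deeper sieve helps: by Mertens' theorem the fraction of candidates with no factor below P falls off like 1/ln P, so raising the depth from P1 to P2 keeps roughly ln(P1)/ln(P2) of the remaining terms. A small sketch (a heuristic estimate only, not anything srsieve2 computes):

```python
import math

def survival_ratio(p1: float, p2: float) -> float:
    # Mertens' theorem: the density of terms with no factor below P
    # falls off like 1/ln(P), so deepening the sieve from p1 to p2
    # keeps roughly ln(p1)/ln(p2) of the remaining terms.
    return math.log(p1) / math.log(p2)

# Deepening from 5e9 to 100e9 removes only ~12% of the survivors:
print(f"{survival_ratio(5e9, 100e9):.2f}")  # ~0.88
```

This is why sieving shows diminishing returns: the practical stopping point is where the sieve takes longer to find a factor than an LLR test of one candidate would take.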

2022-12-02, 19:28   #795
rogue

"Mark"
Apr 2003
Between here and the

2·3,527 Posts

Quote:
 Originally Posted by storm5510 I did. srsieve2cl. The GPU utilization never dropped below 80%. Consider the following: Code: srsieve2cl -n 1e3 -N 15e6 -P 5e9 -M 3500 -s "101*2^n+1" This is me experimenting with the switches. What I ended up with was 831,020 remaining terms. This is, by far, too many to be practical for any LLR run. The largest n in the output file never appears to exceed the value of -N. The command quoted above took five minutes to run. I believe I need to greatly increase the value of -P. I will try this again with -P at 100e9 and see what is left over. Note: I used 101 in the sequence because I knew it was a prime number, just not Mersenne. I was receiving GPU messages until -M was at 3,500.
Hopefully the speed is to your liking. Use -g. The default is 8. I recommend a power of 2, such as -g16 or -g32. That should increase GPU utilization. Once you get to a P of about 1e9, or maybe even 1e10, you will need to add -M. In the future I will modify the code so that -M is adjusted automatically while running.

Yes, no n in the output file will be outside of the range you specified on the command line.

2022-12-02, 22:43   #796
storm5510
Random Account

Aug 2009
Not U. + S.A.

2615₁₀ Posts

Quote:
 Originally Posted by rogue Hopefully the speed is to your liking. Use -g. The default is 8. I recommend a power of 2, such as -g16 or -g32. That should increase GPU utilization. Once you get to a P of about 1e9, or maybe even 1e10, you will need to add -M. In the future I will modify the code so that -M is adjusted automatically while running. Yes, no n in the output file will be outside of the range you specified on the command line.
I adjusted -P to 100e9. The run took 1.3 hours. The number of remaining terms was about half of what I had before. The GPU stayed below 50°C, so that is really good. I will give -g 16 a try. Perhaps the high number of terms is related to the size of the number at the front of the series. I should use something larger.

2022-12-02, 23:30   #797
rogue

"Mark"
Apr 2003
Between here and the

1B8E₁₆ Posts

Quote:
 Originally Posted by storm5510 I adjusted -P to 100e9. The run took 1.3 hours. The number of remaining terms was about half of what I had before. The GPU stayed below 50°C, so that is really good. I will give -g 16 a try. Perhaps the high number of terms is related to the size of the number at the front of the series. I should use something larger.
The range of n determines how many terms you start with so I'm not certain what you mean by "use something larger".

2022-12-03, 05:58   #798
storm5510
Random Account

Aug 2009
Not U. + S.A.

5×523 Posts

Quote:
 Originally Posted by rogue The range of n determines how many terms you start with so I'm not certain what you mean by "use something larger".
Using a larger value for k.

You are to be congratulated for the amazing performance increase. I can run sieves in a few hours which, three years ago, might have taken days when I was sieving Riesel sequences for the Prime Wiki.

Off-topic: I am not aware of anything being done with the LLR group of programs. In my case, there was a major slow-down which started around 800K for n. The k value did not matter much.

2022-12-03, 14:06   #799
rogue

"Mark"
Apr 2003
Between here and the

7054₁₀ Posts

Quote:
 Originally Posted by storm5510 You are to be congratulated for the amazing performance increase. I can run sieves in a few hours which, three years ago, might have taken days when I was sieving Riesel sequences for the Prime Wiki. Off-topic: I am not aware of anything being done with the LLR group of programs. In my case, there was a major slow-down which started around 800K for n. The k value did not matter much.
Thank you!

The speed of llr/pfgw is a result of the FFT size needed to do the PRP/primality test. This is primarily driven by n, since k is only a handful of bits while n is many thousands of bits. There are some GPU programs that can do PRP/primality tests, such as llrCUDA, proth20, and various versions of genefer. proth20 is limited to base 2. genefer is limited to GFNs.
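That scaling can be made concrete with a back-of-the-envelope model (a heuristic sketch, not how llr/pfgw actually budget work): a PRP test of k·2^n+1 needs about n modular squarings, and each FFT-based squaring costs on the order of n·log n, so the total work grows roughly like n²·log n.

```python
import math

def prp_cost(n: int) -> float:
    # Heuristic: ~n squarings per PRP test, each squaring costing
    # ~n*log(n) via FFT multiplication, so total work ~ n^2 * log(n).
    return n * n * math.log(n)

# Doubling the exponent slightly more than quadruples the test time,
# consistent with tests slowing down sharply as n approaches 1M:
print(f"{prp_cost(800_000) / prp_cost(400_000):.1f}x")  # ~4.2x
```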

2022-12-03, 15:53   #800
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

5A0₁₆ Posts

Quote:
 Originally Posted by storm5510 I am not aware of anything being done with the LLR group of programs. In my case, there was a major slow-down which started around 800K for n. The k value did not much matter.
There is a CUDA version of LLR. If you want to test larger numbers, you might be interested in trying this on your 2080.

2022-12-04, 00:30   #801
storm5510
Random Account

Aug 2009
Not U. + S.A.

101000110111₂ Posts

Quote:
 Originally Posted by kruoli There is a CUDA version of LLR. If you want to test larger numbers, you might be interested in trying this on your 2080.
Thanks. I tried it on my Linux box (Ubuntu 20.04.4 LTS). That system has a GTX 1080 in it. It is not a good performer because of limited power. Tests started out taking 13 seconds and got slower as they went on. The same system is a dual-boot, of sorts: Windows 7 exists on one drive and Ubuntu lives on another, and I switch the drive cables. The Windows version began the same tests taking about 3 seconds. It is an HP workstation, probably 8 years old. Long in the tooth. My 2080 is in a Windows 10 system from 2018. I make do with what I have.

Quote:
 Originally Posted by rogue Thank you!
I give recognition when due. It was decidedly due here! I am quite pleased with it.

2022-12-04, 08:31   #802
Honza

Feb 2011

33 Posts

Is there a Windows binary for the latest version of LLRCUDA somewhere?
2022-12-04, 11:24   #803
pepi37

Dec 2011
After milion nines:)

110011001102 Posts

Quote:
 Originally Posted by Honza Is there a Windows binary for the latest version of LLRCUDA somewhere?
Nope, just for Linux, and as far as I know it is not fast enough (in my case).
