mersenneforum.org  

2017-07-29, 01:48   #1
rogue

"Mark"
Apr 2003
Between here and the

2·3²·347 Posts
rogue's sieves

I will use this thread to post announcements for any new or updated sieving software I write.

You can find most of my software here. If you can't find it here, then it is likely on SourceForge.
2017-07-29, 01:57   #2
rogue

I have released an OpenCL version of pixsieve called pixsievecl. On the laptop I tested it on, the OpenCL version is about 5.5x faster than the x86-64 version.

"pixsieve" is short for "primes in x", with x being any arbitrary decimal value. For example, if you want to sieve terms of the decimal expansion of pi from 900,000 to 1,000,000 digits in length, then this is the program you want to use. It will remove terms with small factors and output a file in DECIMAL format, which can be used as input to pfgw in the hopes of finding a large PRP.
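To illustrate the idea of removing terms with small factors, here is a minimal Python sketch. It is not how pixsieve or pixsievecl is implemented; the function names `small_primes` and `sieve_prefixes` are made up for this example, and it assumes the terms are the leading digits of the input string and that every term is larger than any sieving prime.

```python
def small_primes(limit):
    """Return all primes below `limit` (basic sieve of Eratosthenes)."""
    if limit < 2:
        return []
    flags = bytearray([1]) * limit
    flags[0] = flags[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            # Clear every multiple of i starting at i*i.
            flags[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i, is_prime in enumerate(flags) if is_prime]


def sieve_prefixes(digits, min_len, max_len, prime_limit):
    """Return the prefix lengths in [min_len, max_len] whose term has no
    prime factor below prime_limit.

    Assumption for this sketch: each term is the first `length` digits of
    `digits`, and every term is larger than any sieving prime.
    """
    survivors = set(range(min_len, max_len + 1))
    for p in small_primes(prime_limit):
        value = 0
        # Extend the term one digit at a time, keeping only its residue mod p.
        for length, ch in enumerate(digits[:max_len], start=1):
            value = (value * 10 + int(ch)) % p
            if value == 0 and length >= min_len:
                survivors.discard(length)
    return sorted(survivors)


# Prefixes of 3141592653 with 3 to 10 digits, sieved with primes below 30.
print(sieve_prefixes("3141592653", 3, 10, 30))
```

Only the residue modulo each prime is carried along, so the full million-digit term never has to be built during sieving. The surviving lengths are the candidates one would hand to a PRP test.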

Both versions, along with source and 64-bit Windows builds, can be found on my website. Although I did not create a makefile, these programs should compile and link on OS X and Linux, hopefully out of the box, but if not, with small changes.
2017-07-30, 14:47   #3
J F

Sep 2013

2³×7 Posts

Yay, thanks a lot!
Runs out of the box and produces the same results as pixsieve.
6-year-old HD7950 (Tahiti) @stock 850MHz is ~5x faster than one
6600K core @3.9GHz. I will play around to see if there is a bit
more possible with different block sizes etc.
Minor confusion: pixsieveCL states 'OpenCL 2.0 AMD-APP (2442.0)'.
The software framework on my machine might be 2.0-capable, but
the card hardware is only 1.2.


Some questions:
1. Old pixsieve had options
-s --stringfile=s File containing a decimal representation of any number
-S --searchstring=S Starting point of substring to start factoring
Now big-S is stringfile and I don't see a searchstring option. Is it gone?
(The workaround is easy enough: just delete the part up to searchstring
in my Pi1Mio-file.)

2. What does '-t --nthreads=N Start N threads' do in the CL version?
CPU threads preparing stuff / feeding the GPU?
It doesn't seem to make any speed difference.
2017-07-30, 21:54   #4
rogue

1) -S (searchstring) doesn't exist in pixsievecl, but I can add it. It would not use the -S option letter, though. The workaround is easy enough for an end user, so I didn't include it. If you have to, use pixsieve to create an output file, then continue with pixsievecl.

2) -t will change the number of concurrent GPU threads. If you can't run with enough blocks to keep the GPU busy, then increase the number of threads. When the program ends it tells you how much time was spent in the GPU and how much time it was waiting for the GPU before giving it more work. If the percent of time waiting for the GPU is low, add more threads.
2017-07-31, 19:36   #5
rogue

I have released both x86-64 and OpenCL versions of an alternating factorial sieve, afsieve and afsievecl. On the laptop I tested it on, the OpenCL version is about 10x faster than the x86-64 version.

Alternating Factorials are defined as the sum of consecutive factorials with alternating signs.
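As a small illustration of that definition, the Python sketch below computes af(n) = n! - (n-1)! + (n-2)! - ... ± 1! through the recurrence af(n) = n! - af(n-1). It is not taken from afsieve; the function names are made up for this example, and `terms_divisible_by` only shows the general idea of tracking residues modulo one prime.

```python
def alternating_factorial(n):
    """Return af(n) = n! - (n-1)! + (n-2)! - ... +/- 1!."""
    af = 0
    fact = 1
    for i in range(1, n + 1):
        fact *= i          # fact is now i!
        af = fact - af     # af(i) = i! - af(i-1)
    return af


def terms_divisible_by(p, max_n):
    """Return every n <= max_n for which p divides af(n).

    Only residues mod p are tracked, so no large integers are needed.
    """
    hits = []
    af_mod = 0
    fact_mod = 1
    for n in range(1, max_n + 1):
        fact_mod = (fact_mod * n) % p
        af_mod = (fact_mod - af_mod) % p
        if af_mod == 0:
            hits.append(n)
    return hits


print([alternating_factorial(n) for n in range(1, 9)])
# [1, 1, 5, 19, 101, 619, 4421, 35899]
print(terms_divisible_by(5, 8))
# [3]
```

A sieve works the second way: for each prime it walks n upward and records the terms that prime divides, so the huge values of af(n) are never computed.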

The file that is output is used as input to a pfgw script, which is included and is called alternate.txt. Before using that script, you need to delete the ABC line from the file. Note that this line is not in a valid ABC format for pfgw; it is only used by the sieving code to stop and restart sieving.
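Deleting the ABC line can be done in any text editor. If you would rather script it, here is a minimal Python sketch. The exact layout of the sieve output is not shown in this thread, so it assumes the line to delete is the one starting with "ABC"; the helper name and the file names in the comment are hypothetical.

```python
def strip_abc_header(lines):
    """Return the lines with any line starting with 'ABC' removed.

    Assumption: the header written by the sieve begins with the text 'ABC'.
    """
    return [line for line in lines if not line.startswith("ABC")]


# Hypothetical usage with made-up file names:
#
#   with open("sieve_output.txt") as src:
#       kept = strip_abc_header(src.readlines())
#   with open("pfgw_input.txt", "w") as dst:
#       dst.writelines(kept)

sample = ["ABC (header contents not shown in the thread)\n", "101\n", "102\n"]
print(strip_abc_header(sample))
```

Keep the original file around as well, since the sieve needs the ABC line to stop and restart.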

FYI, per this link I am searching to n=100,000 so this is for anyone wanting to search beyond that.

Both versions, along with source and 64-bit Windows builds, can be found on my website. Although I did not create a makefile, these programs should compile and link on OS X and Linux, hopefully out of the box, but if not, with small changes.

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.