#2
"Mark"
Apr 2003
Between here and the
3·2,357 Posts
I have released an OpenCL version of pixsieve called pixsievecl. The OpenCL version is about 5.5x faster than the x86-64 version on the laptop I tested it on.
"pixsieve" is short for "primes in x", with x being any arbitrary decimal value. For example, if you want to sieve terms of the decimal expansion of pi from 900,000 to 1,000,000 digits in length, then this is the program you want to use. It will remove terms with small factors and output a file in DECIMAL format, which can be used as input to pfgw in the hopes of finding a large PRP. Both versions, along with source and 64-bit Windows builds, can be found on my website. Although I did not create a makefile, these programs should compile and link on OS X and Linux, hopefully out of the box, but if not, with small changes.
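For anyone curious what "remove terms with small factors" means in practice, here is a minimal Python sketch. It is not the pixsieve source. It assumes a "term" is the integer formed by the first L digits of the decimal string, and the names `small_primes` and `sieve_prefixes` are made up for illustration. The idea is that the remainder of each prefix modulo a small prime p can be updated one digit at a time, so the huge terms never have to be built.

```python
def small_primes(limit):
    """Return all primes below `limit` (plain trial division, fine for a demo)."""
    primes = []
    for n in range(2, limit):
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
    return primes


def sieve_prefixes(digits, min_len, max_len, prime_limit):
    """Return the prefix lengths in [min_len, max_len] with no factor below prime_limit.

    Assumption (not taken from pixsieve): term L is int(digits[:L]).
    """
    survivors = set(range(min_len, max_len + 1))
    for p in small_primes(prime_limit):
        r = 0
        for length, ch in enumerate(digits[:max_len], start=1):
            # Remainder of the prefix of this length, updated digit by digit.
            r = (r * 10 + int(ch)) % p
            if r == 0 and length >= min_len:
                # p divides the term; drop it unless the term is p itself.
                if length > len(str(p)) or int(digits[:length]) != p:
                    survivors.discard(length)
    return sorted(survivors)


print(sieve_prefixes("3141592653", 1, 7, 100))
```

With the first digits of pi and primes below 100, this prints `[1, 2, 6]`: the terms 3, 31 and 314159 survive, while 314, 3141, 31415 and 3141592 are removed by the factors 2, 3, 5 and 2.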
#3
Sep 2013
2^3·7 Posts
Yay, thanks a lot!
Runs out of the box and produces the same results as pixsieve. A 6-year-old HD7950 (Tahiti) @stock 850MHz is ~5x faster than one 6600K core @3.9GHz. I will play around to see if a bit more is possible with different block sizes etc.
Minor confusion: pixsieveCL states 'OpenCL 2.0 AMD-APP (2442.0)'. The software framework on my machine might be 2.0-capable, but the card hardware is only 1.2.
Some questions:
1. Old pixsieve had the options
   -s --stringfile=s     File containing a decimal representation of any number
   -S --searchstring=S   Starting point of substring to start factoring
   Now big-S is stringfile and I don't see a searchstring option. Is it gone? (The workaround is easy enough: just delete the part up to searchstring in my Pi1Mio file.)
2. What does '-t --nthreads=N Start N threads' do in the CL version? CPU threads preparing stuff / feeding the GPU? It doesn't seem to make any difference in speed.
#4
"Mark"
Apr 2003
Between here and the
3·2,357 Posts
2) -t changes the number of concurrent GPU threads. If you can't run enough blocks to keep the GPU busy, increase the number of threads. When the program ends, it reports how much time was spent in the GPU and how much time it spent waiting for the GPU before giving it more work. If the percentage of time waiting for the GPU is low, add more threads.
#5
"Mark"
Apr 2003
Between here and the
3·2,357 Posts
I have released both x86-64 and OpenCL versions of an alternating factorial sieve, afsieve and afsievecl. The OpenCL version is about 10x faster than the x86-64 version on the laptop I tested it on.
Alternating factorials are defined as the sum of consecutive factorials with alternating signs. The output file is used as input to a pfgw script, which is included and is called alternate.txt. Before using that script, you need to delete the ABC line from the file. Note that this line is not a valid ABC format for pfgw; it is only used by the sieving code to stop and restart sieving. FYI, per this link I am searching to n=100,000, so this is for anyone wanting to search beyond that. Both versions, along with source and 64-bit Windows builds, can be found on my website. Although I did not create a makefile, these programs should compile and link on OS X and Linux, hopefully out of the box, but if not, with small changes.
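To make the definition concrete, here is a small Python sketch. It is not the afsieve source. It takes af(n) = n! - (n-1)! + ... ± 1!, which gives the recurrence af(n) = n! - af(n-1) with af(0) = 0. The function names are made up for illustration, and the sieve assumes min_n is large enough that af(min_n) exceeds every sieving prime, so a term is never mistaken for its own factor.

```python
def small_primes(limit):
    """Return all primes below `limit` (plain trial division, fine for a demo)."""
    primes = []
    for n in range(2, limit):
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
    return primes


def alternating_factorial(n):
    """Exact af(n) = n! - (n-1)! + ... +/- 1!, via af(n) = n! - af(n-1)."""
    fact, af = 1, 0
    for k in range(1, n + 1):
        fact *= k
        af = fact - af
    return af


def sieve_alternating_factorials(min_n, max_n, prime_limit):
    """Return the n in [min_n, max_n] for which af(n) has no factor below prime_limit.

    Assumption: af(min_n) is larger than every sieving prime.
    """
    survivors = set(range(min_n, max_n + 1))
    for p in small_primes(prime_limit):
        fact, af = 1, 0
        for n in range(1, max_n + 1):
            # Same recurrence as above, but everything is reduced mod p.
            fact = fact * n % p
            af = (fact - af) % p
            if af == 0 and n >= min_n:
                survivors.discard(n)
    return sorted(survivors)


print(alternating_factorial(5))
print(sieve_alternating_factorials(6, 10, 100))
```

This prints `101` and `[6, 7, 8, 10]`. The only term removed is n=9, because af(9) = 326981 = 79 × 4139.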