If I did this right, it looks fine on Windows 7 x64 with an NVIDIA GTX 760.
Code:
D:\...\testing>gcwsievecl64 -v -C -W -b200 -B201 -n2 -t2 -N1e5 -P1e6
gcwsievecl v1.0.2, a GPU program to find factors of Cullen and Woodall numbers (n*b^n+c where b and n are fixed)
Quick elimination of terms info (in order of check):
99998 because the term is even
63411 because the term is divisible by a prime < 100
Platform 0 is a NVIDIA Corporation NVIDIA CUDA, version OpenCL 1.1 CUDA 6.5.12
Device 0 is a NVIDIA Corporation GeForce GTX 760
workGroupSize = 38400 = 200 * 32 * 6 (blocks * workGroupSizeMultiple * deviceComputeUnits)
Running with 2 threads
Allocated memory (prior to sieving): 25 MB in CPU, 9 MB in GPU
Sieve started: (cmdline) 0 <= p < 1000000 with 36589 terms
Sieve complete: 3 <= p < 1000000 116898 primes tested
Clock time: 2.02 seconds at 57810 p/sec. Factors found: 26022
Processor time: 1.28 sec. (0.27 init + 1.01 sieve).
Seconds spent in CPU and GPU: 2.76 (cpu), 1.76 (gpu)
Percent of time spent in CPU vs. GPU: 61.10 (cpu), 38.90 (gpu)
CPU/GPU utilization: 0.63 (cores), 0.87 (devices)
Started with 36589 terms and sieved to 1000000. 10567 remaining terms written to gcw_201.pfgw
Attached is a zip containing gcwsieve.log and gcw_201.pfgw.
Let me know if I did this right, or if there are any other tests I could run that would help...
Thanks,
-1998golfer