Thread: gcwsieve
2007-08-13, 05:25   #47
geoff
 
Mar 2003
New Zealand

13×89 Posts
gcwsieve 1.0.16

Version 1.0.16 adds support for software prefetching. It uses the prefetchnta instruction on machines with SSE, or GCC's __builtin_prefetch() function for non-x86/x86-64 builds.
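As a rough illustration of the two mechanisms mentioned above, here is a minimal sketch. It is not taken from the gcwsieve source: the PREFETCH macro, the PREFETCH_AHEAD distance and the sum_terms() function are hypothetical names, and the loop is just a stand-in for a pass over the sieve terms.

```c
#include <stddef.h>
#include <stdint.h>

/* PREFETCH is a hypothetical macro name, not one from gcwsieve. */
#if defined(__SSE__)
#include <xmmintrin.h>
/* SSE builds: request a non-temporal prefetch (prefetchnta). */
#define PREFETCH(p) _mm_prefetch((const char *)(p), _MM_HINT_NTA)
#elif defined(__GNUC__)
/* Other GCC targets: read access (0), lowest temporal locality (0). */
#define PREFETCH(p) __builtin_prefetch((p), 0, 0)
#else
/* Unknown compiler: prefetching becomes a no-op. */
#define PREFETCH(p) ((void)0)
#endif

/* Assumed prefetch distance in terms; a real value needs tuning. */
#define PREFETCH_AHEAD 16

/* Walk an array of 8-byte terms, optionally prefetching ahead. */
uint64_t sum_terms(const uint64_t *terms, size_t n, int use_prefetch)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (use_prefetch && i + PREFETCH_AHEAD < n)
            PREFETCH(&terms[i + PREFETCH_AHEAD]);
        sum += terms[i];
    }
    return sum;
}
```

The result is the same with or without the hint; only the memory access timing can differ.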

Prefetching should give a speedup when the sieve is too large to fit in L2 cache (each sieve term takes 8 bytes). On some machines it causes a slowdown instead, probably because it interferes with the automatic hardware prefetcher.
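To make the cache-size condition concrete, the sketch below works out the sieve footprint from the 8 bytes per term figure. The function names are made up for illustration. For the 216000 term sieve used in the timings in this post, the footprint is 216000 × 8 = 1728000 bytes, well over both 256KB and 512KB of L2.

```c
#include <stddef.h>

/* From the post: each sieve term takes 8 bytes. */
#define BYTES_PER_TERM 8

/* Hypothetical helper: memory footprint of a sieve with `terms` terms. */
size_t sieve_bytes(size_t terms)
{
    return terms * BYTES_PER_TERM;
}

/* Hypothetical helper: nonzero if the whole sieve fits in the cache. */
int fits_in_cache(size_t terms, size_t cache_bytes)
{
    return sieve_bytes(terms) <= cache_bytes;
}
```

This only counts the terms themselves; any other data the sieve keeps in cache would make the fit tighter.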

So before sieving starts, some test runs are made with and without prefetch, and the faster method is selected. Use the --verbose switch to see whether prefetch was selected. To override the automatic selection, use these new switches:

--prefetch: Force use of prefetch.
--no-prefetch: Prevent use of prefetch.
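The selection logic described above could look roughly like the following. This is only a sketch under assumptions: the enum, the run_fn callback type and the use of clock() for timing are illustrative choices, not how gcwsieve actually implements it.

```c
#include <time.h>

/* Hypothetical modes corresponding to no switch, --prefetch, --no-prefetch. */
typedef enum { PREFETCH_AUTO, PREFETCH_FORCE_ON, PREFETCH_FORCE_OFF } prefetch_mode;

/* Hypothetical callback: performs one test run with or without prefetch. */
typedef void (*run_fn)(int use_prefetch, void *ctx);

/* Time `reps` test runs, returning CPU seconds used. */
static double time_runs(run_fn run, int use_prefetch, void *ctx, int reps)
{
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        run(use_prefetch, ctx);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Returns 1 if the prefetch timing is strictly faster, else 0. */
int pick_faster(double secs_without, double secs_with)
{
    return secs_with < secs_without;
}

/* Honour an explicit override first, otherwise benchmark both methods. */
int choose_prefetch(prefetch_mode mode, run_fn run, void *ctx, int reps)
{
    if (mode == PREFETCH_FORCE_ON)
        return 1;
    if (mode == PREFETCH_FORCE_OFF)
        return 0;
    double without = time_runs(run, 0, ctx, reps);
    double with = time_runs(run, 1, ctx, reps);
    return pick_faster(without, with);
}
```

In this sketch a tie falls back to no prefetch, which is an arbitrary choice.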


Here are some times for a 216000 term sieve (PrimeGrid Cullen 10M) at p=1000e9:
Code:
                      --no-prefetch   --prefetch      change
P3 450MHz, 512KB L2:   1167 p/sec      1502 p/sec       +29%
P3 600MHz, 256KB L2:   1462 p/sec      1993 p/sec       +36%
P4 2.9GHz, 512KB L2:  12224 p/sec     11711 p/sec        -4%
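The percentages in the last column are the relative change in p/sec from the --no-prefetch rate to the --prefetch rate. A small sketch of that arithmetic, with a made-up function name:

```c
/* Hypothetical helper: percent change in throughput going from the
   rate without prefetch to the rate with prefetch. */
double percent_change(double rate_without, double rate_with)
{
    return (rate_with - rate_without) / rate_without * 100.0;
}
```

For the first row, (1502 - 1167) / 1167 is about 0.287, which rounds to the +29% shown.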