#100 | |
Mar 2003
New Zealand
10010000101₂ Posts |
![]() Quote:
These are some timings for sr2sieve 1.4.x vs proth_sieve 0.42 done on two of my machines, both running Debian Linux. I tested at p=100e12 (100T) because the proth_sieve speed starts to drop when p becomes too much larger than this, and I think this may be a problem with the code rather than a true indication of performance. (The speed per p should increase as p increases, as there are fewer primes to test).

Times are kp/s (1000's increase in p per CPU second) to 3 s.f. where known. The hyperthreaded times were taken by running two instances of the program and adding the kp/s times for both.

Pentium 3 @ 600MHz (Coppermine EB, 16Kb L1, 256Kb L2), p=100e12

Code:

                              8k SoB.dat   19k SoB.dat   69k riesel.dat
                              ----------   -----------   --------------
    proth_sieve_cmov 0.42        151           86             31
    sr2sieve-i686 1.4.18         122           75.9           45.6
    sr2sieve-i686 1.4.21         138           81.5           47.0
    sr2sieve-i686 1.4.23         145           85.4           48.9

Code:

    Single thread             8k SoB.dat   19k SoB.dat   69k riesel.dat
    -------------             ----------   -----------   --------------
    proth_sieve_sse2 0.42        342          201             82
    sr2sieve-pentium4 1.4.18     279          177            107
    sr2sieve-pentium4 1.4.21     318          189            113
    sr2sieve-pentium4 1.4.23     328          197            116

    Two hyperthreads          8k SoB.dat   19k SoB.dat   69k riesel.dat
    ----------------          ----------   -----------   --------------
    proth_sieve_sse2 0.42        554          330            130
    sr2sieve-pentium4 1.4.18     413          262            157
    sr2sieve-pentium4 1.4.21     469          279            162
    sr2sieve-pentium4 1.4.23     488          288            167
|
#101 |
Jul 2005
2×193 Posts |
![]()
That looks great Geoff.
I hope my message didn't come across as "my sieve is faster than yours"; it certainly wasn't meant that way. If we work together and share code/results we can make each other's code even faster! |
#102 | |
Mar 2003
New Zealand
13×89 Posts |
![]() Quote:
I don't know how much of that is due to the effort to make proth_sieve run fast for SoB.dat without regard to riesel.dat speed, and how much is because of differences between the proth_sieve and sr2sieve algorithms. I suspect that sr2sieve does a lot less work in trying to eliminate candidates before running BSGS, and that may be a better approach when the range of n is small. The 20 million range of riesel.dat vs the 50 million range of SoB.dat could be the important factor, rather than the number of k in the sieve. |
|
#103 |
Mar 2003
New Zealand
2205₈ Posts |
![]()
Does anyone know how to detect the size of the L1 and L2 data caches on ppc64? Is 32KB L1, 512KB L2 a reasonable default if they can't be detected?
|
#104 |
Oct 2006
On a Suzuki Boulevard C90
2×3×41 Posts |
![]()
Geoff,
That's what it has been for every ppc64 that I've encountered. I don't know an easy way to do it for Linux; there are external tools but they can't be depended upon. For example, on my home PowerMac, /proc/cpuinfo shows the 512K unified L2 cache, but doesn't mention the L1. lshw says that the same machine has 128 terabytes of L1 and 2 petabytes of L2. However, lshw on the IBM blades shows the correct L1 & L2. |
#105 |
"Mark"
Apr 2003
Between here and the
3²·739 Posts |
![]()
For 64-bit PowerPC CPUs, 512KB is the minimum L2 cache size. Some have 1MB. I don't know if there is an easy way to determine the L2 cache size.
|
#106 |
Jul 2005
2·193 Posts |
![]()
MacOS X command line:-

    sysctl hw.l1icachesize
    sysctl hw.l1dcachesize
    sysctl hw.l2cachesize

So I'm guessing they'll be somewhere in the sysctl() function call...indeed, in /usr/include/sys/sysctl.h

    #define HW_L1ICACHESIZE 17 /* int: L1 I Cache Size in Bytes */
    #define HW_L1DCACHESIZE 18 /* int: L1 D Cache Size in Bytes */
    #define HW_L2SETTINGS   19 /* int: L2 Cache Settings */
    #define HW_L2CACHESIZE  20 /* int: L2 Cache Size in Bytes */
    #define HW_L3SETTINGS   21 /* int: L3 Cache Settings */
    #define HW_L3CACHESIZE  22 /* int: L3 Cache Size in Bytes */

Don't have any time right now to knock up an example program but the stuff on the sysctl() man page (on MacOS X) should help.

[EDIT] For my Quad G5 (2.5GHz PPC) I've got 64KB L1 instruction cache, 32KB L1 data cache and 1MB L2 Cache (per cpu).

Last fiddled with by Greenbank on 2007-03-09 at 15:20 |
#107 |
Jul 2005
2×193 Posts |
![]()
Must be compiled with -m64.

Only tested on MacOS X on 64-bit PPC, not Linux (not sure if the sysctl interface is the same).

Code:

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>
    #include <sys/types.h>
    #include <sys/sysctl.h>

    int main(void)
    {
        int64_t i;
        int ret;
        size_t len;

        /* len is an in/out parameter: reset it to the buffer size before each call */
        len = sizeof(i);
        ret = sysctlbyname("hw.l1icachesize", &i, &len, NULL, 0);
        if (ret == -1) {
            perror("sysctl");
        } else {
            printf("l1icachesize=%" PRId64 "\n", i);
        }

        len = sizeof(i);
        ret = sysctlbyname("hw.l1dcachesize", &i, &len, NULL, 0);
        if (ret == -1) {
            perror("sysctl");
        } else {
            printf("l1dcachesize=%" PRId64 "\n", i);
        }

        len = sizeof(i);
        ret = sysctlbyname("hw.l2cachesize", &i, &len, NULL, 0);
        if (ret == -1) {
            perror("sysctl");
        } else {
            printf("l2cachesize=%" PRId64 "\n", i);
        }

        return 0;
    }

Output:

    l1dcachesize=32768
    l2cachesize=1048576

which matches the real output.

Last fiddled with by Greenbank on 2007-03-09 at 15:49 |
#108 |
Mar 2003
New Zealand
13×89 Posts |
![]() |
#109 | ||
Oct 2006
On a Suzuki Boulevard C90
2×3×41 Posts |
![]() Quote:
Quote:
Here's an ugly, probably non-portable hack:

Code:

    #include <stdio.h>

    /* Read one 32-bit cell from a device-tree property file and print it.
       The cell is stored big-endian, which is the native byte order on
       big-endian ppc64, so it can be read straight into an unsigned. */
    static void print_prop(const char *path, const char *label)
    {
        unsigned data = 0;
        FILE *fp = fopen(path, "r");

        if (fp == NULL) {
            perror(path);
            return;
        }
        if (fread(&data, sizeof(data), 1, fp) == 1)
            printf("%s: %u\n", label, data);
        fclose(fp);
    }

    int main(void)
    {
        print_prop("/proc/device-tree/cpus/PowerPC,970@0/d-cache-size",
                   "d-cache-size");
        print_prop("/proc/device-tree/cpus/PowerPC,970@0/i-cache-size",
                   "i-cache-size");
        print_prop("/proc/device-tree/cpus/PowerPC,970@0/l2-cache/d-cache-size",
                   "l2-cache/d-cache-size");
        return 0;
    }
||
#110 | |
Mar 2003
New Zealand
10010000101₂ Posts |
![]() Quote:
Do you know which compiler symbols I should test to decide whether this code should be included? I assume __linux__ and __powerpc64__ and one other for the CPU type. |
|