mersenneforum.org  

Old 2007-02-17, 02:26   #100
geoff
 

Quote:
Originally Posted by Greenbank
And the program is not optimised for Riesel at all. I wanted Sierpinski sieving to be as fast as possible. Sorting it out properly for Riesel is on my big list of stuff to do.

I think part of the reason that sr2sieve does well with riesel.dat (I hear that it is even faster at riesel.dat than JJsieve on x86) is not so much the large number of k as the narrower range of n. Perhaps because sr2sieve puts less effort into reducing the work done in BSGS, it becomes slower than proth_sieve as the range of n widens and BSGS becomes more expensive.

These are some timings for sr2sieve 1.4.x vs proth_sieve 0.42 done on two of my machines, both running Debian Linux.

I tested at p=100e12 (100T) because the proth_sieve speed starts to drop when p becomes too much larger than this, and I think this may be a problem with the code rather than a true indication of performance. (The speed per p should increase as p increases, as there are fewer primes to test).

Speeds are in kp/s (thousands of increase in p per CPU second), to 3 significant figures where known. The hyperthreaded figures were taken by running two instances of the program and adding the kp/s of both.


Pentium 3 @ 600MHz (Coppermine EB, 16Kb L1, 256Kb L2), p=100e12
Code:
                             8k SoB.dat    19k SoB.dat    69k riesel.dat
                             ----------    -----------    --------------
proth_sieve_cmov 0.42        151           86             31
sr2sieve-i686 1.4.18         122           75.9           45.6
sr2sieve-i686 1.4.21         138           81.5           47.0
sr2sieve-i686 1.4.23         145           85.4           48.9

Pentium 4 @ 2.9GHz (Northwood C, 8Kb L1, 512Kb L2), p=100e12
Code:
Single thread                8k SoB.dat    19k SoB.dat    69k riesel.dat
-------------                ----------    -----------    --------------
proth_sieve_sse2 0.42        342           201            82
sr2sieve-pentium4 1.4.18     279           177            107
sr2sieve-pentium4 1.4.21     318           189            113
sr2sieve-pentium4 1.4.23     328           197            116

Two hyperthreads             8k SoB.dat    19k SoB.dat    69k riesel.dat
----------------             ----------    -----------    --------------
proth_sieve_sse2 0.42        554           330            130
sr2sieve-pentium4 1.4.18     413           262            157
sr2sieve-pentium4 1.4.21     469           279            162
sr2sieve-pentium4 1.4.23     488           288            167

Old 2007-02-20, 12:00   #101
Greenbank

That looks great Geoff.

I hope my message didn't come over as "my sieve is faster than yours", it certainly wasn't meant that way.

If we work together and share code/results we can make each other's code even faster!

Old 2007-02-21, 03:58   #102
geoff

Quote:
Originally Posted by Greenbank
I hope my message didn't come over as "my sieve is faster than yours", it certainly wasn't meant that way.

Not at all :-) I just found it interesting that the SoB.dat times could be so much faster than the riesel.dat times, when for sr2sieve it is the other way around.

I don't know how much of that is due to the effort to make proth_sieve run fast for SoB.dat without regard to riesel.dat speed, and how much is because of differences between the proth_sieve and sr2sieve algorithms.

I suspect that sr2sieve does a lot less work in trying to eliminate candidates before running BSGS, and that may be a better approach when the range of n is small. The 20 million range of riesel.dat vs the 50 million range of SoB.dat could be the important factor, rather than the number of k in the sieve.

Old 2007-03-09, 00:07   #103
geoff

Does anyone know how to detect the size of the L1 and L2 data cache on ppc64? Is 32Kb L1, 512Kb L2 a reasonable default if it can't be detected?

Old 2007-03-09, 03:27   #104
BlisteringSheep

Geoff,
That's what it has been for every ppc64 that I've encountered. I don't know an easy way to do it for Linux; there are external tools but they can't be depended upon. For example, on my home PowerMac, /proc/cpuinfo shows the 512K unified L2 cache, but doesn't mention the L1. lshw says that the same machine has 128 terabytes of L1 and 2 petabytes of L2. However, lshw on the IBM blades shows the correct L1 & L2.

Old 2007-03-09, 13:21   #105
rogue

Quote:
Originally Posted by geoff
Does anyone know how to detect the size of the L1 and L2 data cache on ppc64? Is 32Kb L1, 512Kb L2 a reasonable default if it can't be detected?

For 64-bit PowerPC CPUs, 512Kb is the minimum L2 cache size. Some have 1Mb. I don't know if there is an easy way to determine the L2 cache size.

Old 2007-03-09, 15:19   #106
Greenbank

MacOS X command line:-

sysctl hw.l1icachesize
sysctl hw.l1dcachesize
sysctl hw.l2cachesize

So I'm guessing there'll be something equivalent for the sysctl() function call... indeed, in /usr/include/sys/sysctl.h:

#define HW_L1ICACHESIZE 17 /* int: L1 I Cache Size in Bytes */
#define HW_L1DCACHESIZE 18 /* int: L1 D Cache Size in Bytes */
#define HW_L2SETTINGS 19 /* int: L2 Cache Settings */
#define HW_L2CACHESIZE 20 /* int: L2 Cache Size in Bytes */
#define HW_L3SETTINGS 21 /* int: L3 Cache Settings */
#define HW_L3CACHESIZE 22 /* int: L3 Cache Size in Bytes */

Don't have any time right now to knock up an example program but the stuff on the sysctl() man page (on MacOS X) should help.

[EDIT] For my Quad G5 (2.5GHz PPC) I've got 64KB L1 instruction cache, 32KB L1 data cache and 1MB L2 cache (per CPU).

Last fiddled with by Greenbank on 2007-03-09 at 15:20

Old 2007-03-09, 15:37   #107
Greenbank

Must be compiled with -m64

Only tested on MacOS X on 64-bit PPC, not Linux (not sure if the sysctl interface is the same).
Code:
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysctl.h>

/* Query one integer sysctl by name and print it, or report the error. */
static void print_cache_size(const char *name)
{
        int64_t value = 0;
        size_t len = sizeof(value);     /* reset per call: sysctlbyname() updates it */

        if( sysctlbyname( name, &value, &len, NULL, 0 ) == -1 ) {
                perror( name );
        } else {
                /* name + 3 skips the "hw." prefix; PRId64 matches int64_t */
                printf( "%s=%" PRId64 "\n", name + 3, value );
        }
}

int main(void)
{
        print_cache_size( "hw.l1icachesize" );
        print_cache_size( "hw.l1dcachesize" );
        print_cache_size( "hw.l2cachesize" );
        return(0);
}

l1icachesize=65536
l1dcachesize=32768
l2cachesize=1048576

which matches the real output.

Last fiddled with by Greenbank on 2007-03-09 at 15:49

Old 2007-03-09, 22:05   #108
geoff

Quote:
Originally Posted by Greenbank
Only tested on MacOS X on 64-bit PPC, not Linux (not sure if the sysctl interface is the same).

Thanks, I'll use this in the next version, with a 32Kb/512Kb default if sysctl fails or the detected value doesn't make sense.

Old 2007-03-10, 04:47   #109
BlisteringSheep

Quote:
Originally Posted by Greenbank
Only tested on MacOS X on 64-bit PPC, not Linux (not sure if the sysctl interface is the same).

Unfortunately, the interfaces are very different, and the Linux version provides completely different information.

Quote:
Originally Posted by geoff
Thanks, I'll use this in the next version, with a 32Kb/512Kb default if sysctl fails or the detected value doesn't make sense.

As an aside, I did figure out what's wrong with lshw; it reads the sizes into an unsigned long, but they're only int long.

Here's an ugly, probably non-portable hack:
Code:
#include <stdio.h>

/* Read one 32-bit cell from a device-tree property file and print it.
   The cell is read in native byte order, which is fine on big-endian ppc64. */
static void print_cell(const char *label, const char *path)
{
  FILE *fp;
  unsigned data = 0;

  fp = fopen(path, "rb");
  if (fp == NULL) {
    perror(path);
    return;
  }
  if (fread(&data, sizeof(data), 1, fp) == 1)
    printf("%s: %u\n", label, data);
  else
    fprintf(stderr, "%s: short read\n", path);
  fclose(fp);
}

int main(void)
{
  /* The PowerPC,970@0 node name is specific to this CPU. */
  print_cell("d-cache-size",
             "/proc/device-tree/cpus/PowerPC,970@0/d-cache-size");
  print_cell("i-cache-size",
             "/proc/device-tree/cpus/PowerPC,970@0/i-cache-size");
  print_cell("l2-cache/d-cache-size",
             "/proc/device-tree/cpus/PowerPC,970@0/l2-cache/d-cache-size");

  return(0);
}

Old 2007-03-13, 03:29   #110
geoff

Quote:
Originally Posted by BlisteringSheep
fp = fopen("/proc/device-tree/cpus/PowerPC,970@0/d-cache-size","r");
fread(&data, sizeof(data), 1, fp);

Are the sizes in bytes or kilobytes?

Do you know which compiler symbols I should test to decide whether this code should be included? I assume __linux__ and __powerpc64__ and one other for the CPU type.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
