mersenneforum.org  

Old 2007-07-22, 23:21   #34
geoff
 
Mar 2003
New Zealand

13×89 Posts

Quote:
Originally Posted by jasong View Post
The 64-bit code works perfectly. When I unzipped the 32-bit version to the same directory and tried to run it, the OS claimed the file didn't exist, even though the 'ls' command listed it as being there.
OK, that is probably because you don't have 32-bit system libraries installed.

My main concern was to check that the 64-bit code works correctly, as it hasn't been tested before, so thanks for helping with that.
Old 2007-07-23, 04:50   #35
jasong
 
"Jason Goatcher"
Mar 2005

5·701 Posts

Quote:
Originally Posted by geoff View Post
OK, that is probably because you don't have 32-bit system libraries installed.

My main concern was to check that the 64-bit code works correctly, as it hasn't been tested before, so thanks for helping with that.
So, I won't be able to run ANY 32-bit apps?
Old 2007-07-24, 00:29   #36
geoff
 
Mar 2003
New Zealand

13×89 Posts

Quote:
Originally Posted by jasong View Post
So, I won't be able to run ANY 32-bit apps?
If the problem is missing 32-bit libraries, then I guess you will only be able to run statically linked 32-bit apps. It should be simple enough to install the 32-bit libraries though, unless you are using a live CD or something like that.
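A common cause of this symptom on Linux is that the dynamic loader named inside a 32-bit executable is not installed, so the kernel reports "No such file or directory" even though the binary itself is present. Statically linked binaries name no loader, which is why they still run. As a rough diagnostic, the sketch below reads the first bytes of an ELF file to tell whether it is a 32-bit or 64-bit build. It assumes an ELF executable; the function names `elf_class_bits` and `elf_class_of_file` are hypothetical and not part of gcwsieve.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Inspect the start of an ELF header.
   Returns 32 or 64 for a recognised ELF class, 0 otherwise. */
int elf_class_bits(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 5)
        return 0;                       /* too short to hold EI_CLASS */
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return 0;                       /* not an ELF file */
    if (hdr[4] == 1)
        return 32;                      /* ELFCLASS32 */
    if (hdr[4] == 2)
        return 64;                      /* ELFCLASS64 */
    return 0;
}

/* Convenience wrapper: returns -1 if the file cannot be opened,
   otherwise the result of elf_class_bits() on its first bytes. */
int elf_class_of_file(const char *path)
{
    unsigned char hdr[5];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(hdr, 1, sizeof hdr, f);
    fclose(f);
    return elf_class_bits(hdr, got);
}
```

If `elf_class_of_file()` returns 32 for a binary that the shell claims does not exist, a missing 32-bit loader or library set is the likely explanation.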
Old 2007-07-24, 03:03   #37
jasong
 
"Jason Goatcher"
Mar 2005

5·701 Posts

Quote:
Originally Posted by geoff View Post
If the problem is missing 32-bit libraries, then I guess you will only be able to run statically linked 32-bit apps. It should be simple enough to install the 32-bit libraries though, unless you are using a live CD or something like that.
I've got more than a week before gcwsieve completes the range, so not a big concern.
Old 2007-07-30, 23:49   #38
geoff
 
Mar 2003
New Zealand

13×89 Posts

In version 1.0.9 the 32-bit code was actually faster than the 64-bit code. But in version 1.0.10 the 64-bit code is faster again.

A quick test run on a primegrid range with a C2D @ 2.67GHz:
Code:
version 1.0.9 64-bit:    61 kp/s
version 1.0.11 32-bit:   83 kp/s
version 1.0.11 64-bit:  100 kp/s
edit: from the Cullen 2M sieve, p=1000e9

Last fiddled with by geoff on 2007-07-30 at 23:51
Old 2007-08-01, 01:16   #39
rogue
 
"Mark"
Apr 2003
Between here and the

2·3·937 Posts

Version 1.0.11 on PowerPC64 (at 2.5 GHz):

457243 p/sec


Last fiddled with by rogue on 2007-08-01 at 01:17
Old 2007-08-01, 02:01   #40
geoff
 
Mar 2003
New Zealand

13×89 Posts

Quote:
Originally Posted by rogue View Post
Version 1.0.11 on PowerPC64 (at 2.5 GHz):

457243 p/sec

Which sieve file was that with? If it is with the current 5.0M < n < 7.5M file for this project then that is a good time, but not too surprising.

For comparison a 2.9GHz P4 does about 500 kp/s on that file at p=100e9.

There is room for improvement in the ppc64 code. Currently it uses the same method as the non-SSE2 x86 code, which processes the candidates one at a time, while the SSE2 and x86-64 code processes them four at a time.
Old 2007-08-01, 02:32   #41
rogue
 
"Mark"
Apr 2003
Between here and the

15F6₁₆ Posts

That was with the above file (after fixing the input). It was at p=1000e9.

Are you saying it doesn't have the improvements that were done to sr2sieve and sr5sieve?
Old 2007-08-01, 03:36   #42
geoff
 
Mar 2003
New Zealand

485₁₆ Posts

Quote:
Originally Posted by rogue View Post
That was with the above file (after fixing the input). It was at p=1000e9.

Are you saying it doesn't have the improvements that were done to sr2sieve and sr5sieve?
No, the main loop in gcwsieve doesn't benefit from those improvements because each new computation a*b (mod p) has new values of a and b.

The x86 and ppc64 main loop looks a bit like this:
Code:
for (i=0; i<n; i++)
  X[i] = X[i] * Y[i] (mod p)
  if (X[i] == Z[i])
    /* Found a factor */
With the SSE2 and (from 1.0.10) the x86-64 versions it is vectorised a bit like this:
Code:
m = n/4
for (i = 0; i < m; i++)
  X[i+0*m] = X[i+0*m] * Y[i+0*m] (mod p)
  ...
  X[i+3*m] = X[i+3*m] * Y[i+3*m] (mod p)
  if (X[i+0*m] == Z[i+0*m] || .. || X[i+3*m] == Z[i+3*m])
    /* Found a factor */
The vectorisation can't be done automatically by the C compiler because the initial values of X[0], X[m], X[2*m], X[3*m] can't be inferred from the original loop. (They are computed separately with powmod.)
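The two loop shapes above can be written out as plain C to show that they visit the same elements in a different order. This is only a sketch of the pseudocode as posted, not the gcwsieve source: `mulmod`, `sieve_scalar` and `sieve_by4` are hypothetical names, the modulus is assumed to be below 2^32 so that a product fits in 64 bits, and the four "lanes" are ordinary scalar operations standing in for what SSE2 or x86-64 assembler would do together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a*b mod p.
   Assumes a, b < p < 2^32, so a*b cannot overflow 64 bits. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t p)
{
    return (a * b) % p;
}

/* One candidate per iteration. Returns the number of matches. */
size_t sieve_scalar(uint64_t *X, const uint64_t *Y, const uint64_t *Z,
                    size_t n, uint64_t p)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        X[i] = mulmod(X[i], Y[i], p);
        if (X[i] == Z[i])
            hits++;                     /* found a factor */
    }
    return hits;
}

/* Four candidates per iteration, taken from four chunks of length m.
   Assumes n is a multiple of 4. */
size_t sieve_by4(uint64_t *X, const uint64_t *Y, const uint64_t *Z,
                 size_t n, uint64_t p)
{
    assert(n % 4 == 0);
    size_t m = n / 4, hits = 0;
    for (size_t i = 0; i < m; i++) {
        size_t k0 = i, k1 = i + m, k2 = i + 2 * m, k3 = i + 3 * m;
        /* Four independent multiplications: no lane reads another
           lane's result, so they can be issued together. */
        X[k0] = mulmod(X[k0], Y[k0], p);
        X[k1] = mulmod(X[k1], Y[k1], p);
        X[k2] = mulmod(X[k2], Y[k2], p);
        X[k3] = mulmod(X[k3], Y[k3], p);
        /* One combined test, then sort out which lanes matched. */
        if (X[k0] == Z[k0] || X[k1] == Z[k1] ||
            X[k2] == Z[k2] || X[k3] == Z[k3]) {
            hits += (X[k0] == Z[k0]) + (X[k1] == Z[k1])
                  + (X[k2] == Z[k2]) + (X[k3] == Z[k3]);
        }
    }
    return hits;
}
```

Given the same inputs, both functions leave X in the same state and report the same number of matches. The single combined comparison keeps the common no-factor case down to one branch per four candidates.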
Old 2007-08-01, 04:23   #43
geoff
 
Mar 2003
New Zealand

13·89 Posts

Just to correct the previous post: yes, the improvements to the ppc64 assembler are in gcwsieve 1.0.10. They may speed up some other parts of the code, but they don't help with the main loop, so there is still room for improvement there.
Old 2007-08-03, 01:24   #44
geoff
 
Mar 2003
New Zealand

13×89 Posts
gcwsieve 1.0.13

This version fixes a memory allocation bug that could cause the program to abort at the end of a sieve range, or to leak memory if multiple ranges were queued up in the work file.

No work needs to be repeated, as all results for the range would have been written to file before the abort. The affected builds were:

Windows: versions 1.0.0 - 1.0.10.
OS X: versions 1.0.0 - 1.0.12.

The bug didn't affect the Linux builds. Thanks rogue for finding it.
 


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.