#34
Mar 2003
New Zealand
10010000101₂ Posts
My main concern was to check that the 64-bit code works correctly, as it hasn't been tested before, so thanks for helping with that.
#35
"Jason Goatcher"
Mar 2005
3×7×167 Posts
#36
Mar 2003
New Zealand
2205₈ Posts
If the problem is missing 32-bit libraries, then I guess you will only be able to run statically linked 32-bit apps. It should be simple enough to install the 32-bit libraries though, unless you are using a live CD or something like that.
#37
"Jason Goatcher"
Mar 2005
3·7·167 Posts
I've got more than a week before gcwsieve completes the range, so not a big concern.
#38
Mar 2003
New Zealand
13·89 Posts
In version 1.0.9 the 32-bit code was actually faster than the 64-bit code. But in version 1.0.10 the 64-bit code is faster again.
A quick test run on a primegrid range with a C2D @ 2.67GHz: Code:
    version 1.0.9 64-bit: 61 kp/s
    version 1.0.11 32-bit: 83 kp/s
    version 1.0.11 64-bit: 100 kp/s
Last fiddled with by geoff on 2007-07-30 at 23:51
#39
"Mark"
Apr 2003
Between here and the
2·23·157 Posts
Version 1.0.11 on PowerPC64 (at 2.5 GHz):
457243 p/sec
Last fiddled with by rogue on 2007-08-01 at 01:17
#40
Mar 2003
New Zealand
13·89 Posts
Which sieve file was that with? If it is with the current 5.0M < n < 7.5M file for this project then that is a good time, but not too surprising.
For comparison, a 2.9GHz P4 does about 500 kp/s on that file at p=100e9.
There is room for improvement in the ppc64 code. Currently the ppc64 code uses the same method as the non-SSE2 x86 code, which processes the candidates one at a time, while the SSE2 and x86-64 code does them 4 at a time.
#41
"Mark"
Apr 2003
Between here and the
2×23×157 Posts
That was with the above file (after fixing the input). It was at p=1000e9.
Are you saying it doesn't have the improvements that were done to sr2sieve and sr5sieve?
#42
Mar 2003
New Zealand
13·89 Posts
The x86 and ppc64 main loop looks a bit like this: Code:
    for (i=0; i<n; i++)
      X[i] = X[i] * Y[i] (mod p)
      if (X[i] == Z[i])
        /* Found a factor */
Code:
    m = n/4
    for (i = 0; i < m; i++)
      X[i+0*m] = X[i+0*m] * Y[i+0*m] (mod p)
      ...
      X[i+3*m] = X[i+3*m] * Y[i+3*m] (mod p)
      if (X[i+0*m] == Z[i+0*m] || .. || X[i+3*m] == Z[i+3*m])
        /* Found a factor */
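For anyone who wants to play with the two loop shapes above, here is a minimal C sketch of them. It is only an illustration: the names `mulmod`, `sieve_step_scalar` and `sieve_step_by4` are made up for this example and are not taken from gcwsieve. The portable shift-and-add `mulmod` (which assumes p < 2^63) is a stand-in so the example compiles anywhere; it is not how the real code multiplies. The sketch also counts matches instead of reporting factors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: (a * b) mod p by shift-and-add.
   Assumes 0 < p < 2^63 so the additions below cannot overflow. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t p)
{
    uint64_t r = 0;
    a %= p;
    b %= p;
    while (b > 0) {
        if (b & 1) {
            r += a;
            if (r >= p) r -= p;
        }
        a += a;
        if (a >= p) a -= p;
        b >>= 1;
    }
    return r;
}

/* One candidate at a time: multiply, then test each result. */
size_t sieve_step_scalar(uint64_t *X, const uint64_t *Y, const uint64_t *Z,
                         size_t n, uint64_t p)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        X[i] = mulmod(X[i], Y[i], p);
        if (X[i] == Z[i])
            hits++;               /* found a factor */
    }
    return hits;
}

/* Four candidates per iteration, taken from four quarters of the array.
   Assumes n is a multiple of 4 (an assumption of this sketch). */
size_t sieve_step_by4(uint64_t *X, const uint64_t *Y, const uint64_t *Z,
                      size_t n, uint64_t p)
{
    assert(n % 4 == 0);
    size_t m = n / 4, hits = 0;
    for (size_t i = 0; i < m; i++) {
        /* Four independent multiplications with no data dependencies. */
        X[i + 0*m] = mulmod(X[i + 0*m], Y[i + 0*m], p);
        X[i + 1*m] = mulmod(X[i + 1*m], Y[i + 1*m], p);
        X[i + 2*m] = mulmod(X[i + 2*m], Y[i + 2*m], p);
        X[i + 3*m] = mulmod(X[i + 3*m], Y[i + 3*m], p);
        /* One combined test; sort out which lane matched only on a hit. */
        if (X[i + 0*m] == Z[i + 0*m] || X[i + 1*m] == Z[i + 1*m] ||
            X[i + 2*m] == Z[i + 2*m] || X[i + 3*m] == Z[i + 3*m]) {
            for (size_t k = 0; k < 4; k++)
                if (X[i + k*m] == Z[i + k*m])
                    hits++;       /* found a factor */
        }
    }
    return hits;
}
```

Both functions should leave the same values in `X` and return the same count for the same input. The likely benefit of the second shape is that the four multiplications are independent, so their latencies can overlap, and that matches are rare, so one combined branch per group is cheaper than one branch per candidate.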
#43
Mar 2003
New Zealand
13·89 Posts
Just to correct the previous post: Yes, the improvements to the ppc64 assembler are in gcwsieve 1.0.10. They may speed up some other parts of the code, but they don't help with the main loop; there is still room for improvement there.
#44
Mar 2003
New Zealand
1157₁₀ Posts
This version fixes a memory allocation bug that could cause the program to abort at the end of a sieve range, or a memory leak if there were multiple ranges queued up in the work file.
No work needs to be repeated, as all results for the range would have been written to file before the abort.
The affected builds were:
Windows: versions 1.0.0 - 1.0.10.
OS X: versions 1.0.0 - 1.0.12.
The bug didn't affect the Linux builds. Thanks rogue for finding it.