mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   Early Beta of version 24.11 (https://www.mersenneforum.org/showthread.php?t=3934)

Prime95 2005-03-30 23:21

Early Beta of version 24.11
 
Version 24.11 has two major improvements.

1) There is a 64-bit version for 64-bit Windows. It contains all-new, faster factoring code.

2) An Athlon64 optimization was found for both the 32-bit and 64-bit versions of prime95. You'll get about a 15% performance boost. Still not as fast as a similarly clocked P4, but it is much closer.

Only AMD64 machines should try this version. It is not well QA'ed. Save your work before installing, just in case there is a problem.

You can download from:

Windows: [url]ftp://mersenne.org/gimps/p95v2411.zip[/url]
Windows 64-bit: [url]ftp://mersenne.org/gimps/p64v2411.zip[/url]
Windows NT service: [url]ftp://mersenne.org/gimps/winnt2411.zip[/url]
Linux: [url]ftp://mersenne.org/gimps/mprime2411.tar.gz[/url]
Linux (static link): [url]ftp://mersenne.org/gimps/sprime2411.tar.gz[/url]

Let me know if you find any problems.

Prime95 2005-03-30 23:23

The source is at [url]ftp://mersenne.org/gimps/source24.zip[/url]

If any AMD64 gurus want to try and improve the factoring code in factor64.asm, I'm more than willing to incorporate any improvements.

Peter Nelson 2005-03-31 07:16

Excellent!
 
Excellent to see this. AMD users will be delighted to finally get their hands on it.

A few thoughts occurred to me:

a) Will you be building a [B]64-bit version for Linux[/B] users (e.g. SUSE 9.2 on AMD64 and other 64-bit distros)?

b) The Pentium 6xx series now has Intel EM64T, which has almost the same instruction set as the Athlon64, so [B]might the code run on the 6xx too?[/B]
I appreciate that the architectures differ (e.g. in cache) and that very few of these chips are in the field yet, but it might be forward-thinking to support Intel's 64-bit efforts too.

c) Will you be able to maintain a single source tree, or will you need different versions for Intel/AMD and 32/64 bits? The target architecture (32/64) could be specified at compile time, with other differences handled by CPU detection, as in the existing code. The range of CPU types is likely to grow further, e.g. dual-core chips where two processors share a memory subsystem, which will slow memory accesses outside the L2 cache. This may call for different optimisations.

d) Would it be possible to include a short test of trial factoring speed in the benchmark (of the release version)? It would be very useful for measuring and comparing the benefit of your optimisations. Maybe it could appear only when the Fullbench option is specified.
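
For readers unfamiliar with the term, here is a minimal sketch of what a timed trial factoring test could look like. It is not Prime95's factoring code (that is hand-written assembly); it only relies on the standard facts that any prime factor q of 2^p - 1, with p an odd prime, has the form 2kp + 1 and is congruent to 1 or 7 mod 8. The function names and the max_k cutoff are hypothetical, chosen for illustration.

```python
import time


def trial_factor(p, max_k):
    """Return the smallest q = 2*k*p + 1 (k <= max_k) dividing 2**p - 1, or None.

    Assumes p is an odd prime. Illustrative only, not Prime95's algorithm.
    """
    for k in range(1, max_k + 1):
        q = 2 * k * p + 1
        # Factors of a Mersenne number with odd prime exponent are 1 or 7 mod 8.
        if q % 8 not in (1, 7):
            continue
        # q divides 2**p - 1 exactly when 2**p is congruent to 1 modulo q.
        if pow(2, p, q) == 1:
            return q
    return None


def timed_trial_factor(p, max_k):
    """Run trial_factor and return (factor_or_None, elapsed_seconds)."""
    start = time.perf_counter()
    factor = trial_factor(p, max_k)
    return factor, time.perf_counter() - start


if __name__ == "__main__":
    # 2**11 - 1 = 2047 = 23 * 89, so the first hit is q = 23 (k = 1).
    print(timed_trial_factor(11, 10))
```

A benchmark entry along these lines would report the elapsed time for a fixed exponent and a fixed range of k, so that two builds can be compared on identical work.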

sonjohan 2005-03-31 11:17

It is likely to be a silly question, but how do I know whether I have 64-bit Windows or not?

dsouza123 2005-03-31 13:30

Unless you downloaded Windows XP Pro 64 or got it through MSDN, you don't have it. Until now it has been a beta release from Microsoft, and it hasn't shipped on retail PCs yet.

It went gold today and is shipping to manufacturing, so it will probably be on shipping PCs in the later part of April.

Prime95 2005-03-31 14:47

a) I have no immediate plans to try a 64-bit Linux port. I'm not sure if objcopy can convert the MASM object files into ELF64 format.

b) The code should run on Intel 6xx machines too.

c) There is just one source tree. Right now there are 4 different versions of the FFT:

x87 - optimized for Pentium Pro, runs on any x87 machine
x87 - optimized for Athlons (also used by P3s and later)
SSE2 - P4 optimized
SSE2 - AMD64 optimized

d) Maybe
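
As an illustration of answer c), the following is a rough sketch of how one source tree might pick among the four FFT versions listed above at run time. The selection rules are an assumption inferred from the list, not taken from the Prime95 source, and the names FftCode and choose_fft are hypothetical. As a simplification, every AMD chip without SSE2 is treated as an Athlon.

```python
from enum import Enum


class FftCode(Enum):
    X87_PPRO = "x87, optimized for Pentium Pro"
    X87_ATHLON = "x87, optimized for Athlons"
    SSE2_P4 = "SSE2, P4 optimized"
    SSE2_AMD64 = "SSE2, AMD64 optimized"


def choose_fft(vendor, has_sse2, p3_or_later=False):
    """Pick an FFT version from detected CPU properties (assumed rules).

    vendor: "AMD" or "Intel"
    has_sse2: whether the CPU reports SSE2 support
    p3_or_later: for non-SSE2 Intel chips, True for a Pentium III or newer
    """
    if has_sse2:
        # Two SSE2 code paths: one tuned for AMD64, one for the P4.
        return FftCode.SSE2_AMD64 if vendor == "AMD" else FftCode.SSE2_P4
    if vendor == "AMD" or p3_or_later:
        # The Athlon-tuned x87 code is also used by P3s and later.
        return FftCode.X87_ATHLON
    # Fallback that runs on any x87 machine.
    return FftCode.X87_PPRO


if __name__ == "__main__":
    print(choose_fft("AMD", has_sse2=True).value)
    print(choose_fft("Intel", has_sse2=False).value)
```

The point of the sketch is that the 32/64-bit choice can stay a build-time decision while the per-CPU tuning is resolved by detection when the program starts.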

penguin22 2005-04-01 17:39

I have been using version 24.6 with the CPUSupportsSSE2=0 option and was wondering if it would be better to delete that line now that this version is out and has support for the features in the A64?

Thanks for your hard work.

Jeff Gilchrist 2005-04-01 17:51

Can anyone with A64/Opteron boxes post some benchmarks with 23.x, 24.6, and 24.11 comparisons please?

Prime95 2005-04-01 20:20

[QUOTE=penguin22]I have been using version 24.6 with the CPUSupportsSSE2=0 option and was wondering if it would be better to delete that line now that this version is out and has support for the features in the A64?[/QUOTE]

Time it both ways. I'm betting the SSE2 code is now faster for the same FFT size.
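
A small sketch of the comparison suggested here, assuming you have noted the per-iteration time once with CPUSupportsSSE2=0 and once without that line, at the same FFT size. The function name is hypothetical and the numbers in the example are placeholders, not measurements.

```python
def compare_timings(ms_sse2, ms_x87):
    """Compare per-iteration times (milliseconds) at the same FFT size.

    Returns (faster_code, percent_faster), where percent_faster is how much
    more work per unit time the faster code does. Ties report 0.0.
    """
    if ms_sse2 <= 0 or ms_x87 <= 0:
        raise ValueError("timings must be positive")
    if ms_sse2 <= ms_x87:
        return "SSE2", (ms_x87 / ms_sse2 - 1.0) * 100.0
    return "x87", (ms_sse2 / ms_x87 - 1.0) * 100.0


if __name__ == "__main__":
    # Placeholder values only; substitute your own measured timings.
    print(compare_timings(ms_sse2=40.0, ms_x87=50.0))
```

Comparing at the same FFT size matters because the two code paths may switch FFT sizes at different exponents, which would otherwise skew the result.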

TheJudger 2005-04-01 20:59

[QUOTE=Prime95]a) I have no immediate plans to try a 64-bit Linux port. I'm not sure if objcopy can convert the MASM object files into ELF64 format.
[/QUOTE]

very sad to hear :(

is it possible for you to make a binary with another assembler for non-windoze OSes (even if it's a bit slower)?

thejudger

Prime95 2005-04-01 21:21

[QUOTE=TheJudger]is it possible for you to make a binary with another assembler for non-windoze OSes (even if it's a bit slower)?[/QUOTE]

No. Converting all that assembly code to another assembler format would be a monumental task.

I'm sure the binutils guys will make objcopy work eventually, if they haven't done so already. The source is available if someone wants to try a 64-bit Linux port.

My next task is more optimizations, especially making use of the extra SSE2 registers in 64-bit mode. Don't expect much - a few percent on AMD64, perhaps a little more on the P4.


All times are UTC.