#1 |
P90 years forever!
Aug 2002
Yeehaw, FL
2×4,127 Posts |
As some of you know, I've been rewriting the FFT code to bring big speed gains for SoB, LLR, OpenPFGW, and other projects. This rewrite is now complete.
The good news: Athlons (except 64-bit CPUs) are about 15% faster. The bad news: P3s are 33% slower.
Athlon owners running Windows may want to try the new version at ftp://mersenne.org/gimps/p95v246.zip
Let me know if you find any bugs. Running a doublecheck or two would be nice. In the meantime, I'll work on further fine tuning the new FFT code and see if I can recover some of the loss in P3 timings.
Last fiddled with by Prime95 on 2004-12-08 at 00:26 |
#2 |
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
2·3·11·167 Posts |
I went from about 0.156 to ~0.114 on a 13,xxx,xxx number on my 1.2 GHz Athlon.
THANKS!! |
#3 |
Aug 2002
North San Diego Coun
19×43 Posts |
Neat!
I'll switch my Athlon 1900MP (2x 1.6GHz) box over to double checking as soon as the current factoring assignment finishes. Probably take a little less than a month for the first results.
Given the various sizes of L1/L2 cache in the Athlon/Duron/Sempron processors, is the new code optimized for one version in particular? Would you like benchmarks posted?
I'm sure SalemTheCat100 will be pleased.
-Scott- |
#4 |
Jul 2004
Nowhere
809 Posts |
Why don't you do the best of both worlds? You already have good speed for Pentiums, so why not first check which CPU is installed and then use the right tweaking? Kind of like a driver, only for the FFT: the right driver matches the right processor.
|
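A rough sketch of what that kind of per-CPU selection could look like, written in Python only for readability. Everything here is an assumption for illustration: the function names and the kernel labels (`sse2`, `x87_athlon`, `x87_pentium`) are hypothetical and are not taken from Prime95, and the thread does not say how the program actually chooses its code path. The feature line format is the one shown in the benchmark output posted in this thread.

```python
# Hypothetical sketch: choose an FFT code path from CPU identification strings.
# The kernel labels are invented for illustration; they are not Prime95 names.

def parse_features(line: str) -> set[str]:
    """Parse a 'CPU features: A, B, C' line into a set of feature names."""
    _, _, names = line.partition(":")
    return {n.strip() for n in names.split(",") if n.strip()}


def pick_fft_kernel(brand: str, features: set[str]) -> str:
    """Return the label of the FFT code path to use for this CPU."""
    if "SSE2" in features:
        return "sse2"            # CPUs reporting SSE2 get their own path
    if any(name in brand for name in ("Athlon", "Duron", "Sempron")):
        return "x87_athlon"      # assumed Athlon-family tuned path
    return "x87_pentium"         # assumed fallback for P3-class CPUs


if __name__ == "__main__":
    feats = parse_features("CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE")
    print(pick_fft_kernel("AMD Athlon(tm) XP 2400+", feats))
```

The check is a plain ordered chain of tests, so adding another processor family would mean adding one more branch rather than touching the FFT routines themselves.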
#5 |
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
2·3·11·167 Posts |
I am already doing DCs with my Athlon. Here are the benchmarks in "non-safe mode" (i.e., how I typically run my machine).
The new:
Code:
AMD Athlon(tm) processor
CPU speed: 1127.85 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 24
L2 TLBS: 256
Prime95 version 24.6, RdtscTiming=1
Best time for 512K FFT length: 65.019 ms.
Best time for 640K FFT length: 87.656 ms.
Best time for 768K FFT length: 105.915 ms.
Best time for 896K FFT length: 129.876 ms.
Best time for 1024K FFT length: 145.119 ms.
Best time for 1280K FFT length: 195.657 ms.
Best time for 1536K FFT length: 235.093 ms.
Best time for 1792K FFT length: 280.729 ms.
Best time for 2048K FFT length: 321.764 ms.
Code:
AMD Athlon(tm) processor
CPU speed: 1127.80 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 24
L2 TLBS: 256
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 78.067 ms.
Best time for 448K FFT length: 90.489 ms.
Best time for 512K FFT length: 97.280 ms.
Best time for 640K FFT length: 123.678 ms.
Best time for 768K FFT length: 149.556 ms.
Best time for 896K FFT length: 175.282 ms.
Best time for 1024K FFT length: 200.824 ms.
Best time for 1280K FFT length: 271.674 ms.
Best time for 1536K FFT length: 328.457 ms.
Best time for 1792K FFT length: 399.580 ms.
Best time for 2048K FFT length: 477.366 ms.
|
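For anyone who wants to compare two benchmark listings like the ones above without doing the arithmetic by hand, here is a minimal sketch. The timings are copied from the 24.6 and 23.5 listings in this post; the dictionary and function names are made up for the example, and "time saved" is just one reasonable way to express the difference (an assumption, since the thread does not define how its percentages are measured).

```python
# Timings in ms, copied from the two benchmark listings above (1.13 GHz Athlon).
NEW_24_6 = {"512K": 65.019, "640K": 87.656, "768K": 105.915, "896K": 129.876,
            "1024K": 145.119, "1280K": 195.657, "1536K": 235.093,
            "1792K": 280.729, "2048K": 321.764}
OLD_23_5 = {"384K": 78.067, "448K": 90.489, "512K": 97.280, "640K": 123.678,
            "768K": 149.556, "896K": 175.282, "1024K": 200.824,
            "1280K": 271.674, "1536K": 328.457, "1792K": 399.580,
            "2048K": 477.366}


def percent_time_saved(old_ms: float, new_ms: float) -> float:
    """Percentage reduction in time going from old_ms to new_ms."""
    return 100.0 * (old_ms - new_ms) / old_ms


def compare(old: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """Compare only the FFT lengths that appear in both listings."""
    return {size: percent_time_saved(old[size], new[size])
            for size in new if size in old}


if __name__ == "__main__":
    for size, saved in compare(OLD_23_5, NEW_24_6).items():
        print(f"{size:>6} FFT: {saved:4.1f}% less time")
```

Only the FFT lengths present in both listings are compared, since the 24.6 run here starts at 512K while the 23.5 run starts at 384K.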
#6 |
Jun 2004
Chicago
2²·7 Posts |
My Athlon XP 2400+ seems to be working well with it: times are down, productivity is up... excellent work.
|
#7 |
Jun 2004
Chicago
1C₁₆ Posts |
The new:
Code:
AMD Athlon(tm) XP 2400+
CPU speed: 1991.65 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 version 24.6, RdtscTiming=1
Best time for 512K FFT length: 36.591 ms.
Best time for 640K FFT length: 47.987 ms.
Best time for 768K FFT length: 58.558 ms.
Best time for 896K FFT length: 69.761 ms.
Best time for 1024K FFT length: 78.666 ms.
Best time for 1280K FFT length: 109.193 ms.
Best time for 1536K FFT length: 132.532 ms.
Best time for 1792K FFT length: 157.138 ms.
Best time for 2048K FFT length: 176.963 ms.
Code:
AMD Athlon(tm) XP 2400+
CPU speed: 1991.13 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 version 23.8, RdtscTiming=1
Best time for 384K FFT length: 46.405 ms.
Best time for 448K FFT length: 57.387 ms.
Best time for 512K FFT length: 59.590 ms.
Best time for 640K FFT length: 77.468 ms.
Best time for 768K FFT length: 91.095 ms.
Best time for 896K FFT length: 108.234 ms.
Best time for 1024K FFT length: 121.520 ms.
Best time for 1280K FFT length: 165.685 ms.
Best time for 1536K FFT length: 190.610 ms.
Best time for 1792K FFT length: 242.729 ms.
Best time for 2048K FFT length: 273.883 ms.
Last fiddled with by jebeagles on 2004-12-08 at 16:33 |
#8 |
Dec 2003
Paisley Park & Neverland
B9₁₆ Posts |
I don't know if this has anything to do with the new version. It's not harmful either, just unexpected:
I only did LMH Factoring lately, so I downloaded 24.6 and requested some doublechecks because my queue was empty. Now: All of them were expected to be completed by tomorrow. So they kept coming in and in and in... till I clicked Stop! Why were they expected to complete immediately? |
#9 |
Aug 2002
3×43×67 Posts |
Will there be an Mprime version?
|
#10 | |
"Juan Tutors"
Mar 2004
571 Posts |
![]() Quote:
![]() |
|
#11 |
"Nancy"
Aug 2002
Alexandria
2,467 Posts |
Dear George,
does this mean you implemented Colin's general DWT for non-SSE2 architectures as well? Alex |