mersenneforum.org  

2004-02-21, 02:23   #1
Digital Concepts
Aug 2002
2×3³ Posts

Any changes planned for the larger Prescott cache?

Over in the hardware forum I found two threads talking about cache: "Larger Prescott Cache = Speed Improvement?" and "P4 Prescott - 31 Stage Pipeline ? Bad news for Prime95?".

But the question still remains: will modifying the client to use the larger cache that Prescott provides yield much of a gain? Without such a change, I'm not seeing much advantage in upgrading (if anything, it is a bit worse).

Notice below, the current client doesn't properly recognize the cache size for the Prescott!

3.361 GHz P4E (Prescott) vs 3.000 GHz P4C

Intel(R) Pentium(R) 4 CPU 2.80GHz
CPU speed: 3361.04 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: unknown
L2 cache size: 1024 KB
L1 cache line size: unknown
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.8, RdtscTiming=1
Best time for 384K FFT length: 12.593 ms.
Best time for 448K FFT length: 15.319 ms.
Best time for 512K FFT length: 17.278 ms.
Best time for 640K FFT length: 20.659 ms.
Best time for 768K FFT length: 24.788 ms.
Best time for 896K FFT length: 29.935 ms.
Best time for 1024K FFT length: 33.392 ms.
Best time for 1280K FFT length: 44.390 ms.
Best time for 1536K FFT length: 53.239 ms.
Best time for 1792K FFT length: 63.824 ms.
Best time for 2048K FFT length: 71.640 ms.

Intel(R) Pentium(R) 4 CPU 2.40GHz
CPU speed: 3000.23 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.7, RdtscTiming=1
Best time for 384K FFT length: 12.224 ms.
Best time for 448K FFT length: 14.527 ms.
Best time for 512K FFT length: 16.582 ms.
Best time for 640K FFT length: 19.824 ms.
Best time for 768K FFT length: 24.127 ms.
Best time for 896K FFT length: 28.491 ms.
Best time for 1024K FFT length: 32.009 ms.
Best time for 1280K FFT length: 42.270 ms.
Best time for 1536K FFT length: 51.663 ms.
Best time for 1792K FFT length: 61.227 ms.
Best time for 2048K FFT length: 69.714 ms.

2004-02-21, 03:07   #2
ColdFury
Aug 2002
5008 Posts

Quote:
Notice below, the current client doesn't properly recognize the cache size for the Prescott!
I believe this is because Prime95 just looks up the processor type and retrieves all the cache info from a table. Since it doesn't recognize the processor, it doesn't have a table entry for it.

If it's defaulting to a code path that's non-optimal, that could be a cause for some lack of performance.
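
For illustration only, here is a minimal sketch of how a lookup table with a missing entry could produce output like the one posted above (L2 recognized, L1 reported as unknown). The descriptor names, the table contents and the `describe_caches` helper are all hypothetical; this is not Prime95's actual code, just the general idea of "no table entry, no cache info".

```python
from typing import Dict, List, Tuple

# HYPOTHETICAL table: descriptor name -> (cache level, size in KB, line size in bytes).
# The names are invented for this sketch; the sizes mirror the values printed
# in the two benchmark outputs in this thread.
CACHE_TABLE: Dict[str, Tuple[int, int, int]] = {
    "L1D_8K": (1, 8, 64),
    "L2_512K": (2, 512, 128),
    "L2_1M": (2, 1024, 128),
}


def describe_caches(descriptors: List[str]) -> Dict[str, str]:
    """Build a report; anything without a table entry stays 'unknown'."""
    report = {
        "L1 cache size": "unknown",
        "L2 cache size": "unknown",
        "L1 cache line size": "unknown",
        "L2 cache line size": "unknown",
    }
    for name in descriptors:
        entry = CACHE_TABLE.get(name)
        if entry is None:
            # Unrecognized descriptor: nothing to fill in for that cache level.
            continue
        level, size_kb, line_bytes = entry
        report[f"L{level} cache size"] = f"{size_kb} KB"
        report[f"L{level} cache line size"] = f"{line_bytes} bytes"
    return report


# An older CPU whose descriptors are all in the table.
older = describe_caches(["L1D_8K", "L2_512K"])
# A newer CPU reporting an L1 descriptor the table has never seen.
newer = describe_caches(["L1D_NEW", "L2_1M"])

for label, value in newer.items():
    print(f"{label}: {value}")
```

Whether an unknown entry also changes which code path gets chosen is a separate question that this sketch does not try to answer.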

2004-02-21, 16:01   #3
PrimeCruncher
Sep 2003
Borg HQ, Delta Quadrant
2·3³·13 Posts

Quote:
Originally Posted by ColdFury
If it's defaulting to a code path that's non-optimal, that could be a cause for some lack of performance.
Some lack!

Quote:
Originally Posted by Digital Concepts
Best time for 2048K FFT length: 71.640 ms.
...
Best time for 2048K FFT length: 69.714 ms.
Prime95 is about 2 ms SLOWER on the Prescott, which happens to be running about 360 MHz FASTER!
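
To put numbers on that comparison, here is a small sketch that uses only the 2048K FFT figures from the two outputs posted above. The helper names are made up for this example, and the break-even estimate assumes iteration time scales as 1/clock, which ignores memory speed and is therefore only a rough guide.

```python
# Figures copied from the two benchmark outputs posted above (2048K FFT).
PRESCOTT_MS, PRESCOTT_MHZ = 71.640, 3361.04
NORTHWOOD_MS, NORTHWOOD_MHZ = 69.714, 3000.23


def cycles_per_iteration(time_ms: float, clock_mhz: float) -> float:
    """Clock cycles spent on one iteration: seconds times cycles per second."""
    return (time_ms / 1e3) * (clock_mhz * 1e6)


def break_even_mhz(time_ms: float, clock_mhz: float, target_ms: float) -> float:
    """Clock needed to reach target_ms, ASSUMING time scales as 1/clock."""
    return clock_mhz * time_ms / target_ms


time_gap_ms = PRESCOTT_MS - NORTHWOOD_MS
clock_gap_mhz = PRESCOTT_MHZ - NORTHWOOD_MHZ
cycle_ratio = (
    cycles_per_iteration(PRESCOTT_MS, PRESCOTT_MHZ)
    / cycles_per_iteration(NORTHWOOD_MS, NORTHWOOD_MHZ)
)
needed_mhz = break_even_mhz(PRESCOTT_MS, PRESCOTT_MHZ, NORTHWOOD_MS)

print(f"Prescott is {time_gap_ms:.3f} ms slower at {clock_gap_mhz:.2f} MHz more clock")
print(f"Cycles per iteration, Prescott / P4C: {cycle_ratio:.3f}")
print(f"Prescott clock needed to match the P4C time: {needed_mhz:.0f} MHz")
```

With the posted figures, the Prescott spends roughly 15% more cycles per iteration, and under the 1/clock assumption it would need about 3.45 GHz to match the 3.0 GHz P4C at this FFT length.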

Last fiddled with by PrimeCruncher on 2004-02-21 at 16:02

2004-02-21, 18:14   #4
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2²·43·47 Posts

Prime95 does not care about the L1 cache size, but I'll look into why it is not recognizing the CPUID return code.

There is code in prime95 to use a 1MB L2 cache. It is in theory a little faster for some FFT sizes, but not proven. Please post benchmarks with and without "CpuL2CacheSize=512" in local.ini. Thanks.
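
For anyone wanting to compare two such runs line by line, a rough sketch of a helper follows. It parses the "Best time for ... FFT length" lines and reports the percentage change per FFT size. The function names are invented for this example. No results with "CpuL2CacheSize=512" have been posted in this thread, so the sample input below simply reuses a few lines from the two outputs in the first post as stand-ins.

```python
import re
from typing import Dict

# Matches lines such as "Best time for 384K FFT length: 12.593 ms."
_BEST_TIME = re.compile(r"Best time for (\d+)K FFT length: ([\d.]+) ms\.")


def parse_benchmark(text: str) -> Dict[int, float]:
    """Map FFT length (in K) to the best time in milliseconds."""
    return {int(k): float(ms) for k, ms in _BEST_TIME.findall(text)}


def percent_change(run_a: Dict[int, float], run_b: Dict[int, float]) -> Dict[int, float]:
    """Percent change of run_b relative to run_a, for FFT sizes present in both."""
    return {
        k: (run_b[k] / run_a[k] - 1.0) * 100.0
        for k in sorted(run_a.keys() & run_b.keys())
    }


# Stand-in inputs: lines copied from the P4C and Prescott outputs in post #1.
RUN_A = """
Best time for 384K FFT length: 12.224 ms.
Best time for 2048K FFT length: 69.714 ms.
"""
RUN_B = """
Best time for 384K FFT length: 12.593 ms.
Best time for 2048K FFT length: 71.640 ms.
"""

changes = percent_change(parse_benchmark(RUN_A), parse_benchmark(RUN_B))
for fft_k, pct in changes.items():
    print(f"{fft_k}K: {pct:+.2f}%")
```

Pasting the full output of a run with the setting and a run without it into the two strings would give the per-FFT-size difference being asked for here.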

Last fiddled with by Prime95 on 2004-02-21 at 18:15

2004-02-22, 17:41   #5
Dresdenboy
Apr 2003
Berlin, Germany
192 Posts

I think one of the reasons for Prescott's performance is that latencies increased for many instructions (including SSE2) and for cache accesses.

Intel's main goals in developing this CPU were different from what many expected. It was not meant to increase per-clock performance but to allow higher clock frequencies. Besides this, the CPU will benefit Intel because it is produced on 300 mm wafers at the 90 nm node, which allows smaller dice (even with the larger cache). It will be hard to modify a heavily optimized app to reach the same IPC on Prescott if key performance factors such as latencies have changed so much that they cause a severe slowdown compared to Northwood.

As I wrote in another thread, one advantage for Prescott will be clock speed. Near the end of this year the fastest sold Prescotts will outperform any Northwood ever sold in Prime95.

Citing the CPU-car-analogy: Prescott is like moving into the next gear for Intel.

2004-02-24, 13:56   #6
TauCeti
Mar 2003
Braunschweig, Germany
2×113 Posts

It's only slightly off-topic here, but I strongly recommend that anyone interested in chip design watch this very interesting 90-minute talk by Bob Colwell, former Intel chief IA-32 architect.

2004-03-04, 23:58   #7
GSV3MiaC
Jan 2004
Shropshire, UK
24 Posts

Quote:
Originally Posted by Dresdenboy
Citing the CPU-car-analogy: Prescott is like moving into the next gear for Intel.

Yes, but it's the next =lowest= gear, i.e. when they get the engine revving to 3.6 GHz, they may actually achieve the same road speed as they had before.

2004-03-05, 07:54   #8
Dresdenboy
Apr 2003
Berlin, Germany
192 Posts

Quote:
Originally Posted by GSV3MiaC
Yes, but it's the next =lowest= gear, i.e. when they get the engine revving to 3.6 GHz, they may actually achieve the same road speed as they had before.
I was thinking of the relation between the amount of gas and the acceleration.

2004-03-06, 06:54   #9
QuintLeo
Oct 2002
Lost in the hills of Iowa
2⁶×7 Posts

Quote:
Originally Posted by Dresdenboy
I was thinking of the relation between the amount of gas and the acceleration.
Oh, you've been seeing the "heat problem" reports too, eh?


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
