mersenneforum.org  

2006-03-10, 12:19   #1
Dresdenboy

Apr 2003
Berlin, Germany

192 Posts
Upcoming Prime95 monsters (processors)

This was definitely Conroe's week as far as processors are concerned. This CPU, which will arrive later this year, will boost Prime95 performance per clock thanks to full-width 128-bit SSE execution with a throughput of 1 per cycle and a bigger and better cache subsystem.

It looks like AMD's next core with an improved FPU will arrive no earlier than 2007.

For a start, here is a nice article on Realworldtech:
http://www.realworldtech.com/page.cf...0906143144&p=1
2006-03-10, 15:53   #2
dsouza123

Sep 2002

2·331 Posts

Will Merom/Conroe/Woodcrest (mobile/desktop/server)
have the extra SSE2 registers like the Athlon64/Opteron?

(Are the extra SSE2 registers (AMD) available in 64-bit mode only, or also under a 32-bit OS?)

Is the 128-bit SSE execution also for SSE2?

Is it only multiple data, i.e. 2 quad words (64-bit data) or 4 dwords (32-bit data) etc.,
or is there a 128-bit data type?

What are the L2 cache sizes?
2006-03-10, 16:24   #3
Dresdenboy

Apr 2003
Berlin, Germany

192 Posts

Quote:
Originally Posted by dsouza123
Will Merom/Conroe/Woodcrest (mobile/desktop/server)
have the extra SSE2 registers like the Athlon64/Opteron?
Yes, they include the x64 stuff.

Quote:
Originally Posted by dsouza123
(Are the extra SSE2 registers (AMD) available in 64-bit mode only, or also under a 32-bit OS?)
I assume there won't be any exception here.

Quote:
Originally Posted by dsouza123
Is the 128-bit SSE execution also for SSE2?
It applies to the whole range of SSEn implementations. Otherwise they would have wasted resources.

Quote:
Originally Posted by dsouza123
Is it only multiple data, i.e. 2 quad words (64-bit data) or 4 dwords (32-bit data) etc.,
or is there a 128-bit data type?
Maybe in SSE4. For now it will just be compatible with SSEn for n up to 3, as those extensions are implemented on existing architectures.
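
To make the packed-data point concrete, here is a minimal Python sketch, using only the standard `struct` module, of how the same 16 bytes can be viewed as two 64-bit doubles or four 32-bit values. It only models the data layout; it is not SSE code, and the helper name `packed_add_pd` and the sample values are made up for illustration (the name nods to the SSE2 ADDPD instruction, which adds two doubles lane by lane).

```python
import struct

# 16 bytes stand in for one 128-bit XMM register (illustrative model only).
reg = struct.pack("<2d", 1.5, -2.25)     # stored as two 64-bit doubles
assert len(reg) == 16

as_doubles = struct.unpack("<2d", reg)   # view as 2 x 64-bit doubles
as_floats = struct.unpack("<4f", reg)    # same bits viewed as 4 x 32-bit floats
as_dwords = struct.unpack("<4I", reg)    # same bits viewed as 4 x 32-bit integers


def packed_add_pd(a: bytes, b: bytes) -> bytes:
    """Lane-wise add of two 16-byte values treated as 2 doubles each."""
    x = struct.unpack("<2d", a)
    y = struct.unpack("<2d", b)
    return struct.pack("<2d", x[0] + y[0], x[1] + y[1])


total = struct.unpack("<2d", packed_add_pd(reg, reg))
print(as_doubles, len(as_floats), len(as_dwords), total)
```

The register is always processed as several independent lanes; which lane width applies is decided by the instruction, not by the register.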

Quote:
Originally Posted by dsouza123
What are the L2 cache sizes?
Conroe has a shared 4 MB L2 cache. If a task on core 1 needs more cache than the task on core 2, then the first task will also be able to utilize more of the L2 cache.

Also, the L1-to-L1 connections between the cores are better, and the 64-bit implementation will surely be better than Prescott's. This could also mean faster 64-bit TF (trial factoring) code.
2006-03-10, 18:01   #4
nngs

Jun 2004

3C₁₆ Posts

Quote:
Originally Posted by Dresdenboy
This was definitely Conroe's week as far as processors are concerned. This CPU, which will arrive later this year, will boost Prime95 performance per clock thanks to full-width 128-bit SSE execution with a throughput of 1 per cycle and a bigger and better cache subsystem.

It looks like AMD's next core with an improved FPU will arrive no earlier than 2007.

For a start, here is a nice article on Realworldtech:
http://www.realworldtech.com/page.cf...0906143144&p=1
Quoted from the article:
Quote:
...However, the bottom line is that we expect the Core microarchitecture to provide a 20-40% performance boost over the prior generation products, and more in certain cases. At the same time, power consumption will drop dramatically for the desktop and server devices, in the range of 30-40% and possibly more. As a result, the performance/watt will improve substantially for Intel...
Very attractive to GIMPS farmers.

Last fiddled with by nngs on 2006-03-10 at 18:01
2006-03-10, 20:33   #5
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2²×13×157 Posts

Quote:
Originally Posted by Dresdenboy
For a start a nice article on Realworldtech:
http://www.realworldtech.com/page.cf...0906143144&p=1
A nice article. While the full 128-bit SSEn implementation with FADD and FMUL on separate ports looks very, very promising, this will likely shift the GIMPS bottleneck to another part of the CPU. For example, if the latency on add/mul is high, then the bottleneck will become "register pressure" (not enough SSE2 registers to schedule independent floating point operations). If the add/mul latency is reasonable, then the bottleneck will move to how fast data can be stored and loaded -- L1 and L2 cache latency & bandwidth may be the bottleneck.

In any event, it will be interesting to read more and get some benchmarks in the coming months!
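
The register-pressure argument above can be sketched with a rough back-of-the-envelope model. To keep a fully pipelined unit busy, roughly latency × throughput independent operations must be in flight, and each needs somewhere to hold its result. The Python sketch below is a crude illustration under loud assumptions: the latencies are hypothetical (not Conroe figures), each in-flight operation is assumed to occupy one architectural XMM register, and register renaming, loads and constants are ignored. The only hard facts used are that x86 exposes 8 XMM registers in 32-bit mode and 16 in 64-bit mode.

```python
import math

XMM_REGS_32BIT = 8    # architectural XMM registers in 32-bit mode
XMM_REGS_64BIT = 16   # architectural XMM registers in 64-bit mode


def in_flight_needed(latency: int, per_cycle: float = 1.0) -> int:
    """Independent ops needed in flight to saturate one pipelined unit."""
    return math.ceil(latency * per_cycle)


def peak_fraction(add_latency: int, mul_latency: int, registers: int) -> float:
    """Crude estimate of the fraction of peak add+mul issue rate that
    `registers` independent results can sustain (assumption: one
    register per in-flight op)."""
    needed = in_flight_needed(add_latency) + in_flight_needed(mul_latency)
    return min(1.0, registers / needed)


# Hypothetical (add, mul) latencies in cycles, chosen only for illustration.
for add_lat, mul_lat in [(3, 5), (5, 7)]:
    for regs in (XMM_REGS_32BIT, XMM_REGS_64BIT):
        frac = peak_fraction(add_lat, mul_lat, regs)
        print(f"add={add_lat} mul={mul_lat} regs={regs}: {frac:.2f} of peak")
```

In this toy model, higher latencies make the 8-register case fall short of peak while 16 registers still suffice, which is the shape of the argument rather than a prediction for any real chip.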
2006-03-10, 20:52   #6
Jeff Gilchrist

Jun 2003
Ottawa, Canada

3·17·23 Posts

Yes, I saw them mention that SSE instructions would now take 1 clock cycle instead of the 2 cycles on average before. I figured that should give GIMPS a nice speed boost assuming that another bottleneck didn't get hit really fast.

It will be interesting to see.
2006-03-10, 21:00   #7
ColdFury

Aug 2002

2⁶×5 Posts

Quote:
For example, if the latency on add/mul is high, then the bottleneck will become "register pressure" (not enough SSE2 registers to schedule independent floating point operations).
All the more reason to write an AMD64/EM64T version!
2006-03-10, 22:28   #8
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

8164₁₀ Posts

Quote:
Originally Posted by Jeff Gilchrist
I saw them mention that SSE instructions would now take 1 clock cycle instead of the 2 cycles on average before.
Just to clarify, the "1 clock cycle" figure is for maximum throughput in a pipelined architecture. Latency refers to how long a single add or mul operation takes. The doubling in maximum throughput is definitely good news but won't result in a doubling of prime95 speed.

BTW, AMD has typically been a clock or two faster in latency, with the AMD64 and P4 equal in throughput.
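
The latency/throughput distinction can be sketched with a simple cycle count. The following Python model is an illustration only: the latency of 4 cycles and the operation count are assumed values, not measurements of any CPU, and real code mixes dependent and independent work. It compares a chain of dependent operations (each must wait for the previous result) with a stream of independent ones (a new one can issue every `issue_interval` cycles).

```python
def cycles_dependent(n_ops: int, latency: int) -> int:
    """Each op waits for the previous result, so latency is paid every time."""
    return n_ops * latency


def cycles_independent(n_ops: int, latency: int, issue_interval: int) -> int:
    """Ops overlap in the pipeline: one issues every `issue_interval`
    cycles, and the last one finishes `latency` cycles after it issues."""
    return latency + (n_ops - 1) * issue_interval


N_OPS = 100     # assumed workload size
LATENCY = 4     # assumed add/mul latency in cycles (hypothetical)

old = cycles_independent(N_OPS, LATENCY, issue_interval=2)  # one op per 2 cycles
new = cycles_independent(N_OPS, LATENCY, issue_interval=1)  # one op per cycle
chain = cycles_dependent(N_OPS, LATENCY)  # unaffected by the issue rate

print(f"independent ops: {old} -> {new} cycles ({old / new:.2f}x)")
print(f"dependent chain: {chain} cycles at either issue rate")
```

Only fully independent work approaches the 2x gain in this model; anything limited by dependencies sees none, so the overall speedup lands somewhere in between.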
2006-03-10, 22:34   #9
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2²×13×157 Posts

Quote:
Originally Posted by ColdFury
All the more reason to write an AMD64/EM64T version!
Uh, raise your hand if you are running 64-bit Windows.... I don't see many hands raised
2006-03-10, 23:06   #10
ColdFury

Aug 2002

2⁶·5 Posts

Quote:
Originally Posted by Prime95
Uh, raise your hand if you are running 64-bit Windows.... I don't see many hands raised
True, but lucky people could at least run mprime on Linux.
2006-03-11, 00:10   #11
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2²·13·157 Posts

Quote:
Originally Posted by ColdFury
True, but lucky people could at least run mprime on Linux.
Not until binutils is upgraded to support 64-bit COFF object files
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.