mersenneforum.org


sgrupp 2007-01-12 16:29

Quad Core and P95
 
Now that the quad-core (QX6700) systems are coming out, does anyone have experience with Prime95 on them? Clearly the throughput/$$ issue will be a challenge in these expensive systems, but what are people seeing as regards:

1) Memory contention when all 4 cores are running P95. Are iteration times still OK?

2) HEAT. Can an air cooling solution handle all 4 cores doing LL testing?

3) Anyone running a water cooled overclocked rig like Dell's latest?

4) Power. Any notion of what power consumption is on a QX6700 running 4 LL tests?

dsouza123 2007-01-12 17:20

The thermal envelope for the quad core QX6700 is 130W,
which is double the dual core E6700's 65W. Both run at 2.66 GHz.

The new quad core Q6600 runs at 2.4 GHz with an 80W envelope.

S485122 2007-01-12 18:31

I am busy with a quad core at the moment.

1. Iteration times are consistent: running one instance of the benchmark or running 4 has no significant impact. Although I obtained more of the minimum times for a given FFT size when running only one instance of the benchmark, the differences are well within the standard deviation of the benchmark figures. The standard deviation for the FFT tests is about 2%, and for the factoring tests it is 1%; the maximum difference can go up to 20%.

There is one point that I intend to post about: the Level 2 cache sizes are not recognised:
[CODE]L1 cache size: 32 KB
L2 cache size: unknown
L1 cache line size: 64 bytes
L2 cache line size: unknown[/CODE]
Also, when running 4 LL tests at the same time, the iteration times are much higher than the results of the benchmarks. There have been some posts about this problem. I will investigate further.

2. Air cooling is fine; it will dissipate as much heat as a Pentium IV D830 or D840 processor. You have to use a good cooler though. I use a Zalman 9700 and core temperatures measured by the thermal diodes are well within specs.

3. I use no water cooling.
I do not overclock for now because the machine is breaking in.

4. I did not measure the power consumption, but it should be equivalent to two Core 2 Duo E6700 processors or one D830 processor. The overall consumption of the system is not twice that of an E6700 system because of all the elements in common. I would say count some 70 Watts more than an E6700 system.

sgrupp 2007-01-12 18:39

What kind of iteration times are you getting on the quad core? With a Core 2 Duo 2.66 processor and a 36M LL test, I am getting .05 sec/iteration with each of the 2 processes.

On the unrecognized L2 cache problem, undoc.txt has this to say:

You can explicitly specify the L2 cache size although this shouldn't be necessary since the program uses the CPUID
instruction to determine the L2 cache size. In local.ini enter:
CpuL2CacheSize=128 or 256 or 512
CpuL2CacheLineSize=32 or 64 or 128
CpuL2SetAssociative=4 or 8

Prime95 2007-01-12 19:29

[QUOTE=S485122;95907]Iteration times are consistent : running one instance of the benchmark or running 4 has no significant impact.[/QUOTE]

The benchmark is a poor tool to use, since it reports the BEST iteration time.

You really need to run 4 LL tests which will report the more important average iteration time.
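
To illustrate why a best-time figure can hide contention, here is a minimal Python sketch. The timing values are made up for illustration and are not measurements from any machine in this thread; `summarize` is a hypothetical helper, not part of Prime95.

```python
from statistics import mean


def summarize(iteration_times):
    """Return (best, average) iteration time in seconds."""
    return min(iteration_times), mean(iteration_times)


# Hypothetical per-iteration timings in seconds: mostly fast, with a few
# slow iterations of the kind that memory contention might cause.
samples = [0.040, 0.041, 0.040, 0.065, 0.070, 0.040, 0.068, 0.041]

best, average = summarize(samples)
print(f"best    = {best:.3f} s")     # what a best-time benchmark would report
print(f"average = {average:.3f} s")  # what determines real throughput
```

With samples like these, the best time stays at the uncontended value while the average is noticeably higher, which is the gap a best-time benchmark cannot show.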

sgrupp 2007-01-12 19:34

What is your guess on memory contention or bus limitations for 4 LL tests running on a quad core with a total of 8M of L2 cache, George? A significant issue or not?

sgrupp 2007-01-12 19:40

And a second question - is the right value for L2 Cache size 2048 (i. e. 2 MB for each core)?

CpuL2CacheSize=2048 or
CpuL2CacheSize=8192 ?

R.D. Silverman 2007-01-12 19:42

Quad Core with NFS
 
[QUOTE=sgrupp;95920]What is your guess on memory contention or bus limitations for 4 LL tests running on a quad core with a total of 8M of L2 cache, George? A significant issue or not?[/QUOTE]

I am curious how NFS will perform on a quad-core. I am certain that
cache and memory contention will be a major problem. Running 4 instances
may even be slower (in aggregate output) than running 2 instances.

If I provide code and data for a Windows system, can someone run an
NFS benchmark???
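
The aggregate-output comparison can be made concrete with a short Python sketch. The per-iteration times below are hypothetical placeholders, not NFS or Prime95 measurements, and `aggregate_throughput` is an illustrative helper name.

```python
def aggregate_throughput(instances, seconds_per_iteration):
    """Total iterations per second across all concurrently running instances."""
    return instances / seconds_per_iteration


# Hypothetical timings: each instance slows down when four run at once.
two = aggregate_throughput(2, 0.030)   # 2 instances at 0.030 s/iteration
four = aggregate_throughput(4, 0.070)  # 4 instances at 0.070 s/iteration

print(f"2 instances: {two:.1f} iterations/s")
print(f"4 instances: {four:.1f} iterations/s")
if four < two:
    print("4 instances are slower in aggregate")
else:
    print("4 instances are faster in aggregate")
```

In this model, going from 2 to 4 instances only pays off if the per-instance iteration time less than doubles.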

Andi47 2007-01-12 20:02

[QUOTE=sgrupp;95925]And a second question - is the right value for L2 Cache size 2048 (i. e. 2 MB for each core)?

CpuL2CacheSize=2048 or
CpuL2CacheSize=8192 ?[/QUOTE]

This is not exactly answering the question, but:

According to [URL="http://de.wikipedia.org/wiki/Intel_Core_2#Kentsfield_3"]Wikipedia[/URL] (the German Wikipedia seems to be more informative than the English one), the QX6700 ("Core 2 Extreme Kentsfield") consists of two dual-core dies, not one quad-core die.
Each die has one 4096 KB (4 MB) L2 cache, shared by both of its cores, if I understand correctly.

S485122 2007-01-12 20:29

[QUOTE=sgrupp;95909]What kind of iteration times are you getting on the quad core? With a Core 2 Duo 2.66 processor and a 36M LL test, I am getting .05 sec/iteration with each of the 2 processes.

On the unrecognized L2 cache problem, undoc.txt has this to say:

You can explicitly specify the L2 cache size although this shouldn't be necessary since the program uses the CPUID
instruction to determine the L2 cache size. In local.ini enter:
CpuL2CacheSize=128 or 256 or 512
CpuL2CacheLineSize=32 or 64 or 128
CpuL2SetAssociative=4 or 8[/QUOTE]
On the quad core, testing 4 different 27M numbers concurrently, I get 0,042 s iteration times. On the benchmark it is 0,33 s. The difference is HUGE. Yesterday I already tried to set the missing L2 values; the results were not better. I followed the indications provided by George in the thread [thread=6598]L2 cache unknown with new CPUs[/thread]

CPUID says the following about the cache on the QX6700:
L1 data cache: 4 x 32 KB, 8-way set associative, 64-byte line size
L1 code (or instruction) cache: 4 x 32 KB, 8-way set associative, 64-byte line size
L2: 2 x 4096 KB, 16-way set associative, 64-byte line size
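
For reference, a local.ini fragment that mirrors the CPUID figures above might look like the following. This is only a sketch: the undoc.txt excerpt quoted earlier lists 128, 256 or 512 for the size and 4 or 8 for the associativity, so whether the program accepts 4096 and 16 is an assumption, as is the choice of the per-die size (4096 KB shared by two cores) rather than a per-core or whole-package figure.

```
CpuL2CacheSize=4096
CpuL2CacheLineSize=64
CpuL2SetAssociative=16
```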

A QX6700 is the same as two E6700s on one board through one socket. So the shared cache issue is the same as on an E6700. But it could be memory contention: are 4 instances of Prime95 too much for the memory bus? I could try to fiddle with the memory settings of my board, but I want to wait a bit first.

Or is it something to do with this:
[QUOTE=Prime95;94743]I had the same problem in 64-bit Windows (dual core Pentium 4). It turns out to be some weird Windows problem reading the time stamp counter. I "fixed" it by adding the /usepmtimer to the boot.ini file[/QUOTE]

Cruelty 2007-01-12 22:56

Maybe it is the FSB that's holding back this CPU? I would lower the multiplier to x8 and increase the FSB from 266 to 333 MHz to see if that helps.
Also, setting an affinity for every instance would be advisable on a quad core, since all the communication between the two dies happens through the FSB.
BTW: does running 2 or 3 instances instead of 4 make any difference?
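
The arithmetic behind the multiplier/FSB suggestion can be sketched in Python. The stock x10 multiplier is inferred from 2.66 GHz at a 266 MHz FSB, and the bandwidth figure assumes a quad-pumped, 64-bit-wide front-side bus; it is a theoretical peak, not a measured value.

```python
def core_clock_mhz(fsb_mhz, multiplier):
    """Core clock is the FSB base clock times the CPU multiplier."""
    return fsb_mhz * multiplier


def fsb_bandwidth_mb_s(fsb_mhz, transfers_per_clock=4, bus_width_bytes=8):
    """Theoretical peak FSB bandwidth (assumes quad-pumped, 64-bit bus)."""
    return fsb_mhz * transfers_per_clock * bus_width_bytes


# Stock setting (x10 inferred from 2.66 GHz / 266 MHz) versus the suggestion.
for label, fsb, mult in [("stock", 266, 10), ("suggested", 333, 8)]:
    clock = core_clock_mhz(fsb, mult)
    bandwidth = fsb_bandwidth_mb_s(fsb)
    print(f"{label:9s}: {mult} x {fsb} MHz = {clock} MHz core, "
          f"{bandwidth} MB/s peak FSB")
```

The core clock stays essentially the same while the theoretical bus bandwidth rises by about 25%, so any change in iteration times under that setting would point at the bus rather than the cores.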

