mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   George's dream build (https://www.mersenneforum.org/showthread.php?t=20795)

Prime95 2015-12-30 00:53

George's dream build
 
I'm interested in building a personal "supercomputer" primarily for LL testing and looking for your help/suggestions.

Goals:

1) Good total cost of ownership assuming 3 or 4 year lifespan
2) Small and portable enough to move back and forth to the Summer home
3) Good energy efficiency
4) Reasonably quiet
5) Reliable automatic reboot after power failures.

Here are my random thoughts in no particular order:

To achieve small and portable, I'm looking at 4-6 mini-ITX motherboards. Stack them using some kind of home-made system (like long screws or metal rods with plastic spacers maybe 2 inches long between the motherboards). To keep it small, all motherboards will be powered by one power supply and 3-5 picoPSUs. I'd mount the contraption in a standard case with a network switch (12V DC so I can run it from the power supply and ditch the wall wart). I looked into a dual-socket 2011-3 motherboard with 8-core Xeon processors, but that seems pretty pricey for less performance -- easier setup though.

To reduce cost and power, I'd like to have just one SSD. Obviously, no power-hungry video cards. Linux, not Windows. This means network booting: one master CPU with 4 or 5 slaves. I have no experience doing this!

Since LL tests are limited by memory bandwidth, memory overclocking should be strongly considered.

For good energy efficiency, I'm looking at getting "a good match" between CPU speed and memory speed. For example, I consider my 4GHz Haswell with DDR3-2400 a poor match: the 4th core does not increase throughput very significantly. I've not done the actual measurements, but I suspect that by lowering the clock speed and CPU core voltage I could reduce energy usage by 20% while reducing throughput only 5% -- a pretty obvious increase in energy efficiency. Deciding what "good match" means will require some study!

For good energy efficiency, an efficient power supply is a must.

To get reasonably quiet, I suspect watercooling is the solution. I've never done this either! It is expensive: preliminary research indicates about $200 plus $100 per motherboard. I'm thinking I'll start with just the stock Intel heatsink/fan and look at upgrading later if necessary. One advantage of watercooling might be designing a radiator that lifts away easily so I can place it in an insulated window box in the Summer. Dump all that damn heat outdoors!

Here is a first cut at the needed components for a sample 5 mobo system (20 cores of computing goodness):

Mobo: ASRock Z170M-ITX/ac 5 * 130 = 650
Memory: DDR4-3200 2x4GB 5 * 60 = 300
CPU: i5-6600 (3.3GHz, 65W) 5 * 230 = 1150
SSD: Samsung 850 EVO 1 * 90 = 90
PicoPSU picoPSU-120 4 * 40 = 160
Case & power supply & network switch -- already own that.

I'm thinking each mobo will consume 65W CPU, 4W memory, 10W(?) mobo or about 400W total. Add in power supply inefficiency for a total of 450W at the wall.

Total cost of 3 year ownership = 2350 parts + 3 * 450 = 3700 (or 3700 / 3.3 = 1121 dollars/GHz)

Another sample:

Mobo: H110M-ITX (can't OC RAM) 5 * 70 = 350
Memory: DDR4-2133 2x4GB 5 * 40 = 200
CPU: i5-6400 (2.7GHz, 65W) 5 * 190 = 950
SSD: Samsung 850 EVO 1 * 90 = 90
PicoPSU picoPSU-120 4 * 40 = 160
Case & power supply & network switch -- already own that.

I'm thinking each mobo will consume 60W CPU, 3W memory, 10W(?) mobo or about 360W total. Add in power supply inefficiency for a total of 400W at the wall.

Total cost of 3 year ownership = 1750 parts + 3 * 400 = 2950 (or 2950/2.7 = 1090 dollars/GHz)
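The cost arithmetic above, as a quick sketch. The electricity figure of $1 per watt per year is an assumption implied by the "3 * watts" term in both totals (it works out to roughly $0.114/kWh at 24/7 load); the dollars-per-GHz metric divides by the per-CPU clock, as in the post:

```python
# Sketch of the total-cost-of-ownership arithmetic from the two sample builds.
# Assumption: ~$1 per watt per year of 24/7 running (about $0.114/kWh).

def tco(parts_usd, wall_watts, years=3, usd_per_watt_year=1.0):
    """Parts cost plus electricity over the machine's lifespan."""
    return parts_usd + years * wall_watts * usd_per_watt_year

# Sample 1: i5-6600 @ 3.3 GHz, est. 450 W at the wall, $2350 in parts
build_6600 = tco(650 + 300 + 1150 + 90 + 160, 450)
# Sample 2: i5-6400 @ 2.7 GHz, est. 400 W at the wall, $1750 in parts
build_6400 = tco(350 + 200 + 950 + 90 + 160, 400)

print(build_6600, round(build_6600 / 3.3))  # 3700 total, ~1121 $/GHz
print(build_6400, round(build_6400 / 2.7))  # 2950 total, ~1093 $/GHz
```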


Questions:

1) Has anyone here had any success with network booting? Best linux version for network boot?
2) If you own a locked Intel CPU, does turbo boost kick in at all when prime95 is running?
3) What is the energy efficiency sweet spot for power supply utilization?
4) Has anyone studied prime95 throughput for various CPU speed and memory speeds?


Suggestions? Thoughts on why this is a hare-brained idea?

Mark Rose 2015-12-30 03:38

[QUOTE=Prime95;420471]Questions:

1) Has anyone here had any success with network booting? Best linux version for network boot?
2) If you own a locked Intel CPU, does turbo boost kick in at all when prime95 is running?
3) What is the energy efficiency sweet spot for power supply utilization?
4) Has anyone studied prime95 throughput for various CPU speed and memory speeds?


Suggestions? Thoughts on why this is a hare-brained idea?[/QUOTE]

I've been thinking of building such a system myself.

1. Network booting is easy enough. Here are [url=https://help.ubuntu.com/community/DisklessUbuntuHowto]instructions for Ubuntu[/url], but the steps will be similar for any major distribution. I would suggest Ubuntu 14.04 Server (or 16.04 when it comes out) for ease of use and a large online community. If you're more familiar with a different distribution, use it. If you want to make life easy, get a second network card for one computer. One connection will be for the outside internet, the other for the internal cluster. The ASRock Z170M-ITX/ac has that.
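One lightweight way to provide the DHCP and TFTP pieces of a PXE setup is dnsmasq (an assumption on my part; the Ubuntu howto linked above uses separate DHCP and TFTP daemons instead). The interface name, address range, and paths below are placeholders for illustration:

```
# /etc/dnsmasq.conf -- minimal PXE boot sketch for the internal cluster NIC
# (eth1, the 192.168.10.x range, and /srv/tftp are assumed/hypothetical)
interface=eth1                          # second NIC, facing the cluster
dhcp-range=192.168.10.50,192.168.10.60,12h
dhcp-boot=pxelinux.0                    # boot loader the slaves fetch via TFTP
enable-tftp
tftp-root=/srv/tftp                     # holds pxelinux.0 plus kernel/initrd
```

The slaves' BIOSes would be set to boot from LAN; everything else (NFS root, etc.) follows the howto.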

2. The 4770 I have at work is running a 0.1 GHz turbo boost right now. Turbo boost will kick in any time the CPU has TDP headroom. I bet AVX2 impacts that. Do note that the more cores that are boosting, the lower the boost (each eats into the headroom). As you are no doubt aware, different chips will have different stock voltages. As both the 6600 and the 6400 have a 65 W TDP, yet the 6400 is clocked significantly slower, I suspect the 6400 chips are binned as requiring a higher voltage to reach the same clock speed. I would not expect the 6400 to boost better than the 6600.

3. The sweet spot for power supply efficiency is generally around 50% load. Anything that's [url=https://en.wikipedia.org/wiki/80_Plus#Efficiency_level_certifications]80 Plus[/url] rated is efficient between 20% and 100% load. The sweet spot for cost efficiency is the 80 Plus Platinum supplies. If you're running the power supply hard all year, the more efficient power supply will [url=http://www.mersenneforum.org/showpost.php?p=418977&postcount=9]pay for itself[/url]. I would avoid running a power supply above 80% load, so it will continue to work properly if the ambient temperature is higher or if a dust filter is impeding airflow. If you want quiet, aim for 50%. The efficiency of the picoPSU-120 is apparently over 95%, but that doesn't count getting to 12 volts in the first place. In your cost calculation you assumed 89% efficiency. That may be a bit generous. I would assume power draw is closer to 450 to 500 W for the 6400 and 6600 systems respectively.
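How much the efficiency assumption moves the wall-draw estimate can be sketched quickly (the 395 W DC figure is the per-build estimate of five boards at roughly 79 W each from the first post):

```python
# Wall draw for a given DC load under different PSU efficiency assumptions.

def wall_watts(dc_watts, efficiency):
    """Power drawn at the wall to deliver dc_watts downstream."""
    return dc_watts / efficiency

dc = 5 * (65 + 4 + 10)  # CPU + RAM + board per node, times five nodes = 395 W
for eff in (0.85, 0.89, 0.92):
    print(f"{eff:.0%}: {wall_watts(dc, eff):.0f} W")  # 465, 444, 429 W
```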

4. Madpoo can probably answer this best. From my own observations it seems each LL core requires about 400 MHz of single channel memory per GHz. I believe 3000 or 3200 MHz memory would be a good fit for the 6600. The 6400 would be a little memory starved with 2133 MHz memory.

For memory bandwidth and TDP/voltage reasons, I would go with the 6600 solution you proposed.

I haven't done water cooling myself, but I have studied it a fair amount. The rule of thumb is 120 mm of radiator per 100 watts. I would use the radiators that come in 140 mm units as they have quieter 140 mm fans. As you're not overclocking, you can let the water temperature get a little warmer and run the CPUs hotter, for greater thermal efficiency in the loop and with the radiator fans. So for 4 of either of those CPUs you'll need a 2x140 mm radiator, and for 6, a 3x140 mm. As you care about silence, I would look at low fin density radiators and size up: go 3x140 or 4x140 and run more fans at slower speeds.
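The sizing rule above can be sketched as follows (rounding to the nearest whole 140 mm unit rather than up, since the plan is to run the loop a little warm; both of Mark's examples come out of this):

```python
# Radiator sizing from the rule of thumb: 120 mm of radiator per 100 W,
# expressed in whole 140 mm fan units.
import math  # not strictly needed with round(), kept for a ceil variant

def radiator_units_140mm(watts, mm_per_100w=120):
    """Nearest whole number of 140 mm radiator sections for a given heat load."""
    return round(watts / 100 * mm_per_100w / 140)

print(radiator_units_140mm(4 * 65))  # four 65 W CPUs -> 2 sections (2x140 mm)
print(radiator_units_140mm(6 * 65))  # six 65 W CPUs  -> 3 sections (3x140 mm)
```

Sizing up for silence, per the advice above, would just mean adding one more unit than this returns.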

Every component in the water cooling loop, every bend in the pipe, and every inch of pipe will increase the static pressure, so you may need to use two pumps or run two separate loops. When planning the build, be sure you include a T at the bottom of the loop so you can drain it. The order of the components does not matter. [url=https://www.youtube.com/user/Jayztwocents]Jayztwocents[/url] has a lot of videos on doing [url=https://www.youtube.com/playlist?list=PLOXo4ndvQK79n8Zv28IZ0ASaSEpJOB0iQ]watercooling[/url].

I've also thought about using an outdoor radiator. LaurV is the expert there. It's safe as long as the air outside is warmer than the air inside. If the water in the pipes is colder than the indoor air, the pipes will collect condensation and eventually let the magic smoke out of your kit.

I have two questions myself though:

1. What case are you planning to use that will comfortably seat 4 to 6 mini-ITX motherboards?

2. Which power supply do you own already?

VBCurtis 2015-12-30 04:30

If Turbo headroom is dependent on power draw, perhaps 3 i5-6400 cores on P95 will get enough turbo to nearly match 4 cores at default speed, since 3 tests will likely saturate 2133 memory anyway?

In theory, 3x3.2 might get just as much work done as 4x2.7, since the former is likely to saturate the memory anyway. Or is this a situation where two 2-threaded tests use less memory bandwidth, leaving 3-core usage as inefficient?

George-
Look into Dell C6100 used servers. $900 gets you 4 nodes of dual Xeon 5500-5600 CPUs (usually quad core, but hex possible), each CPU with 3 channels of DDR3-1333. A single 1100 W power supply, drawing something like 700 W at full blast. $1400 savings in parts, $800 extra power, but 32 cores @ 2.33 or 2.5 GHz rather than 20.

Are Xeon 5600s too old for the spiffy P95 instructions?

Prime95 2015-12-30 04:40

[QUOTE=Mark Rose;420478]

Thanks for the good info. Nice to know I'm on a sane track...

1. What case are you planning to use that will comfortably seat 4 to 6 mini-ITX motherboards?

2. Which power supply do you own already?[/QUOTE]

I have 4 old cases lying around. I haven't investigated how well my contraption would fit inside any of them. If it won't fit in them then I'll have to cobble together my own case so it can be easily moved.

The spare power supply is at the mountain condo. I'll pick it up in February. I think it is either 550W or 700W and I don't remember its efficiency. Buying a new one is no big deal and I will certainly do so if it pays for itself.

axn 2015-12-30 04:40

[QUOTE=VBCurtis;420481]Are Xeon 5600s too old for the spiffy P95 instructions?[/QUOTE]

Yep, unfortunately.

Prime95 2015-12-30 05:00

[QUOTE=VBCurtis;420481]If Turbo headroom is dependent on power draw, perhaps 3 i5-6400 cores on P95 will get enough turbo to nearly match 4 cores at default speed, since 3 tests will likely saturate 2133 memory anyway?[/quote]

If 3 cores at full turbo maxed out the memory bandwidth, then it may well be that a 3 worker solution would be more energy efficient.

[quote]In theory, 3x3.2 might get just as much work done at 4x2.7, since the former is likely to saturate the memory anyway. Or is this a situation where 2 2-threaded tests uses less memory, leaving 3-core usage as inefficient?[/quote]

3x3.2 = 9.6 GHz; 4x2.7 = 10.8 GHz. Not sure 9.6 GHz will max out 2133 memory. I need to run a lot of benchmarks on my Haswell machine. The Skylake architecture needs a little more memory bandwidth per GHz (some AVX2 floating point latencies are lower).

Using Mark's 400MHz bandwidth per GHz (a figure I'm extremely interested in confirming), the 2133 machine could handle 4266/400 = 10.7 GHz of CPU power.
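That rule of thumb generalizes to a one-liner (dual-channel memory is an assumption here, matching the 2x4GB kits in the sample builds):

```python
# Mark's ballpark rule: each LL core wants ~400 MHz of single-channel
# memory per GHz of CPU clock. Assumes dual-channel configurations.

def supported_ghz(mem_mhz, channels=2, mhz_per_ghz=400):
    """Total GHz of LL-testing cores a given memory speed can feed."""
    return mem_mhz * channels / mhz_per_ghz

print(round(supported_ghz(2133), 1))  # DDR4-2133 -> ~10.7 GHz of cores
print(round(supported_ghz(3200), 1))  # DDR4-3200 -> 16.0 GHz of cores
```

By this estimate, DDR4-3200 comfortably covers five 3.3 GHz quad-cores' worth of bandwidth demand, while DDR4-2133 sits right at the edge for a 4x2.7 GHz load.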


[quote]Are Xeon 5600s too old for the spiffy P95 instructions?[/QUOTE]

It looks like those are pre-AVX, so yes those Xeons are too old to consider seriously.

Mark Rose 2015-12-30 05:09

[QUOTE=Prime95;420483]I have 4 old cases lying around. I haven't investigated how well my contraption would fit inside any of them. If it won't fit in them then I'll have to cobble together my own case so it can be easily moved.

The spare power supply is at the mountain condo. I'll pick it up in February. I think it is either 550W or 700W and I don't remember it's efficiency. Buying a new one is no big deal and I will certainly do so if it pays for itself.[/QUOTE]

If you're recycling an old case, make sure you have airflow over the RAM and the chips on each motherboard or they may overheat. The fans on the radiators can also serve as system fans to provide that airflow, perhaps one 280 mm radiator at the front and the other at the back. The exhaust air from the radiators is more than cool enough. Do leave space between the boards to not impede the air flowing through the radiators.

If the power supply is 80 Plus Bronze or better it may not be worth replacing.

Prime95 2015-12-30 05:11

[QUOTE=Mark Rose;420478]The efficiency of the picoPSU-120 is apparently over 95%, but that doesn't count getting to 12 volt in the first place. In your cost calculation you assumed an 89% efficiency. That may be a bit generous.[/QUOTE]

I'm assuming the picoPSU is 100% efficient on delivering 12V power and that the amount of power used on the 5V and 3.3V is negligible. So, in total the picoPSU is not adding much to power loss. Yes, 89% is optimistic. I'm hoping 85% is realistic.

Mark Rose 2015-12-30 05:16

[QUOTE=Prime95;420486]If 3 cores at full turbo maxed out the memory bandwidth, then it may well be that a 3 worker solution would be more energy efficient.

3x3.2 = 9.6 GHz; 4x2.7 = 10.8 GHz. Not sure 9.6 GHz will max out 2133 memory. I need to run a lot of benchmarks on my Haswell machine. The Skylake architecture needs a little more memory bandwidth per GHz (some AVX2 floating point latencies are lower).[/quote]

I don't know if you can get 3 cores at max turbo running AVX2 with its voltage bump. I may play around with the BIOS at work tomorrow and see what I come up with.

[quote]
Using Mark's 400MHz bandwidth per GHz (a figure I'm extremely interested in confirming), the 2133 machine could handle 4266/400 = 10.7 GHz of CPU power.
[/quote]

Do keep in mind it's only a ballpark figure. I haven't done extensive testing on multiple systems to back that up. Madpoo has far more insight into memory bandwidth requirements. I hope he chimes in.

Madpoo 2015-12-30 07:06

[QUOTE=Mark Rose;420478]4. Madpoo can probably answer this best. From my own observations it seems each LL core requires about 400 MHz of single channel memory per GHz. I believe 3000 or 3200 MHz memory would be a good fit for the 6600. The 6400 would be a little memory starved with 2133 MHz memory.[/QUOTE]

Faster memory is more better. :smile:

On my systems, I tend to just throw all of the cores of one CPU into a single worker. Is it the most efficient? Well, that depends. If I wanted to run a bunch of tests in the 35M range, I could run several on one CPU and the memory seems to be able to handle it okay. I didn't test just how many of those I could run on a 4-core Xeon before it starts to struggle, but I could run 2 of them with 2 cores each.

But I don't want to only run 35M tests... if you get up into the 38M+ exponents with whatever FFT size, then even running 2 of them on a 4-core CPU starts to show slowdowns with both running.

Regarding DDR3 versus DDR4, I can't say for sure how much is the memory and how much is from the v4 Xeons, but those effects of multiple workers and all that seem to nearly go away with DDR4 modules. I can run simultaneous tests on much larger exponents without any impact.

More formal testing would probably reveal where the cutoffs are for different memory speeds and FFT sizes, but I found the approximate sweet spots for me and stopped poking at it.

bgbeuning 2015-12-30 13:12

[QUOTE=Prime95;420471]
1) Good total cost of ownership assuming 3 or 4 year lifespan
[/QUOTE]

Add the cost of power from your electric company.
Someone else suggested this, and it changed my plans.

[QUOTE=Prime95;420471]
I'm looking at 4-6 mini-ITX motherboards. Stack them using some kind of home-made system (like long screws or metal rods with plastic spacers maybe 2 inches long between the motherboards).
[/QUOTE]

I considered this approach, but abandoned it for a couple of reasons.

Stacking motherboards probably means the CPUs (heaters) will sit right next to each other.
How will you know your cooling is enough until you try it?

When servers do this, there is a metal sheet between motherboards and
I suspect it helps spread out the heat.

[QUOTE=Prime95;420471]
To keep it small, all motherboards will be powered by one power supply and 3-5 picoPSUs.
[/QUOTE]

Never seen PicoPSU before. That fixes one problem I had with stacking motherboards.
You want to be able to power down one motherboard while the rest keep running.

[QUOTE=Prime95;420471]
This means network booting. One master CPU with 4 or 5 slaves. I have no experience doing this!
[/QUOTE]

Network booting uses PXE.
Apparently PXE support is built into nearly all NICs.
But not every BIOS will have support.
I looked at network booting to save watts on the disk drives but even after
reading the documents I have not figured it out. Still on my TODO list.

