mersenneforum.org  

Old 2017-05-18, 01:17   #1
pool party
 
May 2016

1 Posts
Purpose Built rig?

Hi. So far, through various accounts, I have contributed a little over 14K GHz-days over the last 6 or so years.


The thought came to mind to build a purpose-built (GPU?) rig to crunch numbers all day long.


Have any ideas / suggestions? Is there a thread I should check out / missed?


Thanks!


-Chris
Old 2017-05-18, 13:23   #2
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

2·31·47 Posts

For doing LL as cheaply and efficiently as possible, see George's dream build thread. The short version is to buy i5-7500s with dual-rank, dual-DIMM DDR4-2400 and put four of them on the cheapest Gold-rated power supply using cable splitters.

For LL speed, high end Xeons are best. AMD Zen isn't bad, but due to implementation details of Zen you get only half the performance per core.

For Trial Factoring on GPUs, the 1080 Ti is currently king. You can compare GPU TF rankings here. Trial Factoring is the most efficient use of GPUs.

For LL on GPUs, the 1080 Ti is also the best of the new cards. You can compare GPU LL rankings here.

If your electricity is free, or you use electric heat in the winter, then that changes the game a little bit. Old Nvidia Fermi GPUs are very fast at TF, and you can often find GTX 580s for cheap. The original Titan is also great for LL.

If you're on a budget, the RX 470 isn't bad. If you want to run multiple GPUs though, and have to pay for supporting hardware like power supplies and PCIe slots, going with 1080 or 1080 Ti will be cheapest.
Old 2017-05-18, 13:40   #3
fivemack
(loop (#_fork))
 
 
Feb 2006
Cambridge, England

6,323 Posts

LL on GPUs is strictly inferior to LL on second-hand Sandy Bridge Xeons - you can buy two whole boxes for the price of a GTX1080 card, they use only a bit more electricity and they LL significantly faster. I assure you I'm not being paid to recommend http://www.bargainhardware.co.uk/qua...gure-to-order/

(OK, they are not great things to share a room with, I am lucky enough to be able to confine them to an outbuilding)
Old 2017-05-19, 09:06   #4
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

2²×7×11×29 Posts

Quote:
Originally Posted by fivemack View Post
LL on GPUs is strictly inferior to LL on second-hand Sandy Bridge Xeons
Nope...
Old 2017-05-19, 16:45   #5
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

672 Posts

Quote:
Originally Posted by LaurV View Post
Nope...
Example: I acquired an HP Z600 dual-Xeon 5650 (2×6 cores, 2.66 GHz, Sandy Bridge) for $260 including shipping. What sub-$300 GPU outruns it?
Old 2017-05-19, 21:18   #6
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

2·31·47 Posts

Quote:
Originally Posted by VBCurtis View Post
Example: I acquired an HP Z600 dual-Xeon 5650 (2×6 cores, 2.66 GHz, Sandy Bridge) for $260 including shipping. What sub-$300 GPU outruns it?
A pair of used GTX 590s. They would use more power, though!
Old 2017-05-19, 23:14   #7
airsquirrels
 
 
"David"
Jul 2015
Ohio

11×47 Posts

According to the mersenne.org benchmarks, a core of the 5650@2.67GHz is good for either 56 or 66ms/iteration at 4M FFT. I will give the benefit to the 5650 and say 56, giving 17.86 iterations/second. For ease of comparison, I will make iters/second@4M FFT my metric of performance.

Even though it isn't usually accurate, let's say we can scale all six cores perfectly and achieve 107.14 iters/s per chip. That would give your dual X5650 214.28 iters/s best case. (On a 5930K Haswell 6-core chip, per-iteration time gets about 22% worse going from 1 to 6 cores, so 214.28 / 1.22 = 175.64 iters/s is probably more likely for the pair of X5650s.)

This is a 6 core chip with a 95W TDP, and the $269.99 system on eBay uses a 650W power supply. HP itself estimates your total system TDP at max load is 282W (Seems reasonable).

771408 iters/h for 282W ~= 2735.5 iterations per Watt.

There is a pair of Titan Blacks on eBay right now for $500, so $250 each. Timings at 4M are 2.56 ms/iter, or 390 iters/sec. Power consumption is a full 250W, plus a host system of around 100W. You will probably need to spend at least $200 on that host system too, but why not buy the $269 Xeons above and use one PCIe slot to double your throughput with a GPU?

An RX 480 runs 3.657 ms/iter @4M (273 iters/sec). The RX 580 is a bit faster (~294 iters/sec) and sells for ~$290 new, or about $200-250 on eBay. It will pull about 165 watts.

A Fury X achieves 2.2 ms/iter @4M (454.5 iters/sec) at 250W and sells for about $630. An R9 Nano achieves 2.7 ms/iter (370 iters/sec) at about 180W and sells for $420.

Dual Sandy Bridge Xeons:
Initial Cost: $270
Est. cost to run for 1 year: (282W * 24 * 365.25 = 2472 kWh @ $0.10): $247.20
Iterations (Ideal): ~6,762,000,000 ~= 91.4 74M Exponents
Cost per 74M exponent: $5.66
Incremental cost per exponent: $2.70

All GPU Host System initial cost (No cycles contributed): $250

Titan Black:
Initial Cost: $250 + $250 = $500
Est. cost to run for 1 year: ((250W+100W host) * 24 * 365.25 = 3068 kWh @ $0.10): $306.80
Iterations (Real): ~12,307,464,000 ~= 166.32 74M Exponents
Cost per 74M exponent: $4.85
Incremental cost per exponent: $1.84

RX580
Initial Cost: $290 + $250 = $540
Est. cost to run for 1 year: ((165W+100W host) * 24 * 365.25 = 2323 kWh @ $0.10): $232.30
Iterations (Real): ~9,277,934,400 ~= 125.38 74M Exponents
Cost per 74M exponent: $6.16
Incremental cost per exponent: $1.85

Fury X: 193.8 74M exponents, $6.12/per, $1.58 incremental.
R9 Nano: 157.8 74M exponents, $5.80/per, $1.55 incremental.

So it all depends on how cheap your power is, and how long you plan to operate the hardware. Note that I was intentionally biased towards the Sandy Bridge (Top performance numbers, minimal power specs). In reality you could put 3-4 GPUs in one 100W host and drop your incremental $/exponent down to $1.14 or so.
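
The per-exponent figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not the method actually used for the post: the function name `yearly_costs` is made up for illustration, and treating a 74M exponent as 74,000,000 iterations is an assumption inferred from the posted totals. The prices, wattages, iteration rates and the $0.10/kWh rate are taken from the post.

```python
HOURS_PER_YEAR = 24 * 365.25      # 8766 hours, as in the post
USD_PER_KWH = 0.10                # electricity price used in the post
ITERS_PER_EXPONENT = 74_000_000   # assumption: one LL test of a 74M exponent ~ 74M iterations


def yearly_costs(hardware_usd, watts, iters_per_sec):
    """Return (exponents per year, total $/exponent, incremental $/exponent)."""
    power_usd = watts / 1000 * HOURS_PER_YEAR * USD_PER_KWH
    exponents = iters_per_sec * 3600 * HOURS_PER_YEAR / ITERS_PER_EXPONENT
    total = (hardware_usd + power_usd) / exponents   # hardware plus one year of power
    incremental = power_usd / exponents              # power only
    return exponents, total, incremental


# (hardware $, watts, iters/sec at 4M FFT); GPU rows include the $250 / 100 W host
RIGS = {
    "Dual Xeon (ideal)": (270, 282, 214.28),
    "Titan Black": (250 + 250, 250 + 100, 390),
    "RX 580": (290 + 250, 165 + 100, 294),
}

for name, (usd, watts, rate) in RIGS.items():
    exponents, total, incremental = yearly_costs(usd, watts, rate)
    print(f"{name}: {exponents:.1f} exponents/yr, "
          f"${total:.2f} each, ${incremental:.2f} incremental")
```

Changing `USD_PER_KWH` or spreading the hardware cost over more than one year shows quickly how sensitive the ranking is to power price and operating lifetime.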

Last fiddled with by airsquirrels on 2017-05-19 at 23:16
Old 2017-05-20, 01:17   #8
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

672 Posts

The sort of awesomeness you posted for this comparison makes me really happy that (1) I asked, and (2) that I'm wrong.

Thank you very much!

p.s. each CPU has 3 channels of DDR3, but I think it's 1333; that's not enough to feed all 6 cores, so I agree that your "ideal" production is an overestimate. I don't run mprime on it, so I don't have my own numbers.

Last fiddled with by VBCurtis on 2017-05-20 at 01:24
Old 2017-05-20, 01:20   #9
ewmayer
2ω=0
 
 
Sep 2002
República de California

5×7×281 Posts

Thanks for the detailed cost analysis, David - tiny quibble, it should be 'iterations per Watt-hour' (or any other power*time unit.)

Quote:
Originally Posted by airsquirrels View Post
According to the mersenne.org benchmarks, a core of the 5650@2.67GHz is good for either 56 or 66ms/iteration at 4M FFT. I will give the benefit to the 5650 and say 56, giving 17.86 iterations/second. For ease of comparison, I will make iters/second@4M FFT my metric of performance.
While we're at it, might as well look at an Intel manycore, since those are being developed in direct response to the market challenge posed by GPUs. On KNL I get ~90 msec/iter @4096K on a single core. Throw in George-level optimizations and figure about the same per-core throughput on the above 5650 as on the KNL. A 64-core version of the latter we can figure ~50x for the total throughput using all cores. You mentioned ~$3000 for the cost of a budget KNL rig - note that is a new rig, not a several-years-down-the-road used one - so the cost-per-FLOP of the hardware is comparable, but the total power draw will be much less than for the 8-10 5650s it would take to match the total FLOPS. I don't know the watts-under-load of your KNL system, but would appreciate if you could take that number and compute the cost/exponent for the KNL.
Old 2017-05-20, 01:49   #10
GP2
 
 
Sep 2003

29·89 Posts

CPU vs GPU? There's a third contender here... I can't resist a little cloud propaganda.

Quote:
Originally Posted by airsquirrels View Post
771408 iters/h for 282W ~= 2735.5 iterations per Watt.
More precisely phrased, it's 2735.5 iterations per watt-hour, or 2.73 M iterations per kW-h

A more realistic estimate, as you noted, would multiply by a factor of (175.64 / 214.28) to account for slowdown when scaling up from using one core to using all six cores.

So 2.24 M iter / kW-h for a 5650@2.67GHz doing 4M FFT.


Meanwhile, a c4.large cloud instance on AWS can do 45 iters/sec for 4M FFT, or 162000 iters per hour at spot prices of around $0.013 / hour (in us-east-2 region), or about 12.4 M iter / dollar. All electricity costs are already indirectly included in this hourly rate and are not separated out and billed to the customer. Unlike the 5650, this scales linearly when you increase the number of instances, because each instance will surely run on a different physical server (there are millions of servers in the cloud).

Doing the math, this means that if your electricity costs more than 18 cents per kW-h, you are better off running six c4.large instances in the cloud instead of that 5650, based on electricity costs alone. In practice, the comparison is even less favorable since you had to shell out $260 upfront to purchase the 5650, versus $0 upfront for the cloud. You'll also incur a small cost for additional air conditioning when running your own hardware.
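
The break-even figure can be checked with a short calculation. The sketch below is illustrative only: the helper names are made up, it uses the 282W system draw and the 175.64 iters/s "realistic" rate quoted earlier in the thread, and it compares electricity cost alone (no hardware purchase, no cooling).

```python
def iters_per_kwh(iters_per_sec, watts):
    """Iterations completed per kWh of electricity on local hardware."""
    return iters_per_sec * 3600 / (watts / 1000)


def iters_per_dollar(iters_per_sec, usd_per_hour):
    """Iterations completed per dollar of cloud instance time."""
    return iters_per_sec * 3600 / usd_per_hour


def break_even_usd_per_kwh(local_iters_per_kwh, cloud_iters_per_dollar):
    """Electricity price at which local power costs as much per iteration as the cloud."""
    return local_iters_per_kwh / cloud_iters_per_dollar


local = iters_per_kwh(175.64, 282)     # dual X5650, realistic scaling
cloud = iters_per_dollar(45, 0.013)    # one c4.large at the quoted spot price
price = break_even_usd_per_kwh(local, cloud)

print(f"local: {local / 1e6:.2f} M iter/kWh")
print(f"cloud: {cloud / 1e6:.2f} M iter/$")
print(f"break-even electricity price: ${price:.3f} per kWh")
```

Above the printed price, each iteration costs more in local electricity than it does at the quoted spot rate; below it, the local box is cheaper to run.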

Of course, spot prices in AWS can fluctuate, but they've held fairly steady between 1.1 cents per hour and 1.3 cents per hour for more than three months now (can't look up historical price data beyond that). This is in the us-east-2 region, other AWS regions are pricier, sometimes much more.

To be sure, 18 cents per kW-h is well above the average for the US, but according to these government figures, electricity is more expensive than that in New England, California, Alaska, and Hawaii, and very close to that in New York state. And according to these other government figures (the chart is from here), electricity is more expensive than that in many European countries, including Belgium, Denmark, Germany, Ireland, Spain, Italy, Netherlands, Austria, Portugal, Sweden, and the UK (recall that 1 EUR = about $1.12 currently). In some cases much more, for instance Denmark at €0.29, or about 32 cents (US$), per kW-h.

VBCurtis's info on the left-hand side indicates a location of California, where power apparently costs 18.68 cents per kW-h. If so, you'd be better off re-selling that 5650 on eBay and getting whatever you can for it, and applying that money to running mprime in the cloud... maybe that's what the person who sold it to you did.
Old 2017-05-20, 02:00   #11
science_man_88
 
 
"Forget I exist"
Jul 2009
Dumbassville

8369₁₀ Posts

Quote:
Originally Posted by ewmayer View Post
(or any other power*time unit.)
or just these:

Quote:
zettaWatt hour = 3.6 yottaJoule
1 yottaJoule = 277.77... exaWatt hours
exaWatt hour = 3.6 zettaJoule
1 zettaJoule = 277.77... petaWatt hours
petaWatt hour = 3.6 exaJoule
1 exaJoule = 277.77... teraWatt hours
teraWatt hour = 3.6 petaJoule
1 petaJoule = 277.77... gigaWatt hours
gigaWatt hour = 3.6 teraJoule
1 teraJoule = 277.77... megaWatt hours
megaWatt hour = 3.6 gigaJoule
1 gigaJoule = 277.77... kiloWatt hours
kiloWatt hour = 3.6 megaJoule
~0.746 kilowatt hour= 1 horsepower hour
1 megaJoule = 2.7777... hectoWatt hours
hectoWatt hour = 360 kiloJoule
~4.184 kiloJoule = 1 nutritional Calorie
~1.055 kiloJoule = 1 BTU
1 kiloJoule = 0.027777... dekaWatt hours
dekaWatt hour = 360 hectoJoule
1 hectoJoule = 0.027777... Watt hours
Watt hour = 360 dekaJoule
~4.184 Joule = 1 non nutritional calorie
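
All of the watt-hour/joule lines in that list follow from 1 watt hour = 3600 joules plus the SI prefixes. A small sketch, with hypothetical helper names, that regenerates any of them:

```python
# Powers of ten for the SI prefixes used in the list ("" means no prefix).
SI_PREFIX = {
    "": 10**0, "deka": 10**1, "hecto": 10**2, "kilo": 10**3,
    "mega": 10**6, "giga": 10**9, "tera": 10**12, "peta": 10**15,
    "exa": 10**18, "zetta": 10**21, "yotta": 10**24,
}
JOULES_PER_WATT_HOUR = 3600  # 1 W for 3600 s


def watt_hours_to_joules(value, wh_prefix="", joule_prefix=""):
    """Convert e.g. kiloWatt hours to megaJoules."""
    joules = value * SI_PREFIX[wh_prefix] * JOULES_PER_WATT_HOUR
    return joules / SI_PREFIX[joule_prefix]


def joules_to_watt_hours(value, joule_prefix="", wh_prefix=""):
    """Convert e.g. gigaJoules to kiloWatt hours."""
    watt_hours = value * SI_PREFIX[joule_prefix] / JOULES_PER_WATT_HOUR
    return watt_hours / SI_PREFIX[wh_prefix]


print(watt_hours_to_joules(1, "kilo", "mega"))   # kiloWatt hour in megaJoules
print(joules_to_watt_hours(1, "giga", "kilo"))   # gigaJoule in kiloWatt hours
```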

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.