#23
"Philip Rogers"
Feb 2017
San Francisco, CA
2 Posts
Thank you for this awesome guide!
I did some cost analysis of EC2 configurations and found some interesting results:
Here's a link to the raw data: https://docs.google.com/document/d/1...LeHcSwpLk/view
#24
Aug 2006
2²×3×499 Posts
Very cool! Thanks for doing the experiment!
#25
Sep 2003
3·863 Posts
Hey, thanks for giving it a try.
Note: when using this guide, make sure you use the latest version of the user_data_body script (attached at the bottom of this message). Versions prior to 1.07 had a bug that failed to detect some of the existing (already-running) instances, which caused problems. Also, don't forget to edit the script to put in your own FILE_SYSTEM_ID values. $6.21 for a 80M exponent sounds like the right ballpark. I did a rough estimate just now and got about $6.70 for a 79M exponent, assuming 1.25 cents per hour for a c4.large. Of course, you have to search for the region with the cheapest spot prices (hint: the only American state whose name shares no letters with "mackerel"), because spot prices in other regions can be a lot higher, and there seems to be no effective arbitrage mechanism operating to make prices more consistent with one another. When running p2.xlarge, remember that you can simultaneously run mprime on the CPU and CUDALucas on the GPU (running different exponents on each, obviously), and they don't interfere with one another. The user_data_body script lets you do that. So this two-for-one should be factored into the cost calculation. On the other hand, spot prices for p2 instances can fluctuate fairly dramatically, much more so than the c4 instances, so it's hard to get a handle on how much a p2 is really costing you at any given time. You should probably exclude all the t2.* instances from benchmarks. They aren't intended for sustained use, and Amazon will drastically throttle them back after you use more than a certain rather small quota of CPU time per month. So they're basically unusable for number crunching applications. x1.16xlarge instances aren't really suited for LL testing, I use the nearly 1 TB of memory to run humungous GMP-ECM on a few of the 32 cores and mprime ECM on the remaining cores, hunting for additional factors of small exponents just for fun. 
It's hard to do exact benchmarks, because you're sharing physical machines with other AWS users, and their code is running on the other cores of the same physical machine and sometimes competing with you for cache, at least at the higher levels like L3. The mprime/Prime95 program is very sensitive to cache usage, so this can have some effect. However, there is a "dedicated tenancy" option that you can specify for your spot instances, so that you don't share hardware with other AWS users.

When mprime runs on a multi-core AWS instance, it sometimes fails to properly detect which virtual cores share the same physical core (with hyperthreading). It does a runtime check at startup each time, and sometimes this randomly fails to detect the layout of the cores correctly. This is usually only a problem for .2xlarge instances and higher. The solution would be to add the appropriate AffinityScramble2 line (one character per logical CPU) to local.txt (local-init.txt).

For .xlarge instances (4 logical CPUs):
Code:
AffinityScramble2=0213
For instances with 8 logical CPUs:
Code:
AffinityScramble2=04152637
For instances with 16 logical CPUs:
Code:
AffinityScramble2=08192A3B4C5D6E7F
For instances with 36 logical CPUs (c4.8xlarge):
Code:
AffinityScramble2=0I1J2K3L4M5N6O7P8Q9RASBTCUDVEWFXGYHZ
For instances with 32 logical CPUs:
Code:
AffinityScramble2=0G1H2I3J4K5L6M7N8O9PAQBRCSDTEUFV
For instances with 64 logical CPUs (x1.16xlarge):
Code:
AffinityScramble2=0W1X2Y3Z4a5b6c7d8e9fAgBhCiDjEkFlGmHnIoJpKqLrMsNtOuPvQwRxSyTzU(V)

The AffinityScramble2 lines will be obsolete in the next version 29.1 of mprime/Prime95, which will use improved code to automatically detect the layout of the cores. But if running version 28.10, you could try running the benchmarks with and without the AffinityScramble2 line and see if it makes a difference. The comparison is complicated by the fact that when the AffinityScramble2 line is omitted, sometimes the layout of the cores is correctly determined and sometimes it isn't, and it seems to be somewhat random. So you'd have to pay attention to the output at startup to see whether it did or didn't. If your benchmarks can confirm the effectiveness of AffinityScramble2 I can add it to the guide. It seemed to help when I ran it, but I just did a quick visual check rather than a proper benchmark.

Last fiddled with by GP2 on 2017-03-04 at 22:12 Reason: delete paragraphs which misread spot prices in the data table
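The pattern in those strings can be generated mechanically. Here is a small Python sketch (my reconstruction, not part of the guide) that assumes hyperthread siblings are logical CPUs i and i+n/2, and that the character alphabet is 0-9, A-Z, a-z, with "(" and ")" standing in for CPUs 62 and 63:

```python
import string

# Character alphabet AffinityScramble2 appears to use: one character per
# logical CPU, with "(" and ")" for CPUs 62 and 63 (assumption).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase + "()"

def affinity_scramble(n_logical_cpus):
    """Interleave each core with its hyperthread sibling (i and i + n/2).

    Assumes the OS numbers HT siblings n/2 apart, as the /proc/cpuinfo
    dumps for c4.8xlarge and x1.16xlarge elsewhere in this thread show.
    """
    half = n_logical_cpus // 2
    return "".join(ALPHABET[i] + ALPHABET[i + half] for i in range(half))

print(affinity_scramble(4))   # 0213
print(affinity_scramble(8))   # 04152637
```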
#26
Sep 2003
2589₁₀ Posts
Here is the /proc/cpuinfo data for c4.8xlarge and x1.16xlarge instances, maybe someone can help verify the AffinityScramble2 lines in the previous message:

c4.8xlarge (36 logical CPUs; processors 18-35 are the hyperthread siblings of 0-17):
Code:
processors  0- 8 : physical id 0, core ids 0-8
processors  9-17 : physical id 1, core ids 0-8
processors 18-26 : physical id 0, core ids 0-8
processors 27-35 : physical id 1, core ids 0-8
x1.16xlarge (64 logical CPUs; processors 32-63 are the hyperthread siblings of 0-31):
Code:
processors  0-15 : physical id 0, core ids 0-15
processors 16-31 : physical id 1, core ids 0-15
processors 32-47 : physical id 0, core ids 0-15
processors 48-63 : physical id 1, core ids 0-15

Last fiddled with by GP2 on 2017-03-04 at 21:49
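A short Python sketch (not from the thread) that condenses raw /proc/cpuinfo output like the above into hyperthread sibling groups, keyed by (physical id, core id):

```python
from collections import defaultdict

def ht_siblings(cpuinfo_text):
    """Group logical processors by (physical id, core id).

    Each group of OS processor numbers shares one physical core, i.e.
    the members are hyperthread siblings.
    """
    groups = defaultdict(list)
    processor = None
    fields = {}
    for line in cpuinfo_text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            processor = int(value)
        elif key in ("physical id", "core id"):
            fields[key] = int(value)
        if processor is not None and len(fields) == 2:
            groups[(fields["physical id"], fields["core id"])].append(processor)
            processor, fields = None, {}
    return dict(groups)

# Tiny synthetic example in /proc/cpuinfo's "key : value" layout:
sample = """\
processor : 0
physical id : 0
core id : 0
processor : 1
physical id : 0
core id : 1
processor : 2
physical id : 0
core id : 0
processor : 3
physical id : 0
core id : 1
"""
print(ht_siblings(sample))  # {(0, 0): [0, 2], (0, 1): [1, 3]}
```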
#27
Just call me Henry
"David"
Sep 2007
Liverpool (GMT/BST)
2³×757 Posts
#28
Einyen
Dec 2003
Denmark
2³·431 Posts
About a week ago I ordered a c4.8xlarge spot instance, and I managed to choose "Dedicated - run a dedicated instance". I did not think much of it, just that it sounded preferable to "shared hardware instance".

Luckily I did it only a few days before the end of the month, so I noticed it on the final bill. The cost is an extra $2 per hour if you have at least one dedicated instance running, over 7 times more than the instance itself costs, and it is not mentioned anywhere on the spot request form; I just went back and checked. That little click cost me an extra ~$300 before I noticed it today, so be careful. It could have been much worse: I saw here that back in 2013 it was lowered from $10 per hour to $2 per hour: https://aws.amazon.com/blogs/aws/ec2...ice-reduction/
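The scale of that surcharge is easy to work out; a sketch, where the c4.8xlarge spot price is an assumed placeholder for illustration:

```python
# The $2/hour "dedicated" fee applies while at least one dedicated
# instance runs; compare it against an assumed c4.8xlarge spot price.
DEDICATED_FEE_PER_HOUR = 2.00
assumed_spot_price = 0.28   # $/hour, placeholder -- spot prices fluctuate

ratio = DEDICATED_FEE_PER_HOUR / assumed_spot_price
hours_for_300 = 300 / DEDICATED_FEE_PER_HOUR   # hours to rack up ~$300 in fees

print(f"fee is {ratio:.1f}x the instance price; ~$300 after {hours_for_300:.0f} h")
```

At that assumed spot price the fee is roughly 7x the instance cost, and ~$300 accrues in about 150 hours (six days), consistent with the numbers in the post.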
#29
Sep 2003
A1D₁₆ Posts
But with spot instances, one user could be running on a physical machine that someone else was running on mere minutes earlier. Maybe the paranoid fear is that the second user could somehow read leftover data in memory or on the local hard drive, or find some way to install an exploit that would let them read the data of subsequent users. Maybe best security practices dictate that some very time-consuming decontamination procedure is needed before that physical machine can be used by others, including reflashing the firmware and reformatting the hard drives. Who knows?

You could try contacting customer support, plead ignorance, and try to get the unexpected charges reversed.
#30
Romulan Interpreter
"name field"
Jun 2011
Thailand
5×11²×17 Posts
"Learn more about RDS reserved instances", ha! Should I click on it? I am a bit afraid it may say something about my mother, or tell me I am futile and send me back to the books...

Edit: I clicked on it! It says "error 404"... (on the link provided by ATH, pricing tab, last paragraph, "learn more about...")... haha... either he has no reserved instances, or they banned him too.

Last fiddled with by LaurV on 2017-04-05 at 05:19
#31
"Curtis"
Feb 2005
Riverside, CA
5725₁₀ Posts
(I laughed out loud at the RDS instances! Thanks)
#32
Einyen
Dec 2003
Denmark
D78₁₆ Posts
They actually did refund all of the $296 I spent on the dedicated hardware, so that is great service.

I did not specifically ask for a refund in my ticket, so maybe that was the correct way to do it. I just said I got this unexpected extra cost (which was over 7 times the actual cost of the instance) and that I think they should add a warning about the extra charge to the spot request form in the future.

Last fiddled with by ATH on 2017-04-09 at 14:41
#33
Sep 2003
3·863 Posts
One additional factor to consider when trying to determine which instance type is most cost-effective:

If your AWS account is more than one year old, then certain charges which were free in your first year become non-free. In particular, each c4 instance of any size (c4.large, c4.xlarge, etc.) uses 8 GB of EBS-backed storage for the root filesystem. In your first year of usage this is free, but after that it is charged at $0.10 per GB-month, or $0.80 per month per instance. This adds up: if you are running 100 instances, you pay an additional $80 per month in total. Unfortunately you can't use less than 8 GB for the root filesystem, and you can't specify an "instance store" AMI to try to avoid the EBS charges, because those aren't compatible with c4 instances.

There are 720 hours in a 30-day month, so if you have a c4.large instance (one core) with a spot price of, say, 1.6 cents per hour, then the additional $0.80 per month is the equivalent of an additional 50 hours per month that you are billed for, on top of the actual 720 hours, or about an additional 7%. If the spot price for a c4.large instance were to fall to 1.0 cents per hour, then that same $0.80 would be the equivalent of an additional 80 hours per month, or approximately an additional 11%. The additional charges are less significant for larger instances: for a c4.xlarge instance (two cores) with a spot price of, say, 3.2 cents per hour, the additional $0.80 per month represents only an additional 3.5%.

So the EBS charges slightly worsen the cost-effectiveness of the one-core c4.large instances versus the two-core c4.xlarge instances. Of course there are other factors as well. The actual spot prices fluctuate, and it will very often not be the case that the two-core c4.xlarge instances cost exactly twice as much per hour as the one-core c4.large instances. And performance-wise, the c4.xlarge will usually have a throughput that is a few percent less than the throughput of two c4.large instances when doing double-check exponents in the 40M range, although the throughputs seem to be nearly equivalent for first-time exponents in the 70M range. So there are multiple factors to consider when deciding whether running c4.large or c4.xlarge instances is more cost-effective at any given time.

Note that anything bigger, from the four-core c4.2xlarge all the way up to the 18-core (not 16-core) c4.8xlarge, is rarely worthwhile, because the price of an N-core instance for larger N will usually be a lot more than N times the cost of a one-core c4.large instance, and the throughput of an N-core instance running mprime will usually be significantly less than the total throughput of N one-core c4.large instances.
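The EBS-overhead percentages above follow from a one-line calculation; a sketch:

```python
# EBS root-volume overhead as a fraction of the spot bill (sketch).
# 8 GB at $0.10 per GB-month = $0.80/month per instance; a 30-day
# month has 720 hours.
EBS_MONTHLY = 8 * 0.10
HOURS_PER_MONTH = 720

def ebs_overhead(spot_price_per_hour):
    """Extra cost from the 8 GB EBS root volume, relative to compute cost."""
    return EBS_MONTHLY / (spot_price_per_hour * HOURS_PER_MONTH)

for price in (0.016, 0.010, 0.032):   # $/hour spot prices from the post
    print(f"{price:.3f}/h -> {ebs_overhead(price):.1%} overhead")
```

This reproduces the ~7%, ~11%, and ~3.5% figures quoted above.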