mersenneforum.org  

Old 2017-06-25, 18:15   #1
GP2
 
GP2's Avatar
 
Sep 2003

101000011001₂ Posts
Hyperthreading broken in Skylake and Kaby Lake?

The source of this information is this post in the Debian mailing list.

There are reddit threads in r/linux, r/hardware, r/intel, and r/programming.

Anyone heard about this?

PS,
The Skylake processors on Google Compute Engine are model 85, stepping 3. I have turned on HyperthreadLL=1 for mprime and run about a dozen successful double checks, and am currently running several instances of Mlucas with hyperthreading (using -cpu 0:1 on a single-core virtual machine), but they haven't run to completion yet.

For the cloud, I'd probably keep hyperthreading on despite the risk, because there are no consequences to data corruption in a virtual machine other than a bad LL result, and on GCE (unlike Amazon AWS) the benchmarks are considerably faster with hyperthreading enabled.
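For anyone wanting to check whether a given machine falls in the affected range: the Debian advisory identifies the buggy parts by the CPU family/model/stepping reported in /proc/cpuinfo. A minimal sketch follows; the model/stepping values in the table are my reading of that advisory (Skylake models 78/94 stepping 3, Kaby Lake models 142/158 stepping 9), so verify them against the original post before relying on this:

```python
# Sketch: flag CPUs matching the family/model/stepping combinations named
# in the Debian advisory. The AFFECTED table is an assumption taken from
# that advisory, not an authoritative list -- check the original post.
AFFECTED = {
    # (family, model, stepping): description
    (6, 78, 3): "Skylake-U/Y",
    (6, 94, 3): "Skylake-H/S",
    (6, 142, 9): "Kaby Lake-U/Y",
    (6, 158, 9): "Kaby Lake-H/S",
}

def parse_cpuinfo(text):
    """Extract (family, model, stepping) from /proc/cpuinfo-style text."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, val = line.partition(":")
            fields[key.strip()] = val.strip()
    return (int(fields["cpu family"]), int(fields["model"]),
            int(fields["stepping"]))

def is_affected(text):
    return parse_cpuinfo(text) in AFFECTED

# Sample in the style of /proc/cpuinfo (on a real Linux box you would read
# the file itself):
sample = """\
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 94
model name : Intel(R) Core(TM) i7-6700K
stepping : 3
"""
print(is_affected(sample))  # prints True for this model-94, stepping-3 sample
```

The GCE model-85 parts mentioned above are not in the advisory's list, so this check would report them unaffected.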

Last fiddled with by GP2 on 2017-06-25 at 18:29
Old 2017-06-25, 23:55   #2
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

2×13×449 Posts

Quote:
Originally Posted by GP2 View Post
For the cloud, I'd probably keep hyperthreading on despite the risk, because there are no consequences to data corruption in a virtual machine other than a bad LL result, and on GCE (unlike Amazon AWS) the benchmarks are considerably faster with hyperthreading enabled.
No consequences - other than wasted cycles and $, you mean? (I realize you are likely still using your free-trial-GCE-$ for your testing there, but 'industrial strength' GCE users will be unlikely to share your 'so what' attitude. :) Since it is important to get a sense of frequency of occurrence of the HT bugs, may I presume you are running DCs on GCE?
Old 2017-06-26, 00:06   #3
GP2
 
GP2's Avatar
 
Sep 2003

5·11·47 Posts

Quote:
Originally Posted by ewmayer View Post
No consequences - other than wasted cycles and $, you mean? (I realize you are likely still using your free-trial-GCE-$ for your testing there, but 'industrial strength' GCE users will be unlikely to share your 'so what' attitude. :) Since it is important to get a sense of frequency of occurrence of the HT bugs, may I presume you are running DCs on GCE?
It already did a dozen or so correct DCs using mprime, and from what I gather from reading the reddit threads, the bug is only triggered under rare circumstances. So I am leaving hyperthreading on for Mlucas and we'll see what happens. On GCE (unlike AWS) there would be a major performance impact to turning it off.

What I meant was, with a virtual machine you don't have to worry about data corruption somehow affecting the operating system files, since you can just create a new machine in seconds.

Last fiddled with by GP2 on 2017-06-26 at 00:06
Old 2017-06-26, 01:34   #4
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

2·13·449 Posts

Quote:
Originally Posted by GP2 View Post
It already did a dozen or so correct DCs using mprime, and from what I gather from reading the reddit threads, the bug is only triggered under rare circumstances. So I am leaving hyperthreading on for Mlucas and we'll see what happens. On GCE (unlike AWS) there would be a major performance impact to turning it off.

What I meant was, with a virtual machine you don't have to worry about data corruption somehow affecting the operating system files, since you can just create a new machine in seconds.
Were you using single-threaded mode for your mprime runs? IIRC you got best performance for mprime 1-threaded, vs mlucas 2-threaded. I'm wondering whether having HT enabled but running 1-threaded is less likely to hit the bug than running 2-threaded, i.e. 1 thread for each of the 2 logical cores enabled by HT. In any event, your Mlucas DC results, once they complete, will tell us more.
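On the 1-thread-vs-2-thread question: whether two software threads actually share a physical core depends on which logical CPUs they are pinned to, and Linux exposes that in /sys/devices/system/cpu/cpuN/topology/thread_siblings_list. A small sketch for decoding that file's format (comma-separated values and ranges such as "0-1" or "0,4"); the sysfs path is the standard one, but nothing in this thread confirms how mprime or Mlucas map their threads, so treat this purely as an inspection aid:

```python
def parse_cpu_list(s):
    """Decode a sysfs CPU list such as '0-1' or '0,4' into a list of ints."""
    cpus = []
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus

# On a 2-logical-core GCE instance with HT, cpu0's siblings file typically
# reads "0-1", i.e. both logical CPUs share one physical core:
print(parse_cpu_list("0-1"))   # [0, 1]
print(parse_cpu_list("0,4"))   # [0, 4] -- a common layout on 4-core/8-thread desktops
```

If both of a core's siblings appear in the list, -cpu 0:1 style runs are putting one thread on each hyperthread of the same physical core.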
Old 2017-06-26, 02:08   #5
GP2
 
GP2's Avatar
 
Sep 2003

5×11×47 Posts

Quote:
Originally Posted by ewmayer View Post
Were you using single-threaded mode for your mprime runs? IIRC you got best performance for mprime 1-threaded, vs mlucas 2-threaded. I'm wondering whether having HT enabled but running 1-threaded is less likely to hit the bug than running 2-threaded, i.e. 1 thread for each of the 2 logical cores enabled by HT. In any event, your Mlucas DC results, once they complete, will tell us more.
I used HyperthreadLL=1 for the mprime double-checks.

I recall finding that enabling it made mprime run faster, although it's hard to quantify exactly how much. There is considerably more variability in the benchmarks when running on GCE compared to AWS; I'm not sure why. GCE has the interesting property that they can migrate a running process from one server to another without stopping it, and there might be some heterogeneity in Skylake processor speeds and models for all I know.
