mersenneforum.org  

Old 2016-02-02, 23:55   #1
tha
 
 
Dec 2002

787 Posts
Skylake FMA3 round off error

I have my new Skylake at work and it completed its first DC exponent, which was in the 43M range. The system appears to be stable. Before the work started, mprime ran a test to see whether it could be done with an FFT size of 2240K. The round off error was just below the acceptable value. The DC Lucas-Lehmer test completed successfully.
The system then started to work on its second exponent, 43175681. Same procedure, except that halfway in the DC test it started to show a round off error. Now, at 90% work done there are already 10 round off errors.

A stress test that I ran after a few errors had appeared did not reveal any errors.

I am thinking of running a second test on this exponent using FMA2 instead of FMA3 to see if the different rounding in the two methods might be the cause of this.

Any thoughts?
Old 2016-02-03, 05:51   #2
S485122
 
 
Sep 2006
Brussels, Belgium

3²×181 Posts

My experience with a Haswell-E (5820K) is that Prime95 is a bit too aggressive when near the limit for an FFT size. I fiddled with the "SoftCrossoverAdjust" parameter in prime.txt to get around this.

Jacob
Old 2016-02-03, 09:39   #3
ATH
Einyen
 
 
Dec 2003
Denmark

19·157 Posts

I queued it up for a triple check.

FMA3 is part of AVX2. There is nothing called FMA2, the other FFT type in Prime95 is just called AVX.

You can use SoftCrossover=0.1 or SoftCrossover=0 in prime.txt; then it will use a larger FFT size at a lower exponent.


Edit: Reading undoc.txt again it might be better to use SoftCrossoverAdjust=-0.002 or SoftCrossoverAdjust=-0.004 instead of disabling the SoftCrossover feature (with SoftCrossover=0).

Last fiddled with by ATH on 2016-02-03 at 09:44
Old 2016-02-03, 23:07   #4
tha
 
 
Dec 2002

1100010011₂ Posts

Quote:
Originally Posted by ATH View Post
I queued it up for a triple check.

FMA3 is part of AVX2. There is nothing called FMA2, the other FFT type in Prime95 is just called AVX.

You can use SoftCrossover=0.1 or SoftCrossover=0 in prime.txt; then it will use a larger FFT size at a lower exponent.


Edit: Reading undoc.txt again it might be better to use SoftCrossoverAdjust=-0.002 or SoftCrossoverAdjust=-0.004 instead of disabling the SoftCrossover feature (with SoftCrossover=0).
I have set the values in prime.txt to:

SoftCrossover=0.3
SoftCrossoverAdjust=-0.020

Let me see how this works out

The exponent 43175683 is now being done with FFT size 2304K instead of 2240K. This bigger size turns out to be about 4-5% faster than the smaller one. Maybe there is some interaction with the memory speed.
Old 2016-02-04, 00:55   #5
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by tha View Post
The exponent 43175683 is now being done with FFT size 2304K instead of 2240K. This bigger size turns out to be about 4-5% faster than the smaller one. Maybe there is some interaction with the memory speed.
George, might this merit some tweaks to Prime95?
Old 2016-02-04, 02:09   #6
ewmayer
2ω=0
 
 
Sep 2002
República de California

3·7·19·29 Posts

My FFT-error-modeling-based length-setting function suggests a maxP in the range 43005794-43235170 for 2240K, with 'where in the range' depending on the details of the short-length DFT algos needed to build up that FFT length (2240K = 2^16·5·7), and on whether the FMA versions of same give greater or lesser accuracy than the non-FMA ones. p = 43175683 is definitely pushing things.
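As a rough sanity check of the figures above, here is a minimal Python sketch. It is not taken from Prime95 or Mlucas; the helper names `fft_length` and `bits_per_word` are made up for illustration, and "exponent divided by FFT length" is only the usual back-of-the-envelope measure of how heavily each FFT word is loaded, not the error model described in the post.

```python
# Illustrative only: helper names are hypothetical, not part of any GIMPS client.

def fft_length(kilo_words: int) -> int:
    """Convert an FFT size quoted in K (e.g. 2240K) to a word count."""
    return kilo_words * 1024


def bits_per_word(exponent: int, kilo_words: int) -> float:
    """Average number of bits of the Mersenne residue carried per FFT word."""
    return exponent / fft_length(kilo_words)


# The factorisations of the two FFT lengths discussed in the thread.
assert fft_length(2240) == 2**16 * 5 * 7
assert fft_length(2304) == 2**18 * 9

# maxP range quoted in the post for 2240K, and the exponent under discussion.
low, high, p = 43005794, 43235170, 43175683

print(f"2240K limit range: {bits_per_word(low, 2240):.3f}"
      f" to {bits_per_word(high, 2240):.3f} bits per word")
for k in (2240, 2304):
    print(f"p = {p} at {k}K: {bits_per_word(p, k):.3f} bits per word")
```

Moving to 2304K drops the load by roughly half a bit per word, which is why the larger size has more headroom against round off errors.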

In my own FMA3 code deployment I found that certain DFT radices showed surprisingly higher ROE levels when coded in the obvious fashion, and much of this was due to certain common arithmetic combinations (such as the twiddle-multiply by the odd-indexed 8th roots of unity, (±1±I)/sqrt(2) in the radix-8 DFT) when coded up to take advantage of FMA. My workaround for these was based on a lot of experimentation with 'strategic wrong-way rounding' of various key arithmetic constants, i.e. round the LSB float64 bit in the opposite direction of that indicated by the quad-float high-precision version of the same constant. The current state of my various sensitivity analyses here is as much voodoo as rigorous science, however.
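To make the idea of "wrong-way rounding" concrete, the following sketch computes the correctly rounded float64 value of 1/sqrt(2) and its neighbour on the opposite side of the exact value. It assumes Python 3.9 or later (for `math.nextafter`). The function name `nearest_and_wrong_way` is hypothetical, and this only shows what the two candidate constants are; it does not reproduce the experiments described in the post or say which constants benefit.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # far more digits than float64 can hold

# High-precision reference value of 1/sqrt(2), i.e. sqrt(1/2).
EXACT = Decimal("0.5").sqrt()


def nearest_and_wrong_way(exact: Decimal) -> tuple[float, float]:
    """Return (correctly rounded float64, adjacent float64 on the other side)."""
    nearest = float(exact)  # conversion rounds to the nearest float64
    if Decimal(nearest) > exact:
        # Nearest value lies above the exact one, so the wrong-way
        # candidate is the next float64 below it.
        other = math.nextafter(nearest, -math.inf)
    else:
        other = math.nextafter(nearest, math.inf)
    return nearest, other


nearest, wrong_way = nearest_and_wrong_way(EXACT)
print("nearest  :", nearest.hex(), "error", Decimal(nearest) - EXACT)
print("wrong-way:", wrong_way.hex(), "error", Decimal(wrong_way) - EXACT)
```

The two values differ only in the least significant bit of the mantissa, and their errors have opposite signs, which is the degree of freedom being exploited.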

Last fiddled with by ewmayer on 2016-02-04 at 02:11
Old 2016-02-04, 09:13   #7
tha
 
 
Dec 2002

787 Posts

Quote:
Originally Posted by Dubslow View Post
George, might this merit some tweaks to Prime95?
With another machine I noticed this before as well: sometimes a larger FFT is faster than a smaller one. It would be worthwhile to extend the benchmark routine (or introduce a separate one) so that it successively tests each FFT size for speed and then disables, in software, any size that is slower than a bigger one.
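The pruning step suggested here could look roughly like the sketch below. It is not Prime95 code: the function name `usable_fft_sizes` is hypothetical and the timings are invented numbers chosen to mimic the 2240K versus 2304K observation in this thread.

```python
def usable_fft_sizes(timings: dict[int, float]) -> list[int]:
    """Keep an FFT size only if it is strictly faster than every larger size.

    timings maps FFT size (in K) to a measured time per iteration.
    """
    keep = []
    fastest_larger = float("inf")
    # Walk from the largest size down, remembering the best time seen so far.
    for size in sorted(timings, reverse=True):
        if timings[size] < fastest_larger:
            keep.append(size)
            fastest_larger = timings[size]
    return sorted(keep)


# Made-up benchmark results in ms per iteration, for illustration only.
benchmark = {2048: 4.10, 2240: 4.75, 2304: 4.55, 2560: 5.05}
print(usable_fft_sizes(benchmark))
```

With these invented timings 2240K is dropped, because 2304K is both larger (so safer against round off) and faster.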

Also, I have a confession to make. The errors started showing up after I closed the cabinet. The GTX580 in it has a power connector that presses against the cover. That may very well have exerted too much pressure on the motherboard. I am not willing to close the cabinet again until I have obtained or custom-made a new power cable.

ATH, are you doing this exponent on a Skylake, or a less recent machine?
Old 2016-02-04, 09:19   #8
ATH
Einyen
 
 
Dec 2003
Denmark

19·157 Posts

Quote:
Originally Posted by tha View Post
ATH, are you doing this exponent on a Skylake, or a less recent machine?
I'm doing it on my Titan Black. It will start in about 7 hours and take 20-30 hours, depending on how much I use my computer; I have a few days off.
Old 2016-02-04, 18:59   #9
tha
 
 
Dec 2002

787 Posts

I now believe the cause of the errors was the power cable of the GPU exerting pressure on the motherboard. I took the tie wraps off the cable and it now bends easily against the side panel.

Can I have this one faulty result taken from the machine's track record?
Old 2016-02-04, 22:52   #10
Madpoo
Serpentine Vermin Jar
 
Madpoo's Avatar
 
Jul 2014

CCD₁₆ Posts

Quote:
Originally Posted by tha View Post
I now believe the cause of the errors was the power cable of the GPU exerting pressure on the motherboard. I took the tie wraps off the cable and it now bends easily against the side panel.

Can I have this one faulty result taken from the machine's track record?
Probably not... EDIT: I pulled up some wrong results... hang on...

Okay, after I looked up the correct user's info... your stats for all of your systems combined are:
1 suspect result (M43175681) ... currently unknown if it's bad until a triple-check...hang in there!
329 good
0 bad
44 unknown

When I do my own statistical analysis of systems, I break down a machine's results by user, cpu, year, and app version. So if that does end up being bad, it will only affect that one cpu and app version for any 2016 results. And I only use that info to guess at tie-breakers for mismatched results. I currently consider any system with zero bad and >= 15 good to be the winner, and that's really all the guessing involved.
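The tie-breaker rule described above can be sketched as follows. This is an illustration only, not the actual server-side analysis: `TrackRecord`, `is_trusted` and `likely_winner` are hypothetical names, and returning no guess when both or neither system qualifies is an assumption, since the post does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrackRecord:
    """Verified results for one user / cpu / year / app version bucket."""
    good: int
    bad: int


def is_trusted(record: TrackRecord, min_good: int = 15) -> bool:
    """Rule from the post: zero bad results and at least 15 good ones."""
    return record.bad == 0 and record.good >= min_good


def likely_winner(a: TrackRecord, b: TrackRecord) -> Optional[str]:
    """Guess which of two mismatched results is correct, or None if unclear."""
    a_ok, b_ok = is_trusted(a), is_trusted(b)
    if a_ok and not b_ok:
        return "a"
    if b_ok and not a_ok:
        return "b"
    return None  # assumption: no guess when both or neither qualify


# Example with the totals quoted in the post versus a system with one bad result.
print(likely_winner(TrackRecord(good=329, bad=0), TrackRecord(good=40, bad=1)))
```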

Last fiddled with by Madpoo on 2016-02-04 at 22:58
Old 2016-02-05, 03:47   #11
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

2²×2,239 Posts

Quote:
Originally Posted by tha View Post
Can I have this one faulty result taken from the machine's track record?
Yes, Madpoo can easily do that, after he clears out all the bad results I had between January and March 2012 when I was testing the new cudalucas (switching from powers-of-two FFT to non-powers-of-two FFT). Didn't I say that many times?

The rule was that bad results stay bad, for whatever reason. Otherwise we completely mess up the statistics... Are we doing statistics or not?

Last fiddled with by LaurV on 2016-02-05 at 03:47

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.