mersenneforum.org  

2014-08-13, 07:12   #1
TheMawn
May 2013
East. Always East.
11×157 Posts
Lots of roundoff errors

Hey, folks.

I've been getting a number of roundoff errors in a DC. I wouldn't be heartbroken if the test came out bad since it's a smaller exponent (less wasted time) but I'd still like to get to the bottom of this.

I've attached a screenshot of the worker in question.

Here are the parts of results.txt relevant to that exponent.

Code:
[Tue Aug 12 00:32:29 2014]

Trying 1000 iterations for exponent 32562559 using 1680K FFT.
If average roundoff error is above 0.24236, then a larger FFT will be used.
Final average roundoff error is 0.23838, using 1680K FFT for exponent 32562559.

[Tue Aug 12 07:50:20 2014]

Iteration: 6557188/32562559, Possible error: round off (0.4375) > 0.40
Continuing from last save file.
Disregard last error.  Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.

[Tue Aug 12 22:27:40 2014]

Iteration: 7679973/32562559, Possible error: round off (0.4375) > 0.40
Continuing from last save file.

[Wed Aug 13 01:03:36 2014]

Iteration: 7679973/32562559, Possible error: round off (0.4375) > 0.40
Continuing from last save file.
Disregard last error.  Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.

I was having some strange stuttering problems (audio and video) during a Civ5 round with friends. I was going to stop a Prime95 worker, but when I saw all of this crap, I stopped everything altogether and rebooted. All technical issues vanished.

When I started my worker up again, I took the screenshot.

7:50 on Tuesday is some random time of the morning.

22:27 is when we started the round.

1:03 on Wednesday is when I started everything up again.


The part that scares me is "2 roundoff errors of which 1 is repeatable".

EDIT: Now that I think of it, the 22:27 and 1:03 errors are probably the same one. I think I stopped the program before it had the chance to double-check the questionable iteration.
[Attached image: Untitled.png]

Last fiddled with by TheMawn on 2014-08-13 at 07:13

2014-08-13, 07:49   #2
axn
Jun 2003
2×7×389 Posts

Everything is consistent with an exponent right on the crossover point between FFT sizes. Things should be fine the way they are.

EDIT:- Increase your checkpointing frequency so that you lose fewer iterations when restarting.
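
To put a rough number on "right on the crossover point", here is a minimal sketch that computes how many bits of the exponent each FFT word has to carry. It assumes that "1680K" means 1680 × 1024 words; the 2048K size is only a hypothetical larger length chosen for comparison, not necessarily the one prime95 would select. The threshold and measured average are copied from the results.txt excerpt in the first post.

```python
def bits_per_word(exponent: int, fft_k: int) -> float:
    """Average number of exponent bits carried by each FFT word."""
    # Assumption: an "NK" FFT has N * 1024 words.
    return exponent / (fft_k * 1024)


EXPONENT = 32562559    # from the results.txt excerpt
CURRENT_FFT_K = 1680   # from the results.txt excerpt
LARGER_FFT_K = 2048    # hypothetical larger size, for comparison only

for fft_k in (CURRENT_FFT_K, LARGER_FFT_K):
    print(f"{fft_k}K FFT: {bits_per_word(EXPONENT, fft_k):.3f} bits per word")

# Margin between the self-test threshold and the measured average roundoff.
THRESHOLD = 0.24236
MEASURED = 0.23838
print(f"margin below threshold: {THRESHOLD - MEASURED:.5f}")
```

Fewer bits per word generally leaves more headroom against roundoff, at the cost of a slower iteration. The small margin in the self-test is what "on the crossover" looks like in the log.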

Last fiddled with by axn on 2014-08-13 at 07:50

2014-08-13, 16:59   #3
TheMawn
May 2013
East. Always East.
11×157 Posts

Or should I just start over and force a larger FFT...?
[Attached image: Untitled.png]

2014-08-13, 17:16   #4
axn
Jun 2003
2×7×389 Posts

Quote:
Originally Posted by TheMawn
Or should I just start over and force a larger FFT...?

Perhaps that is better. At least you wouldn't have to worry this much.

Last fiddled with by axn on 2014-08-13 at 17:17

2014-08-13, 17:35   #5
kladner
"Kieren"
Jul 2011
In My Own Galaxy!
23656₈ Posts

Quote:
Originally Posted by axn
Perhaps that is better. At least you wouldn't have to worry this much.

I have recently restarted a couple of assignments with a larger FFT, though I did not have the "reproducible error" message. Part of my problem, I think, was voltage instability, which I have been working to address. I am trying the next larger FFT on the rationale that if the hardware is still causing errors, it will deliver a hard roundoff error even with the larger FFT.

The above is a follow-up to trying to balance my set voltage against Load Line Calibration so that it does not dip below a threshold voltage which I consider safe, but does not end up too high under load. This is somewhat complicated, because the load varies widely between idle, P95 running, and P95 plus two hungry GTX 500 series GPUs. I think I have it under control now after considerable testing with various combinations of GPUs and the different Torture Test modes.

I will have to complete a few more assignments (DC) to be really confident about the situation.

Last fiddled with by kladner on 2014-08-13 at 17:36

2014-08-13, 17:47   #6
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2×4,099 Posts

Quote:
Originally Posted by TheMawn
Or should I just start over and force a larger FFT...?

Your roundoff errors of 0.4375 are nowhere near 0.5. Relax.

2014-08-13, 20:27   #7
TObject
Feb 2012
3⁴·5 Posts

Maybe these should be called roundoff warnings. LOL

2014-08-13, 22:16   #8
ewmayer
2ω=0
Sep 2002
República de California
5·2,351 Posts

Quote:
Originally Posted by Prime95
Your roundoff errors of 0.4375 are nowhere near 0.5. Relax.

While 0.4375 is mostly safe (~99% in my experience), if you get multiple such during a test, don't get too relaxed.

George, do you have any large-dataset stats on ROEs at the above level, vs bad-results? A histogram of "number of 0.4375 errors during test vs % of such tests which failed" would be really useful.

2014-08-13, 22:29   #9
TheMawn
May 2013
East. Always East.
1727₁₀ Posts

I think Misters Lucas and Lehmer would be pretty proud of what we're doing. Needing a special method using Sine and Cosine just to square a (big) integer.
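
For anyone curious what "sine and cosine to square an integer" looks like, here is a toy sketch in pure Python. It is not how prime95 does it (prime95 uses a heavily optimized weighted transform); this is a plain zero-padded FFT convolution with a textbook recursive FFT, and the names `fft` and `square_via_fft` and the 8-bit word size are illustrative choices only. It does show where the roundoff error comes from: each convolution output should be an integer, and the distance from the nearest integer is the error that gets reported.

```python
import cmath


def fft(a, invert=False):
    """Textbook recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)  # cos + i*sin twiddle factor
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def square_via_fft(x, bits_per_word=8):
    """Square a non-negative integer; return (x*x, largest roundoff error seen)."""
    base = 1 << bits_per_word
    digits = []
    t = x
    while t:
        digits.append(t % base)
        t //= base
    if not digits:
        digits = [0]
    # Zero-pad to a power of two at least twice the input length.
    n = 1
    while n < 2 * len(digits):
        n *= 2
    padded = [complex(d) for d in digits] + [0j] * (n - len(digits))
    spectrum = fft(padded)
    conv = fft([v * v for v in spectrum], invert=True)
    result = 0
    max_err = 0.0
    for i, v in enumerate(conv):
        real = v.real / n                # normalize the inverse transform
        nearest = round(real)            # the integer the output "should" be
        max_err = max(max_err, abs(real - nearest))
        result += nearest << (bits_per_word * i)  # carries handled by big-int addition
    return result, max_err


x = 2**127 - 1
square, err = square_via_fft(x)
print(square == x * x, err)
```

With small words and a short transform the error is tiny. Packing more bits into each word or lengthening the transform pushes it up, and once it reaches 0.5 the rounding can land on the wrong integer.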

2014-08-13, 22:42   #10
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2·4,099 Posts

Quote:
Originally Posted by ewmayer
George, do you have any large-dataset stats on ROEs at the above level, vs bad-results? A histogram of "number of 0.4375 errors during test vs % of such tests which failed" would be really useful.

No I don't. However, prime95 uses a special method to redo any iteration with an ROE above 0.40625. In effect, prime95 can tolerate ROE up to 0.59375.

2014-08-13, 23:13   #11
ewmayer
2ω=0
Sep 2002
República de California
2DEB₁₆ Posts

Quote:
Originally Posted by Prime95
No I don't. However, prime95 uses a special method to redo any iteration with an ROE above 0.40625. In effect, prime95 can tolerate ROE up to 0.59375.

You can reliably determine if e.g. a 0.4375 is really a 0.5625 which has been NINT-aliased? Do tell - something based on an FFT checksum?
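
A small sketch of the aliasing in question, using made-up values: the reported roundoff is the distance to the nearest integer, so a true error of 0.5625 shows up as 0.4375 and the value rounds to the wrong integer. It also checks the arithmetic behind the two figures quoted above (1 − 0.40625 = 0.59375). It does not show how prime95 tells the two cases apart; the thread does not say, and `observed_roundoff` is only an illustrative helper.

```python
def observed_roundoff(value: float) -> float:
    """Distance from value to its nearest integer, which is what gets reported."""
    return abs(value - round(value))


CORRECT = 1000  # hypothetical exact convolution output

for true_err in (0.4375, 0.5625):
    computed = CORRECT + true_err
    rounded = round(computed)
    print(
        f"true error {true_err}: observed {observed_roundoff(computed)}, "
        f"rounds to {rounded}, correct={rounded == CORRECT}"
    )

# Any true error below 1 - 0.40625 either rounds correctly (below 0.40625)
# or is observed above the 0.40625 redo threshold and triggers a redo.
REDO_THRESHOLD = 0.40625
print(1 - REDO_THRESHOLD)
```

Both cases report the same 0.4375, which is why the observed value alone cannot distinguish them.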
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.