mersenneforum.org > Great Internet Mersenne Prime Search > Data > Marin's Mersenne-aries

2022-11-13, 18:19   #848
Uncwilly
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
10101010001111₂ Posts

Can you at least submit all of your "bad" results? That way if someone runs it and it matches one of those, it will be complete. I might queue it up on Prime95 on a machine that will take about 85 days to run it.

2022-11-13, 18:42   #849
LaurV
Romulan Interpreter
"name field"
Jun 2011
Thailand
10100000101001₂ Posts

Neither of you got it. Read my post again. It is not about this particular exponent, nor about using cudaLucas in the future. We know it is slower. And now I have proved it is buggy too. Not the first time I did that either (see 2012).
I don't know if other FFTs are affected.
There may be.
Therefore, there may be exponents which were both LL and DC with cudaLucas (the server accepted such results, with different shifts) and the residues matched, yet they are wrong. It is not about "fixing" cudaLucas either, as long as we have gpuOwl and PRP with certs. But such exponents, if they exist, we need to find and retest. If there are too many to re-test "in bulk", then we need to debug cudaLucas to see which FFTs are affected, which versions are affected, etc., to eventually reduce the list.
I would be quite happy to be wrong, with no test affected.
But putting my nose into cudaLucas internals (FFT) is not what I can do. What I can do is isolate the point where the residues start differing and make a checkpoint file close to it. Then I can pass that to somebody who knows the trade (George, Mihai, Ernst, etc).
The bug can be reproduced with the Colab script from Teal/Daniel on A100 and V100.
Going to bed, 1:45 AM here. Need to work today, too, in a few hours...

Last fiddled with by LaurV on 2022-11-13 at 18:45

2022-11-13, 18:51   #850
Uncwilly
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
5·2,179 Posts

Spin up a thread about this in either GPU computing or in Software.

I understand that there is an issue. But getting a sanity check via Prime95 should show what the right result is. That can point to the answer WRT the FFT issue.

2022-11-13, 18:56   #851
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
16316₈ Posts

PMed James & George asking for the CUDALucas-both-times list if feasible. (Would require reporting program to be stored in the database per LL result reported, or a sufficient set of clues to deduce it.)

And all the more reason to DC LL via PRP/GEC/proof generation and upload and cert.


@LaurV, if the deviating residues at 20000K fft are reproducible in CUDALucas, please isolate the divergence to the granularity gpuowl accepts for logging intervals (10,000 iterations) or finer.
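As a rough sketch of that isolation step: assuming both runs logged 64-bit interim residues at the same iteration interval, a linear scan of the two logs gives the window that contains the first bad iteration. The function names and the array layout are hypothetical, not taken from CUDALucas or gpuowl.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a[i] and b[i] are the interim residues that two runs
   logged at iteration (i + 1) * interval. Returns the index of the first
   mismatching entry, or n if the logs agree everywhere. */
size_t first_divergence(const uint64_t *a, const uint64_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return i;
    return n;
}

/* Iteration window [*lo, *hi] that must contain the first bad iteration,
   given the index returned by first_divergence(). */
void divergence_window(size_t idx, uint64_t interval,
                       uint64_t *lo, uint64_t *hi)
{
    *lo = (uint64_t)idx * interval + 1;    /* just after the last matching entry */
    *hi = ((uint64_t)idx + 1) * interval;  /* iteration of the first mismatch */
}
```

Rerunning only that window from the last matching checkpoint with a finer logging interval would narrow it further.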

Another possibility is a bug in the NVIDIA CUDA dlls. The Pentium FDIV bug, a flaw in the divider's lookup table, went undetected for a long time and was operand dependent.

Last fiddled with by kriesel on 2022-11-13 at 19:49

2022-11-13, 20:06   #852
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3×1,229 Posts

LaurV, if you are interested in trying to reproduce the problem on other exponents, you could try the 20000K fft in CUDALucas on M332196607, for which I have a Jacobi-checked, matching final residue and a full log of interim residues at 50K-iteration spacing. I could probably round up a few others from gpuowl logs.

2022-11-13, 20:59   #853
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
5·23·71 Posts

Quote:
Originally Posted by LaurV
But putting my nose into cudaLucas internals (FFT) is not what I can do.

Quote:
Originally Posted by kriesel
Another possibility is a bug in the NVIDIA CUDA dlls.

IIRC, there are no CudaLucas FFT internals, simply a call to the CUDA FFT library. That doesn't mean the bug isn't in CudaLucas; there is the weighting and carry propagation code to consider.

P.S. The database knows which LL results were produced by CudaLucas. I'd be extremely surprised if shift count doesn't protect GIMPS from a bad result getting flagged as DCed.
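To illustrate how the shift count works, here is a minimal sketch of a shifted Lucas-Lehmer test for small exponents (p ≤ 31, so everything fits in 64-bit integers). It is not the Prime95 or CUDALucas implementation, and `ll_residue` is a made-up name. It only shows that runs with different starting shifts handle different bit patterns at every step yet end with the same unshifted residue.

```c
#include <stdint.h>

/* Lucas-Lehmer residue of M_p = 2^p - 1 for an odd prime p <= 31, computed
   with the working value multiplied by 2^shift.
   Returns the final residue with the shift removed; 0 means M_p is prime. */
uint64_t ll_residue(unsigned p, unsigned shift)
{
    const uint64_t m = (1ULL << p) - 1;   /* the Mersenne number M_p */
    unsigned k = shift % p;               /* current shift count, 0 <= k < p */
    uint64_t x = (4ULL << k) % m;         /* s_0 = 4, stored as 4 * 2^k */

    for (unsigned i = 0; i + 2 < p; i++) {   /* p - 2 iterations */
        x = (x * x) % m;                  /* squaring doubles the shift */
        k = (2 * k) % p;                  /* reduce: 2^p = 1 (mod M_p) */
        uint64_t two = (2ULL << k) % m;   /* the constant 2, shifted by k */
        x = (x + m - two) % m;            /* s_{i+1} = s_i^2 - 2, shifted */
    }
    /* Undo the shift: 2^(p - k) is the inverse of 2^k modulo M_p. */
    return (x * ((1ULL << (p - k)) % m)) % m;
}
```

With this sketch, `ll_residue(13, k)` is 0 for every shift k because M13 is prime, and `ll_residue(11, k)` gives the same nonzero value for every k because M11 = 23 × 89 is composite. A computation error is unlikely to corrupt two differently shifted runs in a way that leaves their unshifted residues equal, which is the protection the shift count is meant to provide.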

Last fiddled with by Prime95 on 2022-11-13 at 21:02

2022-11-13, 21:31   #854
dcheuk
Jan 2019
Florida
35 Posts

Quote:
Originally Posted by Uncwilly
This is for the list of needed triple (and higher order) checks.

Code:
Exponents with 2 Unverified results:

Cat 1
DoubleCheck=62858629,74,1
DoubleCheck=62871643,75,1

queued, mprime won't let me reserve these

2022-11-13, 21:37   #855
ATH
Einyen
Dec 2003
Denmark
19×181 Posts

Quote:
Originally Posted by Prime95
P.S. The database knows which LL results were produced by CudaLucas. I'd be extremely surprised if shift count doesn't protect GIMPS from a bad result getting flagged as DCed.

How many exponents where all tests, 2 or more, were done with CUDALucas?

2022-11-13, 21:55   #856
sdbardwick
Aug 2002
North San Diego County
2²·3·67 Posts

Quote:
Originally Posted by dcheuk
queued, mprime won't let me reserve these

Is the machine you tried to reserve them from qualified to do Cat 1 exponents?


I took these:
Cat 2
DoubleCheck=63563179,75,1
DoubleCheck=67467457,75,1

2022-11-13, 21:57   #857
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1110011001110₂ Posts

Quote:
Originally Posted by ATH
How many exponents where all tests, 2 or more, were done with CUDALucas?

TBD. As indicated earlier, I asked James and George by PM to query the database for the list. It's Sunday afternoon in North America. Please be patient.


One of the issues with CUDALucas is the absence of either readback or error/success value checking after some CUDA library calls. As one example:
cudaMemcpy performs copies between host and gpu memory. https://developer.download.nvidia.co...2e9930741.html
Returns: cudaSuccess, cudaErrorInvalidValue, cudaErrorInvalidDevicePointer, cudaErrorInvalidMemcpyDirection
Gpuowl copies host > gpu, then gpu > host, and does a compare on the host to verify correctness of the gpu copy.
CUDALucas does not do readback and, IIUC, does not check for success or error return values from the call either.
In CUDALucas.cu routine void write_gpu_data(int q, int n),
Code:
  // Square kernel data
  for (j = (n >> 2) - 1; j > 0; j--) s_ct[j] = 0.5 * cospi (j * d);
  cudaMemcpy (g_ct, s_ct, sizeof (double) * (n / 4), cudaMemcpyHostToDevice);
then continues on without checking for success or errors, as if such things could never happen.
It does this for most calls, for speed, yet is slower than gpuowl on the same hardware and inputs.

Similarly, in the LL Iteration loop,
Code:
  cufftExecZ2Z (g_plan, (cufftDoubleComplex *) g_x, (cufftDoubleComplex *) g_x, CUFFT_INVERSE);
Gpu to host transfer is sometimes checked:
Code:
  if (error_flag & 3)
  {
    err = cutilSafeCall1 (cudaMemcpy (&terr, g_err, sizeof (float), cudaMemcpyDeviceToHost));
    if(terr > *maxerr) *maxerr = terr;
    //if( g_pf && g_sl) usleep(g_sv);//, nanosleep sleep(1);
  }
  else if (g_pf && (iter % g_po) == 0)
  {
    err = cutilSafeThreadSync();
    //if(g_sl) usleep(g_sv);//, nanosleep sleep(1);
  }
  if(err != cudaSuccess) terr = -1.0f;
  return (terr);
}
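The readback-and-compare pattern described above can be sketched in plain C without a GPU. Everything here is a stand-in: `copy_fn` plays the role of a status-returning copy such as cudaMemcpy, and the status codes, the three copy routines and `verified_upload` are hypothetical names for illustration, not CUDALucas or gpuowl code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes standing in for cudaError_t values. */
typedef enum { COPY_OK = 0, COPY_FAILED = 1 } copy_status;

/* Shape of a status-returning copy routine (the role cudaMemcpy plays). */
typedef copy_status (*copy_fn)(void *dst, const void *src, size_t bytes);

/* Well-behaved stand-in: copies the data and reports success. */
copy_status copy_good(void *dst, const void *src, size_t bytes)
{
    memcpy(dst, src, bytes);
    return COPY_OK;
}

/* Faulty stand-in: reports success but silently alters the first byte. */
copy_status copy_corrupting(void *dst, const void *src, size_t bytes)
{
    memcpy(dst, src, bytes);
    if (bytes > 0)
        ((unsigned char *)dst)[0] += 1;
    return COPY_OK;
}

/* Faulty stand-in: copies nothing and reports an error. */
copy_status copy_failing(void *dst, const void *src, size_t bytes)
{
    (void)dst; (void)src; (void)bytes;
    return COPY_FAILED;
}

/* Upload host data, check the status, read it back into scratch, compare.
   Returns 0 on success, -1 if a copy reported an error,
   -2 if the data read back differs from what was sent. */
int verified_upload(copy_fn copy, void *device, const void *host,
                    void *scratch, size_t bytes)
{
    if (copy(device, host, bytes) != COPY_OK)
        return -1;
    if (copy(scratch, device, bytes) != COPY_OK)
        return -1;
    return memcmp(host, scratch, bytes) == 0 ? 0 : -2;
}
```

In real CUDA code the same shape would apply with cudaMemcpy and a comparison against cudaSuccess. The extra round trip costs time, so it is a trade-off rather than a free check.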

Last fiddled with by kriesel on 2022-11-13 at 22:00

2022-11-13, 22:51   #858
sdbardwick
Aug 2002
North San Diego County
2²×3×67 Posts

Quote:
Originally Posted by ric
Requesting TC (as before, LL-only, please) for:

Code:
DoubleCheck=68891453,75,1
TIA

We matched.
All times are UTC.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.