mersenneforum.org  

Old 2022-01-22, 21:51   #1684
Ethan (EO)
 
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996

2×7² Posts
Self-Test Failure / Missed Factors with cl_barrett15_*_gs on Radeon VII (mfakto 0.15pre8)

I'm getting missed factors from the cl_barrett15_*_gs kernels on a Radeon VII with mfakto 0.15pre8 (on Linux).
With mfakto 0.14, the st and st2 self-tests complete without error on the same machine. Let me know if any information beyond what I've included below would be helpful.

Code:
Self-test statistics                                      
  number of tests           335250
  successful tests          331150
  no factor found           4100

self-test FAILED!
Code:
mfakto# cat mfakto_radeonVII_st2.log | grep failed | grep barrett15 | wc
   4100   24600  244885
Code:
ERROR: self-test failed for M60008387 (cl_barrett15_71_gs)
ERROR: self-test failed for M60008387 (cl_barrett15_70_gs)
ERROR: self-test failed for M60008387 (cl_barrett15_69_gs)
ERROR: self-test failed for M60008387 (cl_barrett15_73_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_74_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_71_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_70_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_69_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_73_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_74_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_71_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_70_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_69_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_73_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_74_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_71_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_70_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_69_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_73_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_74_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_71_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_70_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_69_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_73_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_74_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_71_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_70_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_69_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_73_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_74_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_71_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_70_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_69_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_73_gs)
[snipped]
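
For anyone who wants to break the failures down further than the `grep | wc` count above, here is a minimal Python sketch that tallies failed self-tests per kernel and counts the distinct exponents involved. It assumes only the `ERROR: self-test failed for M... (...)` line format shown in this post. The function name `tally_failures` is made up for illustration and is not part of mfakto.

```python
import re
from collections import Counter

# Matches lines such as:
#   ERROR: self-test failed for M60008387 (cl_barrett15_71_gs)
FAILED_RE = re.compile(r"self-test failed for M(\d+) \((\w+)\)")


def tally_failures(lines):
    """Return (failure count per kernel, set of affected exponents)."""
    per_kernel = Counter()
    exponents = set()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            exponents.add(int(match.group(1)))
            per_kernel[match.group(2)] += 1
    return per_kernel, exponents


# A few lines copied from the log excerpt above.
sample = [
    "ERROR: self-test failed for M60008387 (cl_barrett15_71_gs)",
    "ERROR: self-test failed for M60008387 (cl_barrett15_70_gs)",
    "ERROR: self-test failed for M60005497 (cl_barrett15_74_gs)",
    "ERROR: self-test failed for M60005497 (cl_barrett15_71_gs)",
]

per_kernel, exponents = tally_failures(sample)
for kernel, count in sorted(per_kernel.items()):
    print(kernel, count)
print("distinct exponents:", len(exponents))
```

To run it on a full log, pass an open file instead of the sample list, for example `tally_failures(open("mfakto_radeonVII_st2.log"))`.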

Code:
mfakto 0.15pre8 (64-bit build)


Runtime options
  INI file                  mfakto.ini
  Verbosity                 3
  SieveOnGPU                yes
  MoreClasses               yes
  GPUSievePrimes            81157
  GPUSieveProcessSize       24 Kib
WARNING: GPUSieveSize=128M must be a multiple of GPUSieveProcessSize=24k, adjusting GPUSieveSize to 126M
  GPUSieveSize              126 Mib
  FlushInterval             0
  WorkFile                  worktodo.txt
  ResultsFile               results.txt
  Checkpoints               enabled
  CheckpointDelay           300 s
  Stages                    enabled
  StopAfterFactor           class
  PrintMode                 compact
  V5UserID                  EO
  ComputerID                Highland2017
  ProgressHeader            "Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait"
  ProgressFormat            "%d %T | %C %p%% | %t  %e |   %g  %s  %W%%"
  TimeStampInResults        yes
  VectorSize                2
  GPUType                   AUTO
  SmallExp                  no
  UseBinfile                mfakto_Kernels.elf
Compile-time options

Select device - Get device info:
Device 1/1: gfx906 (Advanced Micro Devices, Inc.),
device version: OpenCL 2.0 AMD-APP (3180.7), driver version: 3180.7 (PAL,HSAIL)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_copy_buffer_p2p
Global memory:17163091968, Global memory cache: 16384, local memory: 65536, workgroup size: 256, Work dimensions: 3[1024, 1024, 1024, 0, 0] , Max clock speed:1800, compute units:60

OpenCL device info
  name                      gfx906 (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 2.0 AMD-APP (3180.7) (3180.7 (PAL,HSAIL))
  maximum threads per block 1024
  maximum threads per grid  1073741824
  number of multiprocessors 60 (3840 compute elements)
  clock rate                1800 MHz

Automatic parameters
  threads per grid          0
  optimizing kernels for    GCNF

Loading binary kernel file mfakto_Kernels.elf
Compiling kernels (build options: "-I. -DVECTOR_SIZE=2 -DGCNF -O3 -DMORE_CLASSES -DCL_GPU_SIEVE"). 
	BUILD OUTPUT

 	END OF BUILD OUTPUT

  GPUSievePrimes (adjusted) 81206
  GPUsieve minimum exponent 1037054
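
The WARNING line in the output above shows GPUSieveSize being reduced from 128M to 126M so that it becomes a multiple of GPUSieveProcessSize=24k. The sketch below reproduces that arithmetic under one assumption: the sieve size is stepped down in whole Mib until it divides evenly. This is not taken from the mfakto source, which may compute the adjustment differently, and `adjust_sieve_size` is a hypothetical name.

```python
def adjust_sieve_size(requested_mib: int, process_kib: int) -> int:
    """Largest whole-Mib sieve size <= requested_mib that is a multiple
    of process_kib (assumed rule, not mfakto's actual code)."""
    if requested_mib <= 0 or process_kib <= 0:
        raise ValueError("sizes must be positive")
    for size_mib in range(requested_mib, 0, -1):
        # 1 Mib = 1024 Kib, so compare both sizes in Kib.
        if (size_mib * 1024) % process_kib == 0:
            return size_mib
    raise ValueError("no whole-Mib size is a multiple of the process size")


print(adjust_sieve_size(128, 24))  # 126, matching the WARNING line
```

Because 24 = 3 × 8 and 1024 has no factor of 3, only sieve sizes that are multiples of 3 Mib pass the check with a 24 Kib process size, which is why 128 and 127 are skipped.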
Old 2022-04-19, 20:25   #1685
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

5×499 Posts

Anyone know if mfakto works with Intel Arc GPUs in its current state?
Old 2022-10-18, 13:59   #1686
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

11·373 Posts

Quote:
Originally Posted by ixfd64 View Post
Anyone know if mfakto works with Intel Arc GPUs in its current state?
I'll re-ask this question since Intel Arc is now available in more flavours.
Old 2022-10-18, 15:04   #1687
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

53·59 Posts

No first-hand data, but https://www.techpowerup.com/gpu-specs/arc-a750.c3929 indicates OpenCL 3.0, which may be an issue in gpuowl or mfakto, since OpenCL 3.0 only mandates the 1.2 feature set and makes the 2.x features optional. Theoretical FP32 and FP64 specs look similar to the Radeon VII's, although memory bandwidth is half that of a Radeon VII. For the A770, theoretical FP32 and FP64 are a bit faster, but memory bandwidth and OpenCL version are unchanged. Prices are good, if you can find the cards in stock somewhere near MSRP.

Last fiddled with by kriesel on 2022-10-18 at 15:14
Old 2022-10-18, 16:25   #1688
M344587487
 
"Composite as Heck"
Oct 2017

1110100101₂ Posts

Quote:
Originally Posted by kriesel View Post
No first-hand data, but https://www.techpowerup.com/gpu-specs/arc-a750.c3929 indicates OpenCL 3.0, which may be an issue in gpuowl or mfakto, since OpenCL 3.0 only mandates the 1.2 feature set and makes the 2.x features optional. Theoretical FP32 and FP64 specs look similar to the Radeon VII's, although memory bandwidth is half that of a Radeon VII. For the A770, theoretical FP32 and FP64 are a bit faster, but memory bandwidth and OpenCL version are unchanged. Prices are good, if you can find the cards in stock somewhere near MSRP.
I was under the impression that Intel didn't implement hardware FP64 on consumer cards, so I'm suspicious of the 1:4 FP64 figure at least.
Old 2022-10-18, 17:41   #1689
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

53·59 Posts

Quote:
Intel says the Arc A770 and A750 require a 10th-gen Intel CPU or AMD Ryzen 3000 CPU or newer. That’s because Arc Alchemist cards benefit a lot from Resizable BAR, which is only available on the last few generations of processors. The cards will work with older CPUs, but you’ll have much lower performance if ReBAR is turned off.
https://www.digitaltrends.com/comput...0-a750-review/
Old 2022-11-15, 02:34   #1690
syjytg
 
Nov 2022

1 Posts

How do I download mfakto for GPU trial factoring?
Old 2022-11-15, 02:51   #1691
AlvinBunk
 
Dec 2017

3² Posts
mfakto download

You can download from here: https://download.mersenne.ca/mfakto
Old 2022-12-07, 23:50   #1692
aperson1
 
Dec 2022

2·3 Posts

I've just downloaded this for my Vega 20 built-in laptop GPU and it's a clear improvement over CPU factoring, but I've been having some minor issues and was curious whether anyone knows of workarounds or optimizations.

First of all, I'm getting some performance stuttering: looking at the compute-1 process in Task Manager, usage goes from 100% to 0% intermittently. The program starts out faster (~250-300 GHz-d/day) and then slowly drops over the next minute (to ~190 GHz-d/day), I would guess because of the stuttering.

In the .ini file, I lowered GPUSieveSize from the default 96 down to 16, which helped significantly but did not solve the problem; I seemed to be getting diminishing returns rather than reaching an actual "no stuttering" point.

---

Additionally, it seems to significantly affect my CPU performance when running Prime95, cutting the speed of both TF and P-1 work to ~55% (for some reason, Prime95 on its own uses just ~30% of my CPU, leaving 70% unused, and running this program lowers that further to just ~15%).

I know lowering the CPU portion of GPU factoring lowers its efficiency, but which settings, if any, work best to minimize interference with Prime95's computing work?
Old 2022-12-08, 00:55   #1693
aperson1
 
Dec 2022

2·3 Posts

PS: I've solved the CPU utilization problem. For some reason, my single worker was set to use only 4 cores of my 8-core CPU. I've switched to two workers of 4 cores each and it works much more efficiently.
Old 2022-12-08, 01:08   #1694
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

53·59 Posts

Quote:
Originally Posted by aperson1 View Post
I've just downloaded this for my Vega 20 built-in laptop GPU and it's a clear improvement over CPU factoring, but I've been having some minor issues and was curious whether anyone knows of workarounds or optimizations.

First of all, I'm getting some performance stuttering: looking at the compute-1 process in Task Manager, usage goes from 100% to 0% intermittently. The program starts out faster (~250-300 GHz-d/day) and then slowly drops over the next minute (to ~190 GHz-d/day), I would guess because of the stuttering.

In the .ini file, I lowered GPUSieveSize from the default 96 down to 16, which helped significantly but did not solve the problem; I seemed to be getting diminishing returns rather than reaching an actual "no stuttering" point.

---

Additionally, it seems to significantly affect my CPU performance when running Prime95, cutting the speed of both TF and P-1 work to ~55% (for some reason, Prime95 on its own uses just ~30% of my CPU, leaving 70% unused, and running this program lowers that further to just ~15%).

I know lowering the CPU portion of GPU factoring lowers its efficiency, but which settings, if any, work best to minimize interference with Prime95's computing work?
Sounds like you're using the IGP (the low-end GPU that is part of the CPU package). That will necessarily impair CPU throughput, since the two share the same power limit. They also share the same memory and bandwidth limits, but GPU TF does not use much memory. If the CPU supports hyperthreading, note that for P-1, PRP, and LLDC, hyperthreading usually is of no benefit, so running one thread per core will show as 50% utilization by prime95 in Task Manager. Task Manager should show almost no CPU usage for mfakto or other GPU apps; you want nearly all computation in a GPU app to be performed by the GPU, including sieving of candidate factors when possible (kernels ending in _gs).
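
The 50% figure follows from how Task Manager counts logical processors. Here is a small Python sketch of that arithmetic. The core counts are illustrative, and the model assumes each worker thread fully occupies one logical processor and that hyperthreading exposes exactly two logical processors per core.

```python
def expected_utilization(worker_threads: int, cores: int, hyperthreading: bool) -> float:
    """Approximate CPU utilization (percent) that Task Manager would report
    if each worker thread keeps one logical processor fully busy."""
    logical_processors = cores * 2 if hyperthreading else cores
    return 100.0 * worker_threads / logical_processors


# One thread per core on a hyperthreaded 8-core CPU.
print(expected_utilization(8, 8, True))   # 50.0
# A single worker limited to 4 threads on the same CPU.
print(expected_utilization(4, 8, True))   # 25.0
```

Under this simple model, a reading near 50% with one thread per core means every physical core is busy, not that half the CPU is idle.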

Quote:
Originally Posted by syjytg View Post
How do I download mfakto for GPU trial factoring?
For how to get started and optimize, see the reference info: https://mersenneforum.org/showthread.php?t=24607
IGP thread: https://www.mersenneforum.org/showthread.php?t=25717
mfakto thread: https://www.mersenneforum.org/showthread.php?t=23394

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
