#1684 |
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996
2×7² Posts |
I'm getting missed factors from cl_barrett15_*_gs on Radeon VII with mfakto 0.15pre8 (on Linux).
0.14 st and st2 complete without error on the same machine. Let me know if any information beyond what I'm including below would be helpful.
Code:
Self-test statistics
  number of tests           335250
  successful tests          331150
  no factor found           4100
self-test FAILED!
Code:
mfakto# cat mfakto_radeonVII_st2.log | grep failed | grep barrett15 | wc
4100 24600 244885
Code:
ERROR: self-test failed for M60008387 (cl_barrett15_71_gs)
ERROR: self-test failed for M60008387 (cl_barrett15_70_gs)
ERROR: self-test failed for M60008387 (cl_barrett15_69_gs)
ERROR: self-test failed for M60008387 (cl_barrett15_73_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_74_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_71_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_70_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_69_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_73_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_74_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_71_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_70_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_69_gs)
ERROR: self-test failed for M332193203 (cl_barrett15_73_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_74_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_71_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_70_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_69_gs)
ERROR: self-test failed for M800007823 (cl_barrett15_73_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_74_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_71_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_70_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_69_gs)
ERROR: self-test failed for M800005699 (cl_barrett15_73_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_74_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_71_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_70_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_69_gs)
ERROR: self-test failed for M800003137 (cl_barrett15_73_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_74_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_71_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_70_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_69_gs)
ERROR: self-test failed for M800002757 (cl_barrett15_73_gs)
[snipped]
Code:
mfakto 0.15pre8 (64-bit build)

Runtime options
  INI file                  mfakto.ini
  Verbosity                 3
  SieveOnGPU                yes
  MoreClasses               yes
  GPUSievePrimes            81157
  GPUSieveProcessSize       24 Kib
WARNING: GPUSieveSize=128M must be a multiple of GPUSieveProcessSize=24k, adjusting GPUSieveSize to 126M
  GPUSieveSize              126 Mib
  FlushInterval             0
  WorkFile                  worktodo.txt
  ResultsFile               results.txt
  Checkpoints               enabled
  CheckpointDelay           300 s
  Stages                    enabled
  StopAfterFactor           class
  PrintMode                 compact
  V5UserID                  EO
  ComputerID                Highland2017
  ProgressHeader            "Date Time | class Pct | time ETA | GHz-d/day Sieve Wait"
  ProgressFormat            "%d %T | %C %p%% | %t %e | %g %s %W%%"
  TimeStampInResults        yes
  VectorSize                2
  GPUType                   AUTO
  SmallExp                  no
  UseBinfile                mfakto_Kernels.elf

Compile-time options

Select device - Get device info:
Device 1/1: gfx906 (Advanced Micro Devices, Inc.), device version: OpenCL 2.0 AMD-APP (3180.7), driver version: 3180.7 (PAL,HSAIL)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_copy_buffer_p2p
Global memory:17163091968, Global memory cache: 16384, local memory: 65536, workgroup size: 256, Work dimensions: 3[1024, 1024, 1024, 0, 0] , Max clock speed:1800, compute units:60

OpenCL device info
  name                      gfx906 (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 2.0 AMD-APP (3180.7) (3180.7 (PAL,HSAIL))
  maximum threads per block 1024
  maximum threads per grid  1073741824
  number of multiprocessors 60 (3840 compute elements)
  clock rate                1800 MHz

Automatic parameters
  threads per grid          0
  optimizing kernels for    GCNF

Loading binary kernel file mfakto_Kernels.elf
Compiling kernels (build options: "-I. -DVECTOR_SIZE=2 -DGCNF -O3 -DMORE_CLASSES -DCL_GPU_SIEVE").
BUILD OUTPUT
END OF BUILD OUTPUT
GPUSievePrimes (adjusted) 81206
GPUsieve minimum exponent 1037054
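The grep pipeline in this post only gives a total count of failed barrett15 lines. A minimal Python sketch for breaking the same log down per kernel and per exponent might look like the following. It assumes the log lines have exactly the `ERROR: self-test failed for M<exponent> (<kernel>)` shape quoted above; the function name `tally_failures` is hypothetical and not part of mfakto.

```python
import re
from collections import Counter

# Matches lines such as:
#   ERROR: self-test failed for M60008387 (cl_barrett15_71_gs)
# \s+ tolerates a line break between "for" and the exponent.
FAILED_RE = re.compile(r"self-test failed for\s+M(\d+)\s+\((\w+)\)")


def tally_failures(log_text):
    """Return (failures per kernel, failures per exponent) as Counters."""
    per_kernel = Counter()
    per_exponent = Counter()
    for exponent, kernel in FAILED_RE.findall(log_text):
        per_kernel[kernel] += 1
        per_exponent[int(exponent)] += 1
    return per_kernel, per_exponent


# Three lines copied from the post; a real run would read the whole log file.
sample = """\
ERROR: self-test failed for M60008387 (cl_barrett15_71_gs)
ERROR: self-test failed for M60008387 (cl_barrett15_70_gs)
ERROR: self-test failed for M60005497 (cl_barrett15_74_gs)
"""

per_kernel, per_exponent = tally_failures(sample)
for kernel, count in sorted(per_kernel.items()):
    print(kernel, count)
```

On the full log, passing `open("mfakto_radeonVII_st2.log").read()` would show whether the 4100 failures are spread evenly across the barrett15 kernels or concentrated in a few bit levels.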
#1685 |
Bemusing Prompter
"Danny"
Dec 2002
California
5×499 Posts |
Anyone know if mfakto works with Intel Arc GPUs in its current state?
#1686 |
"James Heinrich"
May 2004
ex-Northern Ontario
11·373 Posts |
#1687 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
53·59 Posts |
No first-hand data, but https://www.techpowerup.com/gpu-specs/arc-a750.c3929 indicates OpenCL 3.0, which may be an issue in gpuowl or mfakto: OpenCL 3.0 makes the features added in 2.x optional, so a 3.0 device is not guaranteed to support everything a 2.0 device does. FP32 & FP64 theoretical specs look similar to the Radeon VII, although memory bandwidth is half that of a Radeon VII. For the A770, theoretical FP32 & FP64 are a bit faster, but memory bandwidth & OpenCL version are unchanged. Prices are good, if you can find them in stock somewhere near MSRP.
Last fiddled with by kriesel on 2022-10-18 at 15:14 |
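To make the version concern concrete: an OpenCL device reports its version as a string beginning with `OpenCL <major>.<minor>`, as in the `OpenCL 2.0 AMD-APP (3180.7)` line from the Radeon VII log earlier in the thread. A minimal Python sketch that parses such a string and flags 3.0-and-later devices is below. The helper names are hypothetical, the `OpenCL 3.0 ` input is an illustrative string rather than output captured from an Arc card, and whether mfakto or gpuowl perform any check like this is not something this thread establishes.

```python
import re


def parse_opencl_version(version_string):
    """Extract (major, minor) from a device version string."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def needs_feature_checks(version_string):
    """True for OpenCL 3.0 and later, where 2.x features are optional
    and should be queried individually instead of assumed."""
    return parse_opencl_version(version_string) >= (3, 0)


# String taken from the Radeon VII log in this thread.
print(parse_opencl_version("OpenCL 2.0 AMD-APP (3180.7)"))
# Illustrative 3.0 string (an assumption, not real device output).
print(needs_feature_checks("OpenCL 3.0 "))
```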
#1688 | |
"Composite as Heck"
Oct 2017
1110100101₂ Posts |
Quote:
#1689 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
53·59 Posts |
Quote:
#1690 |
Nov 2022
1 Posts |
How do I download mfakto for GPU trial factoring?
#1691 |
Dec 2017
32 Posts |
You can download from here: https://download.mersenne.ca/mfakto
#1692 |
Dec 2022
2·3 Posts |
I've just downloaded this for my Vega 20 built-in laptop GPU and it's a clear improvement over CPU factoring, but I've been having some minor issues and was curious if anyone knew any workarounds/optimizations for them.

First of all, I'm getting some performance stuttering. Looking at the compute-1 process in Task Manager, it goes from 100% usage to 0% intermittently. The program starts out more efficient (~250-300 GHz-d/day) and then slowly gets less efficient over the next minute (down to ~190 GHz-d/day), I would guess because of the stuttering. In the .ini file, I lowered GPUSieveSize from the default 96 down to 16, which helped significantly but did not solve the problem; I seemed to be getting diminishing returns rather than reaching an actual "no stuttering" point.

Additionally, it seems to be significantly affecting my CPU performance running Prime95, bringing the speed of both its TF and P-1 work down to ~55% (for some reason, Prime95 on its own uses just ~30% of my CPU even though 70% is unused, and running this program lowers that further to just ~15%). I know lowering the CPU portion of GPU factoring lowers its efficiency, but which settings, if any, work best to minimize interference with Prime95's computing work?
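When experimenting with GPUSieveSize values, note the startup warning in post #1684: mfakto adjusted GPUSieveSize=128M down to 126M so that it is a multiple of GPUSieveProcessSize=24k. A minimal Python sketch of that arithmetic follows. The rounding rule (largest whole-M value not above the request that divides evenly) is inferred from that single log line and is an assumption, not taken from mfakto's source; `adjust_sieve_size` is a hypothetical name, and the GPUSieveProcessSize in use on the laptop is not stated in this post.

```python
def adjust_sieve_size(sieve_size_m, process_size_k):
    """Largest whole-M sieve size <= sieve_size_m that is a multiple of
    process_size_k (1M = 1024k). Assumed rule, inferred from one log line."""
    for candidate_m in range(sieve_size_m, 0, -1):
        if (candidate_m * 1024) % process_size_k == 0:
            return candidate_m
    raise ValueError("no whole-M sieve size fits this process size")


# Reproduces the adjustment shown in the log from post #1684.
print(adjust_sieve_size(128, 24))
# Values mentioned in this post, assuming GPUSieveProcessSize is 24.
print(adjust_sieve_size(96, 24))
print(adjust_sieve_size(16, 24))
```

Under these assumptions a requested size of 16 would itself be adjusted, so the startup output is worth checking to see which value is actually in effect.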
#1693 |
Dec 2022
2·3 Posts |
PS: I've solved the CPU utilization problem. For some reason, my single worker was set to use only 4 cores of my 8-core CPU. I've doubled up to two workers of 4 cores each and it works much more efficiently.
#1694 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
53·59 Posts |
Quote:
For how to get started and optimize, see:
reference info https://mersenneforum.org/showthread.php?t=24607
igp thread https://www.mersenneforum.org/showthread.php?t=25717
mfakto thread https://www.mersenneforum.org/showthread.php?t=23394