mersenneforum.org  

mersenneforum.org > Factoring Projects > Operation Billion Digits

Old 2020-10-09, 19:42   #12
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

17×277 Posts

Quote:
Originally Posted by axn View Post
The figure of 91K is wrong. I think it is closer to 600K. James's site doesn't have P95 timing data for an FFT big enough to handle that exponent.
James has since used extrapolated FFT lengths and the corresponding extrapolated iteration times to adjust those figures upward considerably. What was 91K GHz-days is now ~700K.

Last fiddled with by kriesel on 2020-10-09 at 19:42
Old 2020-10-12, 22:10   #13
clowns789
 
 
Jun 2003
The Computer

180₁₆ Posts

Quote:
Originally Posted by kriesel View Post
Start lobbying Mihai and George now for a gigadigit-capable fft length in gpuowl, and robust error checking in P-1, which I estimate would take a month to run on a Radeon VII, and start saving for a Radeon VII Pro for the PRP multiyear run.
Thanks Ken for the feedback in the last two posts. It will be interesting to see if Team Red can match the new Nvidia RTX A6000 in performance any time soon. Perhaps gpuowl will be optimized for OpenCL as well at some point.
Old 2020-10-12, 22:30   #14
Viliam Furik
 
Jul 2018
Martin, Slovakia

377₈ Posts

Quote:
Originally Posted by clowns789 View Post
Thanks Ken for the feedback in the last two posts. It will be interesting to see if Team Red can match the new Nvidia RTX A6000 in performance any time soon. Perhaps gpuowl will be optimized for OpenCL as well at some point.
I am not sure whether I missed something important, but AFAIK, gpuOwl works ONLY on OpenCL. It works on Nvidia cards only because they support OpenCL.
Old 2020-10-13, 01:50   #15
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

17×277 Posts

Quote:
Originally Posted by Viliam Furik View Post
I am not sure whether I missed something important, but AFAIK, gpuOwl works ONLY on OpenCL. It works on Nvidia cards only because they support OpenCL.
Viliam, your understanding is correct.
Gpuowl is developed on Linux with AMD GPUs, the ROCm driver, and AMD's OpenCL. Mihai owned an NVIDIA card briefly and got rid of it. We are fortunate that gpuowl also works on Windows, on some NVIDIA GPUs, and even on some Intel and AMD IGPs. Some NVIDIA GPUs can't run gpuowl because no driver compatible with them supports a high enough version of OpenCL.
Old 2020-10-13, 02:10   #16
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

17·277 Posts

Quote:
Originally Posted by clowns789 View Post
Interestingly, according to the following link, factoring to 91 bits would require over 150K GHz-days of computation, while a LL test requires only 91K:
Not all GhzD are equal.
Historically, TF was done on CPUs, as was LL, and at that time there was no GIMPS PRP. The GhzD unit of measure was defined as one day of work on one core of a theoretical 1 GHz Core 2 processor.
GPUs have much faster single precision or integer speed (relevant to TF) than double precision (DP) speed (relevant to P-1, LL, and PRP); in some cases as much as 8x, 12x, 16x, or more (although some rare models are 2x or 3x). In CPUs the ratios I've seen ranged from 0.7 to 1.4.
On a GPU, a TF GhzD is completed much more quickly than a P-1, PRP, or LL GhzD.
Compare GhzD/day figures for TF https://www.mersenne.ca/mfaktc.php and for LL / PRP https://www.mersenne.ca/cudalucas.php for the same gpu model.
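To make the point concrete, here is a rough sketch (not anyone's official estimator) that converts the GHz-day totals quoted in this thread into wall-clock days on one GPU. The helper name `wall_clock_days` is made up for illustration, the Radeon VII throughput numbers are the mersenne.ca benchmark figures quoted elsewhere in this thread, and it assumes throughput stays constant at gigadigit sizes, which is almost certainly optimistic.

```python
def wall_clock_days(work_ghz_days: float, throughput_ghz_days_per_day: float) -> float:
    """Hypothetical helper: days of wall-clock time for a given amount of credited work."""
    if throughput_ghz_days_per_day <= 0:
        raise ValueError("throughput must be positive")
    return work_ghz_days / throughput_ghz_days_per_day


# Work estimates quoted in this thread (GHz-days).
TF_TO_91_BITS = 150_000   # "over 150K GHz-days" for factoring to 91 bits
PRP_OR_LL_TEST = 700_000  # revised figure of ~700K for the primality test

# Radeon VII benchmark throughput quoted in this thread (GHz-days per day).
# Assumption: these rates hold unchanged at gigadigit exponents.
RADEON_VII_TF = 1113.0
RADEON_VII_LL = 281.0

tf_days = wall_clock_days(TF_TO_91_BITS, RADEON_VII_TF)
prp_days = wall_clock_days(PRP_OR_LL_TEST, RADEON_VII_LL)

print(f"TF to 91 bits: {tf_days:.0f} days")
print(f"PRP/LL test:   {prp_days:.0f} days (~{prp_days / 365.25:.1f} years)")
```

Under those assumptions the TF work is a matter of months while the primality test is a matter of years, even though the GHz-day totals differ by less than a factor of five.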
Old 2020-10-13, 13:39   #17
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

22E4₁₆ Posts

Quote:
Originally Posted by kriesel View Post
Not all GhzD are equal.
In this case, they are. History has nothing to do with it. One GHzDay is the work one 32-bit core running at 1 GHz can do in one day (more or less, there are some "ifs" and "tricks" here). No matter if TF or FFT. Now, the TF effort doubles with every bitlevel, therefore, factoring to 91 bits requires about ONE MILLION (2^20) times more effort compared with factoring to 71 bits.
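
The doubling argument above can be checked in a few lines. This is only a sketch under the stated assumption that each bit level costs exactly twice the previous one; the function names are made up for illustration.

```python
def tf_effort_ratio(from_bits: int, to_bits: int) -> int:
    """Cost of bit level `to_bits` relative to bit level `from_bits`,
    assuming the cost doubles with every bit level."""
    if to_bits < from_bits:
        raise ValueError("to_bits must be >= from_bits")
    return 2 ** (to_bits - from_bits)


def cumulative_tf_effort(from_bits: int, to_bits: int) -> int:
    """Total cost of all levels from_bits+1 .. to_bits, measured in units
    of the cost of the single level from_bits+1 (a geometric series)."""
    return sum(2 ** k for k in range(to_bits - from_bits))


ratio = tf_effort_ratio(71, 91)
total = cumulative_tf_effort(71, 91)
print(f"91 bits vs 71 bits: {ratio:,}x")
print(f"All levels 72..91 combined: {total:,} units of the 72-bit level")
```

The ratio is 2^20 = 1,048,576, the "one million" above. The cumulative sum is 2^20 - 1, so the final bit level alone costs about as much as all the earlier ones together.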

Where the "historical" part comes into play is that advances in parallel computing hardware (i.e. GPUs) make factoring much faster. Therefore you can get a lot of "credit" GHzDays by doing TF with a GPU, and from this point of view, when you "factor" in the wall clock time spent, they are "not equal". If somebody makes a DSP in the future that can do "long" FFTs at the hardware level (some DSPs, digital signal processors, can already do small FFTs in hardware), then things would be the other way around, and one might get more GHzDays/day doing LL and PRP... But this doesn't make the two "not equal".

Last fiddled with by LaurV on 2020-10-13 at 13:47
Old 2020-10-13, 14:46   #18
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

17×277 Posts

GPU          TF GhzD/day   LL GhzD/day   TF/LL ratio
RTX2080      2623          65            40.4
Radeon VII   1113          281           3.96
GTX1080      1042          64.6          16.1
Nothing close to equal there. (Recent improvements in gpuowl have raised PRP performance to as high as 510 GhzD/day at 5M FFT on Linux, but 281 is representative of performance at some higher FFT lengths.)
All figures from mersenne.ca benchmark pages.
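
For anyone who wants to redo the arithmetic, a minimal sketch follows. The numbers are the ones listed in this post; the dictionary layout and names are just for illustration.

```python
# (TF GhzD/day, LL GhzD/day) per GPU, as listed in this post.
benchmarks = {
    "RTX2080": (2623.0, 65.0),
    "Radeon VII": (1113.0, 281.0),
    "GTX1080": (1042.0, 64.6),
}

# Ratio of TF credit rate to LL credit rate for the same card.
ratios = {gpu: tf / ll for gpu, (tf, ll) in benchmarks.items()}

for gpu, ratio in ratios.items():
    print(f"{gpu:<11} TF/LL ratio = {ratio:.2f}")
```

If a TF GhzD and an LL GhzD took the same wall-clock time on a given card, every ratio would be 1.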

A recent analysis of the server logs for September 2020 showed that 95+% of results received were by manual submission, which is the status quo for GPUs; only ~5% came through the PrimeNet API, which is characteristic of CPUs running mprime / prime95.

Last fiddled with by kriesel on 2020-10-13 at 14:56
Old 2020-10-30, 03:25   #19
clowns789
 
 
Jun 2003
The Computer

2⁷·3 Posts

Quote:
Originally Posted by kriesel View Post
...and start saving for a Radeon VII Pro for the PRP multiyear run.
Perhaps a 6900 XT would also be good if the double precision performance is adequate.
Old 2020-10-30, 15:33   #20
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1001001100101₂ Posts

Quote:
Originally Posted by clowns789 View Post
Perhaps a 6900 XT would also be good if the double precision performance is adequate.
Too slow:
1.44 TFLOPS FP64, versus 3.5 for the Radeon VII or 6.5 for the Radeon VII Pro;
also half the memory bandwidth.
https://www.techpowerup.com/gpu-spec...-6900-xt.c3481
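
As a quick sanity check on "too slow", here is a small sketch comparing the FP64 figures quoted above. It only compares peak FP64 rates; treating that as a proxy for PRP speed is an assumption, since memory bandwidth also matters, and the names used are made up for illustration.

```python
# Peak FP64 throughput in TFLOPS, as quoted in this post.
fp64_tflops = {
    "RX 6900 XT": 1.44,
    "Radeon VII": 3.5,
    "Radeon VII Pro": 6.5,
}


def relative_fp64(card: str, baseline: str) -> float:
    """Peak FP64 rate of `card` as a fraction of `baseline`."""
    return fp64_tflops[card] / fp64_tflops[baseline]


for baseline in ("Radeon VII", "Radeon VII Pro"):
    frac = relative_fp64("RX 6900 XT", baseline)
    print(f"6900 XT has {frac:.0%} of the peak FP64 rate of the {baseline}")
```

On these numbers the 6900 XT has well under half the FP64 rate of the Radeon VII and under a quarter of the Radeon VII Pro.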

Last fiddled with by kriesel on 2020-10-30 at 15:33
Old 2020-10-30, 19:36   #21
clowns789
 
 
Jun 2003
The Computer

180₁₆ Posts

Thanks for the quick response and link! Luckily I think we are a good ways away from needing it to advance the project.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.