2020-10-09, 19:42  #12
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1218_{16} Posts 
James has made use of extrapolated fft lengths and corresponding extrapolated iteration times to adjust those figures upward considerably. What was 91K is now ~700K.
Last fiddled with by kriesel on 2020-10-09 at 19:42
2020-10-12, 22:10  #13
Jun 2003
The Computer
2·191 Posts 
Quote:


2020-10-12, 22:30  #14
Jul 2018
Martin, Slovakia
223 Posts 
Quote:


2020-10-13, 01:50  #15
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
4632_{10} Posts 
Quote:
Gpuowl is developed on Linux, with AMD gpus, the ROCm driver, and AMD's OpenCL. Mihai owned an NVIDIA card briefly and got rid of it. We are fortunate that gpuowl also works on Windows, on some NVIDIA gpus, and even on some Intel and AMD igps. Some NVIDIA gpus are not compatible with a new enough driver to support a high enough version of OpenCL, so they can't run gpuowl.

2020-10-13, 02:10  #16
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
4632_{10} Posts 
Quote:
Historically, TF was done on cpus, as was LL, and at that time there was no GIMPS PRP. The GhzD unit of measure was set as one core of a theoretical 1 GHz Core 2 processor. Gpus have much faster single precision or integer speed (relevant to TF) than DP speed (relevant to P-1, LL, and PRP); in some cases as much as 8x, 12x, 16x, or more (although some rare models are 2x or 3x). In cpus the ratios I've seen ranged from 0.7 to 1.4. On a gpu, a TF GhzD occurs much more quickly than a P-1 or PRP or LL GhzD. Compare the GhzD/day figures for the same gpu model, for TF at https://www.mersenne.ca/mfaktc.php and for LL / PRP at https://www.mersenne.ca/cudalucas.php

2020-10-13, 13:39  #17
Romulan Interpreter
Jun 2011
Thailand
19·467 Posts 
In this case, they are. History has nothing to do with it. One GHzDay is the work one 32-bit core running at 1 GHz can do in one day (more or less, there are some "ifs" and "tricks" here). No matter if TF or FFT. Now, the TF effort doubles with every bit level, therefore factoring to 91 bits requires about ONE MILLION (2^20) times more effort compared with factoring to 71 bits.
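The 2^20 figure follows directly from the doubling rule. A minimal sketch of the arithmetic, assuming only that each bit level costs twice the level below it (the function name `tf_effort_ratio` is made up for illustration and is not part of any GIMPS tool):

```python
def tf_effort_ratio(from_bits: int, to_bits: int) -> int:
    """Relative TF effort between two bit levels.

    Assumes the simplified model from the post: effort doubles with
    every bit level, so the ratio is 2 raised to the difference.
    """
    return 2 ** (to_bits - from_bits)


ratio = tf_effort_ratio(71, 91)
print(f"91 bits vs 71 bits: about {ratio:,} times the effort")
```

Twenty doublings give 1,048,576, which is the "about ONE MILLION" in the post.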
Where the "historical" part comes into play is that the development and advance of parallel computing hardware (i.e. GPUs) make the factoring much faster. Therefore you can get a lot of "credit" GHzDays by doing TF with a GPU, so, from this point of view, when you "factor" in the wall clock time spent, they are "not equal". If somebody makes a DSP in the future which could do some "long" FFT at hardware level (some DSPs, digital signal processors, can already do small FFTs in hardware), then things would be the other way around, and one might get more GHzDays/Day doing LL and PRP... But this doesn't make the two "not equal".
Last fiddled with by LaurV on 2020-10-13 at 13:47
2020-10-13, 14:46  #18
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^{3}×3×193 Posts 
RTX2080 2623 TF GhzD/day; 65 LL GhzD/day; ratio 40.4
Radeon VII 1113 TF GhzD/day; 281 LL GhzD/day; ratio 3.96
GTX1080 1042 TF GhzD/day; 64.6 LL GhzD/day; ratio 16.1
Nothing close to equal there. (Recent improvements in gpuowl have raised PRP performance to as high as 510 GhzD/day at 5M fft on Linux, but 281 is representative of performance at some higher fft lengths.) All figures are from the mersenne.ca benchmark pages.
A recent server logs analysis for September 2020 showed 95+% of results received were by manual submission, which is the status quo for gpus; only ~5% came by PrimeNet API, which is characteristic of cpus running mprime / prime95.
Last fiddled with by kriesel on 2020-10-13 at 14:56
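For anyone who wants to recheck the ratios, here is a small sketch that recomputes them from the three pairs of figures quoted in this post. The dictionary layout and the names `benchmarks` and `ratios` are just illustrative choices; nothing is pulled from mersenne.ca automatically.

```python
# (TF GhzD/day, LL GhzD/day) as quoted in the post above.
benchmarks = {
    "RTX2080": (2623, 65),
    "Radeon VII": (1113, 281),
    "GTX1080": (1042, 64.6),
}

# TF-to-LL throughput ratio per gpu model.
ratios = {name: tf / ll for name, (tf, ll) in benchmarks.items()}

for name, ratio in ratios.items():
    print(f"{name}: TF/LL ratio {ratio:.3g}")
```

To three significant figures this reproduces the 40.4, 3.96, and 16.1 given above.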