mersenneforum.org New Google Colab Notebooks For Primality Testing

2021-04-12, 12:33   #78
tdulcet

"Teal Dulcet"
Jun 2018

71 Posts

Quote:
 Originally Posted by LaurV Wow! it works! You (two) are my heroes for this weekend!
Great! We are glad it works for you.

Quote:
 Originally Posted by LaurV Albeit a little bit too complicated; at first it didn't work, as I had the "CPU and GPU" output (sure! I want to see what BOTH of them are doing!), then I looked in the code and saw that you use the "-k" switch only when the output is "GPU Only"
Yes, sorry, I guess I should have mentioned that. I did not realize anyone was using the "GPU and CPU" output type, as it is very verbose. I added it shortly before we officially announced the notebooks, as I saw it requested a few times on the main Colab thread and it was easy to implement. When using that option, both CUDALucas and MPrime run in the background while the tail -f command runs in the foreground, so there is no easy way to pass input to CUDALucas.
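For anyone curious, the plumbing can be sketched like this (the stand-in loops and log names are hypothetical; the real notebook runs CUDALucas and MPrime here):

```shell
# Sketch of the "GPU and CPU" output mode: both workers run in the background,
# and tail holds the foreground, so keyboard input reaches tail rather than
# CUDALucas.
(for i in 1 2 3; do echo "CUDALucas: iteration $i"; sleep 0.1; done > gpu.log) &
(for i in 1 2 3; do echo "MPrime: iteration $i"; sleep 0.1; done > cpu.log) &
wait                         # let the stand-in workers finish
tail -n +1 gpu.log cpu.log   # `-n +1` so this sketch terminates; the notebook uses `-f`
```

Because only tail owns the terminal, there is no place to type the "-k" input that CUDALucas expects in the single-output mode.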

I updated our PrimeNet script on Saturday so that it can still get first-time LL tests using the method described by @Prime95 above, letting users continue with CUDALucas while we work on upgrading our GPU notebook to use GpuOwl. (@LaurV - You will no longer have to do this manually. ) Anyone who wants to continue doing first-time LL tests on the GPU will need to set up their GPU notebooks again after finishing any current assignments. I also included many of the changes needed for our PrimeNet script to support GpuOwl, including support for reporting LL/PRP and P-1 results. Going forward, we decided to recommend that users do PRP tests, which will be the default, although we will still provide the option of doing LL tests on the GPU for users with very limited Drive space, as explained above. Prime95/MPrime of course has its PrimeNet functionality built in, so unfortunately there is not much we can do about the CPU for users with limited Drive space. Those users will need to do LL DC tests on the CPU, although as George said, there is "a chance that a new Mersenne prime is hidden in all those double-checks".

2021-05-07, 21:36   #79
moebius

"CharlesgubOB"
Jul 2009
Germany

1001101110₂ Posts

Quote:
 Originally Posted by danc2 I realize we did not post any output or pictures, just links. Since we have this dedicated thread, here is example output from a GPU notebook running the Tesla V100-SXM2-16GB (a $6,195.00 GPU according to Amazon).
The LL test runs much slower than with gpuowl -LL for the same exponent on the Tesla V100 GPU

2021-07-07, 11:04   #80
mognuts

Sep 2008
Bromley, England

101101₂ Posts
Colab now using AMD CPUs

This is the first time I've ever had an AMD!!

Quote:
 Previous CPU counts:
 15  Intel(R) Xeon(R) CPU @ 2.30GHz  (model 63)
  9  Intel(R) Xeon(R) CPU @ 2.00GHz  (model 85)
  8  Intel(R) Xeon(R) CPU @ 2.20GHz  (model 79)
  1  AMD EPYC 7B12                   (model 49)

2021-07-07, 18:34   #81
danc2

Dec 2019

5·7 Posts

@mognuts
Yeah, I was pretty surprised when I first saw that on my machines also!

Quote:
 Previous CPU counts:
 111  Intel(R) Xeon(R) CPU @ 2.30GHz  (model 63)
  97  Intel(R) Xeon(R) CPU @ 2.20GHz  (model 79)
  29  Intel(R) Xeon(R) CPU @ 2.00GHz  (model 85)
  15  AMD EPYC 7B12                   (model 49)
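If you want to check which CPU your own session landed on, /proc/cpuinfo shows both the marketing name and the numeric model that the counts above refer to:

```shell
# Print the CPU name and the numeric model (63/79/85 for the Xeons, 49 for
# the EPYC 7B12) of the current instance
awk -F': ' '/^model name/ { print $2; exit }' /proc/cpuinfo
awk -F': ' '/^model[ \t]/ { print "model " $2; exit }' /proc/cpuinfo
```

(This works on any Linux x86 machine, not just Colab.)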

2021-07-07, 20:05   #82
PhilF

"6800 descendent"
Feb 2005

5²×29 Posts

Quote:
 Originally Posted by mognuts This is the first time I've ever had an AMD!!
I was told that if you snag one of those you should throw it back, because the performance is lower than the others. But that was a while back; that advice might have been referring to a different AMD model.

2021-07-07, 21:02   #83
chalsall
If I May

"Chris Halsall"
Sep 2002

2⁴·5·7·19 Posts

Quote:
 Originally Posted by PhilF I was told that if you snag one of those you should throw it back, because the performance is lower than the others. But that was a while back; that advice might have been referring to a different AMD model.
Busy, but quickly...

The AMD CPUs have been given out for quite a while now. And, at least for P-1'ing, they're faster than all the Intel instances (~20% or so).

2021-07-07, 21:50   #84
Flaukrotist

Sep 2020
Germany

54₈ Posts

Quote:
 Originally Posted by chalsall And, at least for P-1'ing, they're faster than all the Intel instances (~20% or so).
I cannot confirm that. Using Prime95 v30.4 and exponents in the 104M range, with bounds determined by Prime95, I get the following ranking by total time needed for P-1 stages 1 and 2:

Code:
Model 63, Intel(R) Xeon(R) CPU @ 2.30GHz: 36.09 h
Model 79, Intel(R) Xeon(R) CPU @ 2.20GHz: 31.58 h
Model 49, AMD EPYC 7B12:                  31.36 h
Model 85, Intel(R) Xeon(R) CPU @ 2.00GHz: 25.27 h
So, the Intel Model 85 is clearly fastest.
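Taking those totals at face value, the relative speedups work out as follows (a quick check, with the slowest model 63 as the baseline):

```shell
# Speedup of each model over the slowest (Intel model 63 at 36.09 h),
# from the P-1 stage 1 + 2 totals above
for entry in "Intel model 79:31.58" "AMD model 49:31.36" "Intel model 85:25.27"; do
  awk -v n="${entry%:*}" -v h="${entry#*:}" \
    'BEGIN { printf "%s: %.0f%% faster than model 63\n", n, (36.09/h - 1)*100 }'
done
```

So in this data the AMD is about 15% faster than the worst Intel but well behind model 85, which is roughly 43% faster than model 63.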

2021-07-07, 22:18   #85
chalsall
If I May

"Chris Halsall"
Sep 2002

2⁴·5·7·19 Posts

Quote:
 Originally Posted by Flaukrotist I cannot confirm that. ...snip... So, the Intel Model 85 is clearly fastest.
I could very well be wrong. My observations were subjective. It would be worth collecting hard data on this.

2021-07-08, 22:06   #86
slandrum

Jan 2021
California

5·7·13 Posts

There are three versions of the Intel chipset on Colab (that I've received on free accounts). The 2.30 GHz model 63 is the worst, followed by the 2.20 GHz model 79; the 2.00 GHz model 85 with AVX-512 is by far the best. The AMD chipset's times overlap with the times I get on the 2.00 GHz Intel: the worst times for the 2.00 GHz model 85 Intel are slightly worse than the worst times with the AMD, but the best times with the 2.00 GHz model 85 Intel are much better than the best times with the AMD. This is for running tests with mprime (LL, PRP, P-1, CERT).

For a PRP test around 110M, iteration times on the 2.30 and 2.20 GHz Intels are around 40 ms, ranging from the mid 30s to the mid 40s; the timings on the two overlap, but the 2.30 GHz model 63 averages the worst. For the 2.00 GHz model 85 Intel I've seen from 21 ms to 32 ms. For the AMD I see 26 to 31 ms. The iteration times can vary through a 6-12 hour session, sometimes by a lot, but most instances seem to stay pretty close to the same ms/iteration throughout the session. The average times on the model 85 are better than the average times on the AMD model 49.

There are far more 2.30 and 2.20 GHz Intels available to me at any given time than either the 2.00 GHz Intel or the AMD.

Last fiddled with by slandrum on 2021-07-08 at 22:34
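To put those ms/iteration figures in perspective: a PRP test needs roughly one squaring per bit of the exponent, so the wall-clock time per ~110M test works out to (a rough estimate, ignoring overhead and session restarts):

```shell
# Approximate days per PRP test of a ~110M exponent at the quoted ms/iter
for ms in 21 26 32 40; do
  awk -v ms="$ms" \
    'BEGIN { printf "%d ms/iter -> %.1f days per test\n", ms, 110e6 * ms / 1000 / 86400 }'
done
```

At any of those rates a single test spans many 6-12 hour Colab sessions, which is why the notebooks' resumable save files on Drive matter so much.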
2022-05-25, 13:54   #87
tdulcet

"Teal Dulcet"
Jun 2018

71 Posts

Quote:
Originally Posted by tdulcet
Quote:
 Originally Posted by LaurV Here attached there is a digest of the FFT sizes, with times per iteration, for all the five cards that colab offers.
Thanks, your spreadsheet does make it easier to compare the ms/iter times. It looks like you created it from the *fft.txt and *threads.txt files in our repository.
I regenerated these *fft.txt and *threads.txt files for FFT lengths 1K to 32768K, using twice as many iterations for better accuracy, and also added them for the A100 GPU. Anyone using our GPU notebook should consider upgrading, as they would likely see a performance improvement.

Since my last update over a year ago, here are the notable changes I have made to the notebooks:
• June 3, 2021
• Updated CPU notebook to default to the 150 worktype (first time PRP tests).
• August 1, 2021 - Updated MPrime install script to use 80% of available memory for stage 2. The notebooks will thus use up to around 10.1 GiB of RAM instead of just 6 GiB.
• August 31, 2021
• Added support for computer_numbers greater than 9.
• Updated notebooks to configure MPrime to not preallocate disk space for the proof interim residues files to reduce Google Drive storage use.
• November 1, 2021 - Updated the CPU on the GPU notebook to default to the 150 worktype.
• December 1, 2021 - Updated MPrime install script to use the latest Prime95/MPrime v30.7.
• January 2, 2022 - Added support for the 154 (first time PRP tests that need P-1 factoring) and 155 (double-check tests using PRP with proof) worktypes on the CPU.
• May 5, 2022 - Updated GPU notebook to also compile CUDALucas for the A100 GPU.
• Today - Regenerated the optimization files using twice as many iterations and also added them for the A100 GPU.
For the improvements to our PrimeNet script, please see here and the below post in the dedicated thread. Feedback is welcome!

I know it has been over a year now, but we are still patiently waiting for Colab to upgrade to Ubuntu 20.04 (probably now 22.04) so we can finally switch to GpuOwl...

Quote:
 Originally Posted by tdulcet We of course still need to test with the other Tesla GPUs available on Colab and with the latest version of GpuOwl.
Quote:
 Originally Posted by moebius The LL test runs much slower than with gpuowl -LL for the same exponent on the Tesla V100 GPU
We had some Google Cloud credits that were expiring, so I was able to confirm that GpuOwl is indeed faster than CUDALucas on all six GPUs available on Colab. However, I also found that GpuOwl performance on these GPUs has slowly degraded by up to 15% across all FFT lengths over the last few years. While the v6 branch is faster than the master branch, it is not the fastest version. For example, for a wavefront first-time exponent on the Tesla V100 GPU, the master branch runs at 654 µs/iter, the v6 branch at 641 µs/iter, and the fastest version at 599 µs/iter. See the issue I created on the GpuOwl repository for more information and several graphs. Hopefully someone will be able to fix these performance regressions before we are able to switch...
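For concreteness, here is the size of the regression those V100 numbers imply, relative to the fastest version at 599 us/iter:

```shell
# Percent slowdown of each GpuOwl branch vs. the fastest version (599 us/iter)
# on the Tesla V100, using the timings quoted above
for entry in "master:654" "v6:641"; do
  awk -v b="${entry%:*}" -v t="${entry#*:}" \
    'BEGIN { printf "%s branch: %.1f%% slower\n", b, (t/599 - 1)*100 }'
done
```

That is roughly a 9% slowdown on master and 7% on v6 for this one exponent; the up-to-15% figure above is across all FFT lengths and GPUs.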

Quote:
 Originally Posted by LaurV The issue will remain with gpuOwl. Moreover, gpuOwl doesn't provide a way to switch to another FFT size on the fly.
I tested all FFT lengths from 1M to 32M in GpuOwl for both the master and v6 branches on all six GPUs, and the smallest FFT length selected by default always seemed to be the fastest, so that should not be an issue. However, for the FFT lengths that support multiple variants, the variant selected by default is not always the fastest/optimal one, which is obviously another issue. I would be happy to share this data if anyone is interested.

