mersenneforum.org New Google Colab Notebooks For Primality Testing

 2021-03-21, 16:00 #56 LaurV Romulan Interpreter     "name field" Jun 2011 Thailand 2806₁₆ Posts

Question: Is it normal under Linux that the performance of the CPU degrades (to about half) when cudaLucas is running? That is what I experience with the two "tricks" that are the subject of this thread. The "CPU only" one gets some ms per iteration, but when I am lucky enough to get a GPU, the CPU in that case takes almost double the ms/iteration.

Locally, under Windows, I remember there was a version of the Nvidia drivers some time ago that caused cudaLucas to steal CPU cycles (about one full core). Later on, this was fixed, either by FlashJH (?) or Dubslow (?), or by upgrading the drivers; I don't remember how it was fixed or by whom. But now, on the local machine(s), we see no millisecond difference in P95 whether cudaLucas runs in any of the 2080Tis, in all of them together, or not at all. So, is this a Colab thing? A Linux thing? Can you use different Nvidia drivers in your notebooks? Or is it a "me only" thing? How do I get rid of it?

To be clear: right now, for me as a "non-US" user (therefore I can't get Pro; I tried!), the "CPU and GPU" toy isn't worth much. I never get a GPU that is good for PRP, so every time I connect, I run the script they give as an example in the intro:

Code:
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
  print('and then re-execute this cell.')
else:
  print(gpu_info)

to see what GPU they gave me, because this runs much faster than waiting for the other toys to install their stuff. (Chris' gpu72 toy is especially slow at doing that! Chris, you should display the GPU first, and only then proceed to download stuff!) If I got a T4, I run the gpu72 script.
If I got anything else (the half-K80, or the P4), I ignore it and run Teal's "CPU only" toy. It isn't worth running the "GPU" version on those GPUs: they are extremely slow, never finish, and they slow the CPU too, to half speed. So, no go. Last fiddled with by LaurV on 2021-03-22 at 07:08
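[Editor's note] The triage workflow described above (query the GPU first, and only sit through the slow setup if the model is worth using) can be sketched as a small standalone helper. This is a hypothetical sketch, not code from any of the notebooks; the list of "worthwhile" models is an assumption drawn from the post (T4 good, P4/half-K80 not):

```python
# Hypothetical sketch: query nvidia-smi for the assigned GPU, and only
# proceed with the slow setup/download step if the model is worth using.
import subprocess

def assigned_gpu_name():
    """Return the GPU model reported by nvidia-smi, or None if no GPU."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip() or None
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None

# Models considered fast enough for PRP/LL work in the post (assumption)
WORTHWHILE = ("T4", "P100", "V100")

def should_run_gpu_work():
    """True only when the assigned GPU matches one of the worthwhile models."""
    name = assigned_gpu_name()
    return name is not None and any(m in name for m in WORTHWHILE)
```

Run this first in a fresh runtime; if it returns False, a factory reset for a new GPU costs nothing, since no downloads have happened yet.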
2021-03-21, 21:34   #57
danc2

Dec 2019

5×7 Posts

Quote:
 Currently getting about 10 minutes at a time running gpu ...
Availability depends largely upon region. If you are getting disconnected a lot and/or would like to make the process automatic, I would try and use the extension.

Quote:
 Is it normal under Linux that the performance of the CPU degrades (to about a half) when gpuowl is running?
The current project does not use gpuowl yet (plans and work to do so have been started). Instead, CUDALucas is used. I am not well-versed in why this would happen, but I can say that the CPU you are offered is not always the same per VM, so it could be that you are getting a lower-quality CPU when assigned a backend that has a GPU. This could be intentional (i.e., Google giving high-end GPU machines low-end CPUs, expecting users to use the GPU more than the CPU) or coincidental. Or I could be completely wrong.

What I would do in your situation is run one "GPU and CPU" notebook (you may run two at a time if you want) and 1-4 "CPU only" notebooks. Google allows you multiple notebooks, so you can make progress on multiple assignments at a time, despite whatever hardware idiosyncrasies exist (I assume your goal is to make as much progress as possible).

Quote:
 ...as a "non-US" (therefore can't get the Pro, I tried!)
It is really weird that Google has not opened this up to users from other countries; I'm not sure why. You could try using a VPN and signing up for a Gmail/Colab Pro account, but it may be based on the location of your payment plan... and that could be risky. I'm sorry this does not work for you.

2021-03-21, 22:23   #58
chalsall
If I May

"Chris Halsall"
Sep 2002
Barbados

2A7D₁₆ Posts

Quote:
 Originally Posted by danc2 Availability depends largely upon region. If you are getting disconnected a lot and/or would like to make the process automatic, I would try and use the extension.
I, on the other hand, would advise against this.

At least, for anyone who takes their relationship with Google seriously.

Some do; others don't. By definition, sentients have free will, and so will make and are responsible for their own decisions.

 2021-03-22, 00:13 #59 danc2   Dec 2019 5×7 Posts For those interested and who have not followed the thread, see the following posts which mention arguments for and against the use of the extension (mods feel free to add others): For: Post #1 Against: Post #1 Last fiddled with by danc2 on 2021-03-22 at 00:14
2021-03-22, 01:27   #60
chalsall
If I May

"Chris Halsall"
Sep 2002
Barbados

73×149 Posts

Quote:
 Originally Posted by danc2 For those interested and who have not followed the thread, see the following posts which mention arguments for and against the use of the extension (mods feel free to add others):
Personally, I find this post disingenuous.

For those who want to actually understand the various positions argued, please read this thread from the top.

Various "new phrases" came to mind during this reply. But I deleted them before hitting the "Submit Reply".

P.S. They might have involved a certain person, said to have walked on water, being invited to do unimaginable things to themself...

Last fiddled with by chalsall on 2021-03-22 at 01:32 Reason: s/new words/new phrases/; # Humor can be such a subjective thing...

2021-03-22, 02:30   #61
LaurV
Romulan Interpreter

"name field"
Jun 2011
Thailand

24006₈ Posts

Quote:
 Originally Posted by danc2 The current project does not use gpuowl yet
Sorry. Total brain fart on my side. Please mentally substitute "gpuOwl" with "cudaLucas" throughout my post when you read it. I was talking about cudaLucas and the Nvidia drivers. The situation I described (GPU work taking a CPU core) happened with cudaLucas and the Nvidia drivers in the past, but I got it mixed up somehow. I guess that's what happens when posting at midnight with a 39°C fever (I have been fighting a flu for a few days).

The topic stands. When I am using the "Colab CPU Notebook" I am getting 2.5-3.5 ms/iter, depending on the exponent and the CPU given (PRP-CF work), but when I am using the "Colab GPU and CPU Notebook", I am getting 4.5-5.5 ms/iter for the CPU work. In this case, a GPU like the P4 or the "half K80" doesn't help: they are very slow, even for LL DC work; they don't last long; and I don't get them often enough to finish any assignment in time, so the GPU work will not compensate for the lost CPU work. And a T4 is a waste for PRP/LL anyhow, as it is so good at TF.

Last fiddled with by LaurV on 2021-03-22 at 07:22

2021-03-22, 03:18   #62
danc2

Dec 2019

100011₂ Posts

Quote:
 Personally, I find this post disingenuous.
Although that may be the perception, it was not the intention. I personally gain no benefit from advocating a free extension. Users are busy, myself included, and I was hoping to spare people reading 6+ pages of forum posts in which unrelated topics are also discussed. Since your post did not discuss specifics either, it might have been more beneficial to simply say "read from the top", as what was said adds no meat to the discussion beyond what has already been said. Please feel free to PM me any specific posts if you'd like my previous post to appear more ingenuous.

Quote:
 The topic stands.
Understood. I am not qualified to respond to the hardware specifics beyond what supposition I supplied. I hope someone else may be able to answer your question better!

 2021-03-22, 06:46 #63 LaurV Romulan Interpreter     "name field" Jun 2011 Thailand 2·47·109 Posts

Nah, wait, wait, don't go! It wasn't about hardware. You didn't reply to my question: does anybody experience an increase in iteration time for the CPUs on Colab when the GPU is running with your "GPU and CPU" notebook? Because if so, you may be using the wrong driver, or it (or cudaLucas) may need a tweak, or you may need to download a different/older driver when you install the notebook. Or is this a Linux issue I am not aware of? Running locally (Windows; again, I have no Linux experience) does not show this behavior. With or without cL running, the P95 iterations take the same amount of time (regardless of whether it is a 2080Ti, 1080Ti, Titan, or 580 Fermi; these are the only cards I have now). Can anybody else answer/weigh in? Or is it only me running Teal's toys?

Edit: I edited the brain-farting post, just to have it right for future readers. My problem, as expressed in the above post: when I am using the "Colab CPU Notebook" I am getting 2.5-3.5 ms/iter, depending on the exponent and the CPU given (PRP-CF work), but when I am using the "Colab GPU and CPU Notebook", I am getting 4.5-5.5 ms/iter for the CPU work. In this case, a GPU like the P4 or the "half K80" doesn't help: they are very slow, even for LL DC work; they don't last long; and I don't get them often enough to finish any assignment in time, so the GPU work will not compensate for the lost CPU work. And a T4 is a waste for PRP/LL anyhow, as it is so good at TF.

So, if you want me to use your "GPU and CPU" toy, you have to fix the (whatever?) to make the GPU work come as additional work, not as "instead of CPU" work. 101 miles per hour is better than 100 miles per hour, but 101 miles per hour with a headache is not. Last fiddled with by LaurV on 2021-03-22 at 07:26
 2021-03-22, 07:15 #64 LaurV Romulan Interpreter     "name field" Jun 2011 Thailand 2806₁₆ Posts

Also, for Chris: maybe you didn't read the end of my post; there was an argument there that you should display the GPU before proceeding to download all the stuff (which takes ages). The reason is that if I see I don't have the "right" GPU, I can click the "factory default" button before waiting an eon and a bit more for all the stuff to download, and get a new (possibly better) GPU. This would save a lot of time at startup. In fact, following Teal's idea, you should check whether the stuff is already downloaded, and not download it every time. Keep it in Drive (i.e. not in "home" as it is currently; home is gone when you reset). Last fiddled with by LaurV on 2021-03-22 at 07:16
2021-03-22, 12:02   #65
tdulcet

"Teal Dulcet"
Jun 2018

71 Posts

Quote:
 Originally Posted by LaurV Question: Is it normal under Linux that the performance of the CPU degrades (to about a half) when gpuowl cudaLucas is running?
No. All the Colab VMs have one hyperthreaded CPU core with two threads. MPrime does not use hyperthreading by default when doing LL/PRP tests, so it will only use one of those CPU threads. On the GPU notebook, CUDALucas will use the other CPU thread, which will cause a small performance reduction for MPrime, but definitely not half. As with the GPUs, there is a large range in the performance of the CPUs provided, which I bet accounts for the majority of what you experienced. For wavefront first-time primality tests with both notebooks, I have gotten everything from around 25 ms/iter on the AVX-512 CPUs to 48 ms/iter on the FMA3 ones, although I usually get 35-40 ms/iter. I suspect that if you ran both our notebooks for a few weeks, you would find that your average ms/iter times were about the same.
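[Editor's note] The VM layout described above (one hyperthreaded core exposed as two logical CPUs) can be checked from a notebook cell with just the standard library. A minimal sketch; the physical-core count is Linux-only:

```python
# Report logical CPUs vs. physical core ids (Linux). On a typical Colab VM
# this shows 2 logical CPUs backed by 1 physical core (hyperthreading).
import os

def cpu_layout():
    """Return (logical_cpu_count, physical_core_count or None)."""
    logical = os.cpu_count()
    core_ids = set()
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("core id"):
                    core_ids.add(line.split(":", 1)[1].strip())
    except OSError:
        pass  # /proc not available (non-Linux)
    return logical, (len(core_ids) or None)
```

Note that counting distinct "core id" values ignores multi-socket machines (core ids repeat per package), which is fine for a single-socket Colab VM.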

Quote:
 Originally Posted by LaurV I never get a GPU which is good for PRP, so every time I connect, I use the script which they give as example in the intro:
Our "GPU and CPU" notebook includes nearly identical code to that and it will output the name of the current GPU on the very first line. Both our notebooks will output the name of the current CPU and more system information. They will also output counts of all previous GPUs and CPUs respectively.

Quote:
 Originally Posted by LaurV It doesn't worth to run the "GPU" version in those GPUs, they are extremely slow, never finish, and they'll slow the CPU too, to half of the speed. So, no go.
In my experience doing primality testing, even the slowest GPU (the Tesla P4) is over twice as fast as the fastest CPU available (one of the AVX-512 ones).

Quote:
 Originally Posted by LaurV The situation I described (GPU work taking a CPU core) happened with cudaLucas and Nvidia drivers in the past.
Yes, CUDALucas will generally use 100% of one of the two CPU threads on Colab. However, from my limited testing of GpuOwl on Colab, it will also use 100% of one of the CPU threads, so I do not think there is a driver issue. Our GPU notebook does not install any drivers, as that is handled by Colab.

Quote:
 Originally Posted by LaurV In this case, a GPU like P4 or the "half k80" doesn't help, they are very slow, even for LLDC work, they don't last long, and I don't get them so often to be able to finish any assignment in time, so the GPU work will not compensate for the lost CPU work.
For a category 4 LL DC test, you should have 360 days to complete it since the notebook is using our PrimeNet script, which uses the PrimeNet API. Regardless, even the slowest GPU on Colab should be able to complete an LL DC test in less than two weeks. The faster GPUs (the Tesla P100 and V100) should be able to do it in less than a day. There also should not be any "lost CPU work", as users can run both notebooks at the same time, so it would just be "additional CPU work".
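[Editor's note] A back-of-envelope check of that two-week figure; the exponent and iteration time below are illustrative assumptions, not measurements from this thread:

```python
# An LL test takes roughly p iterations for exponent p. With an assumed
# DC-wavefront exponent and an assumed slow-GPU iteration time:
exponent = 57_000_000   # assumed LL double-check wavefront exponent
ms_per_iter = 20.0      # assumed rate for a slow Colab GPU (e.g. Tesla P4)

seconds = exponent * ms_per_iter / 1000
days = seconds / 86400
print(f"estimated completion: {days:.1f} days")  # ~13 days, under two weeks
```

Even with generous slack in both assumptions, the estimate stays well inside a 360-day category 4 deadline; the binding constraint is GPU availability, not speed.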

Quote:
 Originally Posted by LaurV Or is only me running Teal's toys?
FYI, both our notebooks are just as much Daniel's as they are mine, although I appreciate the compliment. It was actually Daniel's idea to create them in the first place, and he did most of the initial work.

Quote:
 Originally Posted by LaurV So, if you want me to use your "GPU and CPU" toy, you have to fix the (whatever?) to make the GPU work come as additional work, and not as "instead of CPU" work.
I am not sure what you mean by this... With our "GPU and CPU" notebook, the GPU work is completely separate from the CPU work and users can select the worktypes independently. Both our notebooks are also designed to be used at the same time.

BTW, I have finished most of the necessary work for our GPU notebook to use GpuOwl. We are just waiting on Colab to upgrade to Ubuntu 20.04 so we can start testing and finish updating our PrimeNet script...

 2021-03-22, 13:44 #66 LaurV Romulan Interpreter     "name field" Jun 2011 Thailand 24006₈ Posts

This very nicely answers all my questions. Summary: cudaLucas taking one full CPU core and reducing CPU performance to half (that is what I experience, regardless of what you, being in the US and using a Pro account, say) is a Colab thing, and nothing can be done about it (drivers are handled by Colab). The part about using both notebooks at the same time does not apply to me; I can't do that unless I use multiple accounts, and that is what I was referring to when I said "headache". Also, "a GPU will complete whatever work in two weeks" only if you get it. If you get two hours today and two more after three days, then that work will never complete, and (due to separate CPU pools of assignments) it will bottleneck the CPU work. That is what I was referring to as "combine them together" in one of my first posts in this thread: if I get a GPU, do that; if not, do this. But use a common pool. Then the GPU work comes as additional, the 101st mile, not as a showstopper. Anyhow, thanks a lot for the notebooks, and for the answers. Good job. (My last sentence was intentionally formulated so you would "feel threatened" and reply ) Last fiddled with by LaurV on 2021-03-22 at 13:59


