mersenneforum.org

mrk74 2020-04-02 16:13

"Unable to connect to the runtime" - Colab
 
Does anybody know what the usage limits for Colab are? I've let it run for maybe 12 hrs a day for a few days. I haven't been able to connect to a GPU or TPU for about 14-15 hrs because of usage limits. All I get is "Unable to connect to the runtime." Would getting another key help, or is it tied to the login? ANY help in understanding this is greatly appreciated!

kriesel 2020-04-02 16:32

Moderator, please move this thread to the Cloud Computing subforum's Colab area. It's Colab-specific and unrelated to PrimeNet.

chalsall 2020-04-02 17:14

[QUOTE=mrk74;541601]Does anybody know what the usage limits for Colab are? I've let it run for maybe 12 hrs a day for a few days. I haven't been able to connect to a GPU or TPU for about 14-15 hrs because of usage limits. All I get is "Unable to connect to the runtime."[/QUOTE]

This is nominal behavior.

No one except Google (and perhaps not even their humans) knows exactly what the usage limits are. They seem to change over time -- sometimes suddenly, sometimes gradually.

Here's my empirical observation: once the initial "honeymoon" after a Google (read: Gmail) ***account*** starts using Colab is over, GPU availability tends to converge to a single instance run once per day. Lately, those runs have been lasting about 7 to 10 hours, tending toward the lower end.

Also, the time at which the GPU availability window "opens" seems to be the same each day. Some of my accounts get GPUs at around 1400 UTC, others not until around 2200. The accounts are always offered (and get) a CPU instance if the GPU isn't available.

[QUOTE=mrk74;541601]Would getting another key help, or is it tied to the login? ANY help in understanding this is greatly appreciated![/QUOTE]

Completely tied to the Gmail login. But Google doesn't seem to care how many accounts you use. I'm running eight (across three "humans" (read: VPNs)), and I'm on the free tier. I know at least one person on the paid tier running four concurrently, often almost 24/7.

Lastly, I've observed that new Colab users (even those on a newly created Gmail account) initially get about 12-hour runs of T4s or P100s, which can often be relaunched immediately. This lasts for two or three days, and then the usage is constrained as above.

May we live in interesting times...

mrk74 2020-04-02 17:28

[QUOTE=chalsall;541610]This is nominal behavior.

No one except Google (and perhaps not even their humans) knows exactly what the usage limits are. They seem to change over time -- sometimes suddenly, sometimes gradually.

Here's my empirical observation: once the initial "honeymoon" after a Google (read: Gmail) ***account*** starts using Colab is over, GPU availability tends to converge to a single instance run once per day. Lately, those runs have been lasting about 7 to 10 hours, tending toward the lower end.

Also, the time at which the GPU availability window "opens" seems to be the same each day. Some of my accounts get GPUs at around 1400 UTC, others not until around 2200. The accounts are always offered (and get) a CPU instance if the GPU isn't available.

Completely tied to the Gmail login. But Google doesn't seem to care how many accounts you use. I'm running eight (across three "humans" (read: VPNs)), and I'm on the free tier. I know at least one person on the paid tier running four concurrently, often almost 24/7.

Lastly, I've observed that new Colab users (even those on a newly created Gmail account) initially get about 12-hour runs of T4s or P100s, which can often be relaunched immediately. This lasts for two or three days, and then the usage is constrained as above.

May we live in interesting times...[/QUOTE]
If I'm being honest I have no idea what T4 or P100 means but thanks for the info! I guess I'll just have to keep trying till I can latch on.

Uncwilly 2020-04-02 19:11

[QUOTE=chalsall;541610] Lately, they've been running about 7 to 10 hours, tending to the lower range.[/QUOTE]I have been getting ~6-8 hours recently. But, using your tip about the "Factory Reset" option, I fish until I get a P100 or T4 before holding on to it. Also, if you miss the start of your 24-hour reset, it seems that when you do restart, you may reset your 24-hour window.
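
For anyone "fishing" like this, a minimal Python sketch of the check might look as follows. It assumes the notebook has `nvidia-smi` on its path (normal for a GPU runtime, but still an assumption here), and the helper names `gpu_name` and `worth_keeping` are made up for illustration. The list of preferred models simply mirrors the P100/T4 preference mentioned above.

```python
import shutil
import subprocess

# Models the posts in this thread describe as worth holding on to.
PREFERRED = ("P100", "T4")


def gpu_name():
    """Return the GPU model reported by nvidia-smi, or None if there is no GPU."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


def worth_keeping(name):
    """True if the model string names one of the preferred cards."""
    if not name:
        return False
    # Split on spaces and hyphens so "Tesla P100-PCIE-16GB" yields a "P100" token
    # and "Tesla P4" is not mistaken for a T4.
    tokens = name.replace("-", " ").split()
    return any(model in tokens for model in PREFERRED)
```

Calling `worth_keeping(gpu_name())` in the first cell would tell you whether to keep the session or do another factory reset; the reset itself still has to be done by hand from the Colab menu.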

pepi37 2020-04-02 19:36

[QUOTE=mrk74;541611]If I'm being honest I have no idea what T4 or P100 means but thanks for the info! I guess I'll just have to keep trying till I can latch on.[/QUOTE]


Those are Nvidia GPUs: very powerful and expensive cards. And very fast...

Chuck 2020-04-03 03:18

[QUOTE=chalsall;541610]
I know at least one person on the paid tier running four concurrently, often almost 24/7.

May we live in interesting times...[/QUOTE]

For a time on the paid tier I was getting 24 hour sessions. Lately they have been limited to 18 hours.

chalsall 2020-04-06 16:43

Colab just reset all runtimes...
 
Interesting...

So, just now I noticed that /all/ of my sessions stopped. Three GPUs and five CPU only.

Then I noticed on one of my Admin reports on GPU72 that *every* GPU72_TF Notebook user also suddenly stopped reporting work underway. This appears to have happened at around 1619 UTC.

I was able to relaunch all my instances. Those that were previously running GPUs were again allowed to get them -- two T4s and a P100.

I'm inferring that Google did some sort of an upgrade, and had to restart everything to accomplish this. So, anyone at console, you might want to try reattaching and restarting your sessions.

PhilF 2020-04-06 22:08

[QUOTE=chalsall;541953]Interesting...

So, just now I noticed that /all/ of my sessions stopped. Three GPUs and five CPU only.

Then I noticed on one of my Admin reports on GPU72 that *every* GPU72_TF Notebook user also suddenly stopped reporting work underway.[/QUOTE]

I bet you were a bit apprehensive at that point, huh? :ermm: :smile:

kuratkull 2020-04-07 08:02

CPU instances (for LLR64) have been stable/predictable for the last couple of weeks. I use two accounts with 4 instances each. They both get about 12 hours of runtime a day. Both accounts expire and become available again at roughly the same times.

linament 2020-04-08 14:32

No backend
 
For the first time, I received this message on Colab when attempting to run without a GPU: "Sorry, no backends available. Please try again later."

kuratkull 2020-04-08 14:55

[QUOTE=linament;542094]For the first time, I received this message on Colab when attempting to run without a GPU: "Sorry, no backends available. Please try again later."[/QUOTE]


Just happened to both of my accounts. Got backends again a short time later.

PhilF 2020-04-08 14:57

[QUOTE=linament;542094]For the first time, I received this message on Colab when attempting to run without a GPU: "Sorry, no backends available. Please try again later."[/QUOTE]

I had 2 CPU only sessions running which ended early. I think they are in the middle of making changes.

petrw1 2020-04-12 16:23

[QUOTE=linament;542094]For the first time, I received this message on Colab when attempting to run without a GPU: "Sorry, no backends available. Please try again later."[/QUOTE]

Same for me for the last few days.

chalsall 2020-04-12 16:45

[QUOTE=petrw1;542439]Same for me for the last few days.[/QUOTE]

Hmmm... I have /never/ not received a CPU. And the daily GPU allotment has been running for ten hours more often than seven the last couple of days.

Chuck 2020-04-13 00:36

[QUOTE=petrw1;542439]Same for me for the last few days.[/QUOTE]

Did you try the trick of changing the runtime type to TPU and attempting restart?

kuratkull 2020-04-13 06:32

[QUOTE=chalsall;542443]Hmmm... I have /never/ not received a CPU. And the daily GPU allotment has been running for ten hours more often than seven the last couple of days.[/QUOTE]


I am sure Google tries its best to make sure users don't see that. But I can see how instances could be unavailable for a short time during congestion. Other than that one brief time when it happened a few days ago, Google is pretty regular in always letting 1 instance run, while the others are time limited. (CPU only.)

kladner 2020-04-20 16:10

As expected, this morning my 4 instances were shut down. Just now, a couple of hours later, I got 1 CPU-only worker to start. A second one refused to start. Getting to see all the P-1 startup output was enlightening. With 10,240MB allowed, only 192 relative primes are being processed. Optimal bounds are B1=720,000, B2=13,320,000. Meanwhile, Prime95 on a 32GB machine is setting 765,000 and 13,593,750, and running 480 RPs. High memory notebooks have about 25.5GB available. Wouldn't it improve results to let mfaktc use more of the available RAM?

chalsall 2020-04-20 16:38

[QUOTE=kladner;543258]High memory notebooks have about 25.5GB available. Wouldn't it improve results to let mfaktc use more of the available RAM?[/QUOTE]

Hmmm... (And I presume you mean mprime using the RAM.)

To the best of my knowledge, only you and Chuck are running the paid tier. So such high-memory instances wouldn't be that common.

There's also the issue of assignment reissuing. If a high-memory instance got an assignment and started working with really high bounds, what would happen if it then was reassigned to a standard memory sized instance? mprime will still work at the already worked B1/B2 bounds, but at what efficiency? Perhaps George can weigh in on this.

I guess I could have my server code-path make assignment decisions based on the memory available at request time (the telemetry sent back does include this information).

Let me meditate on this a bit, and see if I can come to sane convergence in my head (can be a somewhat painful exercise -- arguing with myself)...
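
To make the idea concrete, here is a purely hypothetical Python sketch of choosing an assignment pool from the memory reported at request time. None of it reflects the actual GPU72 server code: the `Telemetry` type, the pool names, and the 20,000 MB cut-off are all assumptions, with the cut-off picked only because it sits between the roughly 12.7 GB and 25.5 GB figures mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class Telemetry:
    """Hypothetical shape of what an instance reports when asking for work."""
    instance_id: str
    available_ram_mb: int


# Assumed threshold separating standard from high-memory instances.
HIGH_MEMORY_THRESHOLD_MB = 20_000


def assignment_pool(report: Telemetry) -> str:
    """Pick which pool of P-1 work an instance should draw from."""
    if report.available_ram_mb >= HIGH_MEMORY_THRESHOLD_MB:
        return "high_memory"
    return "standard"
```

This does not address the reissue problem raised above: an assignment started with high bounds on a high-memory instance could still come back on a standard one, so the server would also need to remember which pool each assignment was issued from.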

kladner 2020-04-20 18:03

OK. Thanks Chris. I had not thought of the possibility of High RAM not being available on subsequent launches. The whole thing is mostly curiosity based. Good thing I am not a cat.

Chuck 2020-04-20 18:15

[QUOTE=kladner;543258]As expected this morning my 4 instances were shut down. Just now, a couple of hours later, I got 1 CPU-only worker to start. A second one refused to start. [/QUOTE]

My quota has settled into four 19-hour GPU sessions each day. After six hours I can restart them. I have only ever received P100s on the paid tier.

kladner 2020-04-20 18:27

[QUOTE=Chuck;543277]My quota has settled into four 19-hour GPU sessions each day. After six hours I can restart them. [B]I have only ever received P100s on the paid tier.[/B][/QUOTE]
Same here. I wonder if having multiple instances causes this, even if one is only using one at a particular time. I guess I could delete three of the four and mess around for a while to see if anything changes.

ATH 2020-04-20 19:02

[QUOTE=Chuck;543277]My quota has settled into four 19-hour GPU sessions each day. After six hours I can restart them. I have only ever received P100s on the paid tier.[/QUOTE]

I would love to run 4 GPU sessions but I'm afraid it is too good to be true and won't last and I risk losing my account. 4*19*30/24 = 95 days of P100 per month for $10.
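
The arithmetic can be checked with a few lines of Python. The 30-day month and the $10 price come from the post above; the variable names are just for illustration.

```python
sessions = 4             # concurrent GPU sessions
hours_per_session = 19   # hours each session runs per day
days_per_month = 30
monthly_price_usd = 10

# Total GPU time per month, expressed in 24-hour GPU-days.
gpu_days = sessions * hours_per_session * days_per_month / 24
cost_per_gpu_day = monthly_price_usd / gpu_days

print(gpu_days)                    # 95.0
print(round(cost_per_gpu_day, 2))  # 0.11
```

So at those session lengths the subscription works out to roughly 11 cents per P100-day.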

Chuck 2020-04-20 20:18

[QUOTE=ATH;543289]I would love to run 4 GPU sessions but I'm afraid it is too good to be true and won't last and I risk losing my account. 4*19*30/24 = 95 days of P100 per month for $10.[/QUOTE]

Indeed it is a remarkable bargain. If it becomes seriously restricted, I will just drop my subscription.

kladner 2020-04-20 21:44

I have noticed that a sign of one's time running out is when the RAM and Disk display gets stuck on Allocating, or something other than the graphic.

That started happening in the last half hour, say 21:15 UTC. I started to try to save them, but I got repeated Save Failed messages. Eventually, a dialog came up saying to save the notebook file and upload it. This is supposed to capture your progress since the last successful save. I have the saves, but currently Colab doesn't even get beyond a white screen with a spinner. System maintenance does seem like a plausible explanation. On the other hand, while a Time Out was not declared, it was in a range of hours when that might have been what tripped me. I'll try the uploads when it settles down (and lets me on again.) :sad:
22:10 UTC: The Welcome to Colab page won't load, either.

kladner 2020-04-21 01:21

Now I find that the site is reported to be up, but I still can't get there. Maybe I wore out my welcome. Is anyone else getting in?

PhilF 2020-04-21 02:40

[QUOTE=kladner;543332]Now I find that the site is reported to be up, but I still can't get there. Maybe I wore out my welcome. Is anyone else getting in?[/QUOTE]

I started 2 CPU-only sessions at 00:00 UTC without problems.

I don't currently use any GPU resources, so I can't comment on their availability.

chalsall 2020-04-21 02:42

[QUOTE=kladner;543332]Now I find that the site is reported to be up, but I still can't get there. Maybe I wore out my welcome. Is anyone else getting in?[/QUOTE]

There are about 20 GPU72_TF sessions running right now. Have you tried rebooting your computer?

kladner 2020-04-21 17:40

[QUOTE=chalsall;543337]There are about 20 GPU72_TF sessions running right now. Have you tried rebooting your computer?[/QUOTE]
Eventually, yes. I am running now on a second Google account. After some shuffling to get access to notebooks on my original Drive, I got four up and running (yesterday) and let them run until I was ready to crash (human, not machine!), then stopped them on concern that my other account is failing from overuse. Today, as I was starting up, I got some P100s, but also P4 and K80. After a lot of repeats I had P100s in place of the lower cards, but another problem cropped up. I started getting "too many sessions" warnings when I only had two notebooks running and tried to start a third.

Running 2 at the moment and letting the others 'rest'. I have not tried my original account yet today. Don't know if it has revived.

It now seems that my new Google account was being treated as a freebie. I went back to my original account and got 4 running, all with P100s. I still don't know if the old account can get to Colab on its own. I already had notebooks running when I switched logins. Had to restart them, of course, but everything worked out.

chalsall 2020-04-21 19:19

[QUOTE=kladner;543373]Today, as I was starting up, I got some P100s, but also P4 and K80. After a lot of repeats I had P100s in place of the lower cards, but another problem cropped up. I started getting "too many sessions" warnings when I only had two notebooks running and tried to start a third.[/QUOTE]

OK... As already documented, I'm on the free tier. I've never been able to run more than one instance per account. Not even just a CPU instance to do some software development in. Every running instance is associated with its own Google GMail account (often in different tabs in the same browser context, IP address, browser fingerprints, etc).

Because I'm not on the paid tier, I can't do any drill-down on the GUI and runtime differences. But if you got a P4 (yukky) or a K80 (why waste your time?), it suggests strongly that Colab considered those accounts/instances as unassociated with your paid status. I guess one test would be to see if you have access to high-memory environments.

I must again thank Google for providing this wonderful environment to learn in! We're not just TF'ing. Really! (Seriously and sincerely.) :smile:

chalsall 2020-04-21 20:26

VIM in Colab!
 
I just thought I'd quickly share something new I've discovered about Colab: VIM is available for the web-based GUI editor! I think this is relatively new.

Go to the "Tools" menu and select "Settings". In the pop-up select the Editor tab, and under the "Editor key bindings" drop-down you can now choose "vim".

Cool! :tu:

vi/vim is my nominal editor at the console(s), so this is wonderful for me. Cursor over to something, press "cw" (change word command for VIM) and away I type! Happy!

kladner 2020-04-21 22:17

Hi Chris,
I am just telling my adventures in this realm. I'm still not sure how Google sees these things (accounts) but [U]right now[/U], my instances were started from the original account, and I have been able to restart them without hangups. It is hard to run real experiments when you are only surmising about the underlying rules. This is all just a chance to screw around with a new toy, and boost my statistics. My tales are mostly along the lines of "THIS did not work for me. YMMV." There are too many unknown-unknown variables; I have not conceived of a way to get a clean experiment.

EdH 2020-04-21 23:47

[QUOTE=chalsall;543389]I just thought I'd quickly share something new I've discovered about Colab: VIM is available for the web-based GUI editor! I think this is relatively new.

Go to the "Tools" menu and select "Settings". In the pop-up select the Editor tab, and under the "Editor key bindings" drop-down you can now choose "vim".

Cool! :tu:

vi/vim is my nominal editor at the console(s), so this is wonderful for me. Cursor over to something, press "cw" (change word command for VIM) and away I type! Happy![/QUOTE]
This sounds excellent - I will have to learn how to do it. I've been using your tunnels with vi to make simple edits and looking for an easier way.

chalsall 2020-04-22 12:42

[QUOTE=EdH;543404]This sounds excellent - I will have to learn how to do it. I've been using your tunnels with vi to make simple edits and looking for an easier way.[/QUOTE]

Unfortunately for your use-case, it won't be of assistance. The VIM key-bindings are only for editing Sections within the Notebooks, not arbitrary files in the filesystem.

EdH 2020-04-22 13:24

[QUOTE=chalsall;543428]Unfortunately for your use-case, it won't be of assistance. The VIM key-bindings are only for editing Sections within the Notebooks, not arbitrary files in the filesystem.[/QUOTE]:sad:
I am glad you let me know, sparing me the frustration of trying.:smile:

kladner 2020-04-22 14:03

Last night I shut down the 2 notebooks running at the time on my original account and logged that one off. Fired up the new (unpaid) gmail account. Started the 2 notebooks that had not been running (I have 4 total). Imagine my delight when I was able to get a P100 and a T4. :w00t: These ran for 8-1/4 hours until I woke up, switched to the paid account, and fired up all 4 again. All are now crunching away on P100s. :smile:

I am hoping to be able to repeat this pattern for similar results. I am working on a theory of self-regulation to head off timeouts.

kladner 2020-04-22 22:51

I worked the same scheme just now. I logged out of one Google account and into the other. The first notebook came up with a P4, but after a couple of (factory) resets I got a T4. The second notebook launched took several more resets, but I now have 2 T4s running. :grin:

I ran 2 notebooks last night for about 8 hours. They had P100s, and I wasn't persistent in trying for more. That was my #s 3 and 4 notebooks. Tonight I'm running #s 1 and 2. I doubt it makes a difference anyway.

However, I can confirm that free accounts are able to get T4s, which I've never seen on Paid. I don't think I've tried repeated resets there, like I did tonight on Free.

It seems there is still some tie-in between accounts, as when I am running Paid x4, I can't get Free to load. It is the Too Many Sessions error. Of course, this is with both accounts logged in, so the connection is obvious.


Another EDIT: Could this thread maybe transmogrify into a General Colab Discussion? It has accumulated a lot of information, but newer users might not think to look here unless they had the same error. I am a fairly clumsy and ignorant supermod, but I can probably change the title without screwing up. Another question might involve merging other Colab-related threads, but that's another whole can of worms for me. :max: [Can O Worms Emo is just too big.]
Maybe I'll just start a thread and copy informative posts into it. Ideas?

kladner 2020-04-23 18:10

Thursday morning all notebooks are getting stuck on the self-test. Same result in Firefox and Chrome.

PhilF 2020-04-23 18:36

I just connected manually and got my normal CPU-only sessions, but the welcome page has changed. I bet they have made a change in their interface that has affected Chris's notebooks and/or sessions.

mrk74 2020-04-23 18:46

[QUOTE=kladner;543561]Thursday morning all notebooks are getting stuck on the self-test. Same result in Firefox and Chrome.[/QUOTE]


I've had P-1 running for about 4 hours just fine. Started a second instance and it got stuck on the self-test too. Notified Chris about it.


Edit: Just broke out of the self test and got TF + P-1

chalsall 2020-04-23 19:05

[QUOTE=PhilF;543564]I just connected manually and got my normal CPU-only sessions, but the welcome page has changed. I bet they have made a change in their interface that has affected Chris's notebooks and/or sessions.[/QUOTE]

Nope... An SPE. Sorry guys.

I had LG72D focusing on the high end of 100M. But then Oliver came in and reserved a huge batch of candidates to take up to 77.

I need to add a fall-back code path, such that if the targeted assignments aren't available, it will at least hand out /something/.
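
A fall-back of that kind could be sketched in Python roughly as follows. This is only an illustration of the idea, not the GPU72 implementation: the function name `hand_out` and the use of two in-memory queues are assumptions.

```python
from collections import deque


def hand_out(targeted, fallback):
    """Return the next targeted assignment, else any fallback one, else None."""
    for queue in (targeted, fallback):
        if queue:
            # Take the oldest waiting assignment from the first non-empty queue.
            return queue.popleft()
    return None
```

With this shape, an empty targeted queue (for example, because someone else reserved the whole range) no longer leaves workers stuck waiting; they get whatever other work is available, and `None` only when there is truly nothing to hand out.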

PhilF 2020-04-23 19:16

[QUOTE=chalsall;543568]I need to add a fall-back code path, such that if the targeted assignments aren't available, it will at least hand out /something/.[/QUOTE]

Well, I could use an ice cream cone. Just sayin....

kladner 2020-04-23 23:28

On the bright side, I got some T4 time on my second account this morning, though it took lots of resets. Could only get a P100 on a second instance. Poor poor pitiful me. I stopped that one at 5 hours in the interest of self-regulation and Google placation. I got offered K80s and P4s several times, so getting what I did made it a pretty good morning.

I did have to take a break for a while when all notebooks got stuck on Allocating memory. At the time I took it as one of those signs that one should sign out as gracefully as possible. Whether it's a systemic malfunction, or the Usage Hounds nipping at your heels, or something else, there are times when Things Just Aren't Working and you might as well call it quits.

Interesting note: right now I have a P-1 using 10.91 GB on Stage 2. That's the highest I've seen, and it turned the bar a darker orange. Available RAM is 12.72 GB.
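
Those figures put the P-1 run at roughly 86% of available RAM. For checking this from inside a notebook rather than from the RAM bar, a minimal Python sketch is below. It assumes the runtime is Linux and exposes `/proc/meminfo` with `MemTotal` and `MemAvailable` fields; the helper names are invented for this example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_fraction(fields):
    """Fraction of total RAM that is not currently available."""
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return (total - available) / total
```

In a notebook cell, `used_fraction(parse_meminfo(open("/proc/meminfo").read()))` would give the current fraction in use.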

kladner 2020-04-24 22:25

It seems 2200 UTC [I]might[/I] be a time to snag T4s. I made another account about that time and have 2 T4s running. :cool: Earlier I was not so fortunate.
Now have 4 accounts, though I have not succeeded in evading limits when logged into more than one. Switching between one paid (4 instances) and 3 freebies (2 instances) can work pretty well when the free accounts are sometimes getting T4s easily.

I have also come to think that limiting runs of whatever number of sessions to around 6 hours and then switching accounts may work around usage limits. :smile:
EDIT: 0300 UTC Saturday was not yielding any T4s on the free accounts, so I made do with 4 P100s on the paid account. :smile: 1120+ GHz/d for the 6 hours it will run before I switch off to another account in the morning.


All times are UTC.