
2019-09-18, 03:26   #78
Dylan14

"Dylan"
Mar 2017

2·293 Posts

Quote:
 Originally Posted by Dylan14 This error would suggest that perhaps / is read only on Kaggle (as per the replies here: https://www.linuxquestions.org/quest...es-4175619721/). But that doesn't make sense, since we are able to write to the disk to run the bootstrap script.

I figured the issue out. Just have to call

Code:
!chmod 777 /tmp
before calling the apt-get command and then it works fine on Kaggle:

Code:
Ign:1 http://deb.debian.org/debian stretch InRelease
Get:2 http://security.debian.org/debian-security stretch/updates InRelease [94.3 kB]
Get:3 http://deb.debian.org/debian stretch-updates InRelease [91.0 kB]
Get:4 http://deb.debian.org/debian stretch Release [118 kB]
Get:5 http://packages.cloud.google.com/apt cloud-sdk InRelease [6337 B]
Get:6 http://deb.debian.org/debian stretch Release.gpg [2365 B]
Get:7 http://security.debian.org/debian-security stretch/updates/main amd64 Packages [503 kB]
Get:8 http://packages.cloud.google.com/apt cloud-sdk/main amd64 Packages [86.7 kB]
Get:9 http://deb.debian.org/debian stretch/main amd64 Packages [7086 kB]
Fetched 7678 kB in 2s (3565 kB/s)
Reading package lists... Done
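For anyone who would rather do the permission fix from Python than from a shell escape, here is a minimal sketch. It is not the Bootstrap script's code; the helper names are made up for illustration. It sets mode 1777 (world-writable plus the sticky bit), which is the conventional mode for /tmp, whereas the command above uses plain 777.

```python
import os
import stat
import tempfile


def is_world_writable(path):
    """Return True if 'others' have write permission on path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


def ensure_world_writable(path):
    """Hypothetical helper: open up path the way `chmod 1777` would.

    Returns True if the mode had to be changed, False if it was already fine.
    """
    if is_world_writable(path):
        return False
    os.chmod(path, 0o1777)  # rwx for everyone, plus the sticky bit
    return True


if __name__ == "__main__":
    # Demonstrate on a scratch directory rather than the real /tmp.
    with tempfile.TemporaryDirectory() as scratch:
        os.chmod(scratch, 0o700)
        print(ensure_world_writable(scratch))
        print(is_world_writable(scratch))
```

On Kaggle the call would presumably be `ensure_world_writable("/tmp")` before running apt-get, assuming the notebook user is allowed to change that directory's mode.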

2019-09-18, 12:23   #79
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

2·2,969 Posts

I believe most people recommend running apt-get update as root with sudo. I don't know whether that is an option on this system. It might struggle for permissions on writing the final files as well as the temporary files (I believe this is the normal reason for the root permissions).

2019-09-18, 13:39   #80
chalsall
If I May

"Chris Halsall"
Sep 2002

10011101000001₂ Posts

Quote:
 Originally Posted by Chuck I have just observed what is causing the "wide" Kaggle output. When the uptime goes beyond "23:59", it starts outputting "1 day, 4 min" etc. These additional characters are causing the line wrap.
OK, thanks for bringing that forward. I'm now getting the raw uptime from /proc/, and rendering it as HH:MM.
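A rough sketch of that approach, for the curious. This is not the actual Bootstrap code; the post only says "/proc/", so reading /proc/uptime specifically is an assumption, and the function names are invented here. The first field of /proc/uptime is the number of seconds since boot, and the hours are simply allowed to run past 23 so the field never grows a "1 day," prefix.

```python
def parse_uptime_seconds(text):
    """Return the first field of /proc/uptime content (seconds since boot)."""
    return float(text.split()[0])


def format_uptime(seconds):
    """Render an uptime in seconds as HH:MM, letting hours exceed 23."""
    total_minutes = int(seconds) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}"


def read_uptime(path="/proc/uptime"):
    """Read the uptime file (assumed location) and format it as HH:MM."""
    with open(path) as handle:
        return format_uptime(parse_uptime_seconds(handle.read()))
```

With this, an uptime of 86640 seconds (the "1 day, 4 min" case) renders as `24:04`.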

2019-09-18, 13:40   #81
chalsall
If I May

"Chris Halsall"
Sep 2002

2741₁₆ Posts

Quote:
 Originally Posted by Dylan14 I figured the issue out. Just have to call ... before calling the apt-get command and then it works fine on Kaggle:
OK, thanks for this improvement. Applied.

2019-09-18, 15:32   #82
Chuck

May 2011
Orange Park, FL

38216 Posts

Kaggle checkpoints

Since we are only allowed 30 hours of GPU time per week, and 9 hours of connect time per session, if checkpoint restarts are going to work they will have to be saved for about five days. This assumes I will use my 30 GPU hours the first two days of each week.

And shouldn't the process begin with looking for checkpoint files instead of assigning new work? I am getting a lot of abandoned checkpoints building up. (I noticed this morning that Colab started out with a checkpoint file; perhaps this has already been addressed).

Last fiddled with by Chuck on 2019-09-18 at 15:47

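The "look for checkpoints first" idea could be sketched roughly as below. This is only an illustration of the ordering being asked for, not how the GPU72 scripts work: the `.ckp` suffix, the directory layout, and the `request_new_assignment` callback are all placeholders.

```python
import os
import tempfile


def choose_work(checkpoint_dir, request_new_assignment):
    """Prefer resuming a saved checkpoint over fetching new work.

    The ".ckp" suffix and the callback are placeholders for illustration.
    """
    checkpoints = sorted(
        (entry.path for entry in os.scandir(checkpoint_dir)
         if entry.is_file() and entry.name.endswith(".ckp")),
        key=os.path.getmtime,
    )
    if checkpoints:
        return "resume", checkpoints[0]  # oldest checkpoint first
    return "new", request_new_assignment()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        fetch = lambda: "new-assignment"  # stand-in for a server request
        print(choose_work(workdir, fetch))  # nothing saved yet
        open(os.path.join(workdir, "example.ckp"), "w").close()
        print(choose_work(workdir, fetch)[0])  # a checkpoint now exists
```

Picking the oldest checkpoint first is a design choice made here so that long-idle work gets finished before anything newer; the real system may order things differently.
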
2019-09-18, 15:47   #83
chalsall
If I May

"Chris Halsall"
Sep 2002

13·773 Posts

Quote:
 Originally Posted by Chuck Since we are only allowed 30 hours of GPU time per week, and 9 hours of connect time per session, if checkpoint restarts are going to work they will have to be saved for about five days. This assumes I will use my 30 GPU hours the first two days of each week.
Assignments with checkpoint data will never be expired. Or, at least, there's no code for that currently -- will probably be needed in the future to deal with abandoned "Anonymous" assignments.

So if you eat your 30-hour allotment quickly, the assignments with work done will stick around for you to pick up whenever you next launch an instance.

Keep in mind also that your Colab worker(s) will be given any assignments not reported on for 12 hours, so old assignments hanging around shouldn't really be an issue.
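
As a rough illustration of the 12-hour rule described above (the server does this in SQL, which is not shown in this thread, so everything below is a hypothetical stand-in): an assignment becomes eligible for handing to another worker once its last report is at least 12 hours old. Whether the real cut-off is inclusive is an assumption.

```python
from datetime import datetime, timedelta

REASSIGN_AFTER = timedelta(hours=12)  # figure quoted in the post above


def reassignable(last_reports, now, threshold=REASSIGN_AFTER):
    """Return ids of assignments with no report for at least `threshold`.

    last_reports maps an assignment id to the datetime of its last report.
    """
    return sorted(
        assignment_id
        for assignment_id, last_report in last_reports.items()
        if now - last_report >= threshold
    )


if __name__ == "__main__":
    now = datetime(2019, 9, 18, 15, 47)
    reports = {
        "idle": now - timedelta(hours=13),
        "active": now - timedelta(hours=2),
    }
    print(reassignable(reports, now))
```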

2019-09-18, 15:50   #84
chalsall
If I May

"Chris Halsall"
Sep 2002

13·773 Posts

Quote:
 Originally Posted by Chuck I am getting a lot of abandoned checkpoints building up. (I noticed this morning that Colab started out with a checkpoint file; perhaps this has already been addressed).
OK... It's entirely possible I've done something stupid.

I'm watching the logs; let me observe what's happening...

2019-09-18, 21:51   #85
chalsall
If I May

"Chris Halsall"
Sep 2002

23501₈ Posts

Quote:
 Originally Posted by chalsall I'm watching the logs; let me observe what's happening...
OK... Something weird is going on with regard to reassigning you your old candidates; haven't figured out why yet. The work definitely isn't "lost" -- I just need to figure out the stupid mistake I've made in the SQL. Still working on it.

For anyone running an instance (or two...), I have just "pushed" the latest production Bootstrap package. This has been regression tested, and it's sane.

I've tightened up the log output, to be as dense as it can be, while still containing the data. I've moved the "ETA" field to be immediately after "% Done" -- seemed more logical.

The spider is now returning the observed GHzD and ItrTime data to the server. This is to be able to calculate estimated completions (not coded yet on the server).
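
Since the server-side estimate isn't coded yet, the following is only a guess at the kind of calculation the reported ItrTime would feed: remaining iterations multiplied by the observed time per iteration. The function name, the iteration counts, and the assumption that ItrTime is in seconds per iteration are all illustrative.

```python
from datetime import timedelta


def estimate_remaining(total_iterations, done_iterations, seconds_per_iteration):
    """Estimate time left as remaining iterations times time per iteration."""
    if not 0 <= done_iterations <= total_iterations:
        raise ValueError("done_iterations must be between 0 and total_iterations")
    remaining = total_iterations - done_iterations
    return timedelta(seconds=remaining * seconds_per_iteration)


if __name__ == "__main__":
    # 1000 iterations in total, 250 done, 2 seconds each: 1500 seconds left.
    print(estimate_remaining(1000, 250, 2.0))
```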

Anyone launching future instances will pick up this new code. For anyone currently running an instance, it is safe to stop and then relaunch.

This is ***so*** cool!

2019-09-18, 22:48   #86
Uncwilly
6809 > 6502

"""""""""""""""""""
Aug 2003
101×103 Posts

29·349 Posts

So, I am not a Linux or Python person (my bad). I do have access to the Colaboratory through a corporate g-mail / g-suite package. If I want to set up to run, are there some hand-holding instructions on how to? I have been paying some attention, but much of the code is lost on me.

2019-09-18, 23:28   #87
chalsall
If I May

"Chris Halsall"
Sep 2002

13×773 Posts

Quote:
 Originally Posted by Uncwilly If I want to set up to run, are there some hand-holding instructions on how to? I have been paying some attention, but much of the code is lost on me.
We've gotten to the point that code isn't really involved, other than one copy-and-paste.

Just Create a new Assignment Key, and then log into Colab and/or Kaggle to paste the code, and then click Run. That's it.

This presumes you already have a GPU72 account. And, of course, a Primenet account to which to submit results (that part isn't automated yet).

Please give it a whirl. I like to see the code paths exercised, to find those corner cases!

2019-09-19, 00:01   #88
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006
Saskatchewan, Canada

3×1,619 Posts

Last evening I started my weekly 30 hours. 2 commits and 1 run all. Before lunch today my 30 hours were all gone.


All times are UTC.