mersenneforum.org  

Old 2021-10-21, 02:29   #111
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

3·37² Posts

My last post made me think of something:

Did you let CADO-NFS select the folder and name, or did you set those? If CADO chose the folder, it will be in the /tmp directory and will be lost in the case of a computer reboot (not the CADO restart described above). If the folder is in /tmp, make sure to make frequent backups of the folder elsewhere.
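
The check and backup described above could be scripted roughly as follows. This is a minimal sketch, not part of CADO-NFS: the function names `is_under_tmp` and `backup_workdir` are made up for illustration, and it assumes the work directory is an ordinary folder that can be copied while the job is paused or between writes.

```python
import shutil
import tempfile
from pathlib import Path


def is_under_tmp(workdir: Path) -> bool:
    """Return True if workdir is the system temp directory or lies inside it."""
    tmp_root = Path(tempfile.gettempdir()).resolve()
    resolved = workdir.resolve()
    return tmp_root in [resolved, *resolved.parents]


def backup_workdir(workdir: Path, backup_root: Path) -> Path:
    """Copy workdir into backup_root, overwriting files from an earlier copy."""
    target = backup_root / workdir.name
    # dirs_exist_ok (Python 3.8+) lets repeated backups reuse the same target.
    shutil.copytree(workdir, target, dirs_exist_ok=True)
    return target
```

Calling `backup_workdir` periodically (for example from a scheduled job) whenever `is_under_tmp` returns True would keep a copy that survives a reboot; the backup location is up to you.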
Old 2021-10-21, 05:38   #112
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

3²·563 Posts

Changing settings on the fly is no problem, via the snapshot file. Clients see nothing amiss; at worst they go into "waiting 10 sec to try again" mode for a few loops until the server is serving again.

Seems worth it to comment out the adjust_strategy line until after Q=120M.
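
Commenting out a line like that can be done by hand in any editor; as a rough sketch, it could also be scripted. This assumes the snapshot file is plain text with one `key = value` entry per line and `#` for comments. The helper name `comment_out` and the sample values are hypothetical.

```python
def comment_out(text: str, key_suffix: str) -> str:
    """Prefix '# ' to each active 'key = value' line whose key ends with key_suffix."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        is_active = "=" in stripped and not stripped.startswith("#")
        key = stripped.split("=", 1)[0].strip()
        if is_active and key.endswith(key_suffix):
            out.append("# " + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Made-up snapshot contents, for illustration only.
sample = "tasks.sieve.adjust_strategy = 2\ntasks.sieve.qrange = 10000\n"
edited = comment_out(sample, "adjust_strategy")
```

The edited text would then be written back to the snapshot file before the server is restarted from it. Removing the leading `# ` later restores the setting.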
Old 2021-10-21, 08:13   #113
kruoli
 
"Oliver"
Sep 2017
Porta Westfalica, DE

2·383 Posts

No worries, the workdir parameter is set. Good thing I did, because we are in the middle of an extended power outage here. Cell towers continue to function. We are experiencing Bft 10 winds and maybe something dropped on a power line. I cannot recall ever having an outage of more than a few minutes, but now it's already more than an hour. First world problems...

As soon as I get everything up again I will comment out the strategy setting in the snapshot file and restart from there.
Old 2021-10-21, 09:24   #114
kruoli
 
"Oliver"
Sep 2017
Porta Westfalica, DE

2FE₁₆ Posts

Everything should be up again with the new settings. Now, I'm seeing the 5.5GB memory usage VBCurtis recorded originally.
Old 2021-10-21, 11:11   #115
charybdis
 
Apr 2020

5×109 Posts

Thank you!

One more thing: the next time you restart the server, maybe to add adjust_strategy back in at 120M, it would be a good idea to change tasks.maxtimedout and tasks.maxfailed from their default values of 100 to something like 1000. A WU will time out every time a client leaves, and that'll almost certainly happen more than 100 times over the whole job. 100 WUs failing is less likely, but you've got less leeway on that because I left a machine with a hardware fault churning out errors for nearly a day.
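
A rough sketch of applying such a change to the parameter text before a restart, assuming a plain `key = value` file with `#` comments. The helper `set_params` is hypothetical; only the two parameter names and the value 1000 come from the post above.

```python
def set_params(text: str, updates: dict) -> str:
    """Replace the values of existing 'key = value' lines and append missing keys."""
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if "=" in stripped and not stripped.startswith("#"):
            key = stripped.split("=", 1)[0].strip()
            if key in remaining:
                line = f"{key} = {remaining.pop(key)}"
        out.append(line)
    # Keys that were not present in the file are appended at the end.
    out.extend(f"{key} = {value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# Made-up file contents, for illustration only.
sample = "tasks.maxtimedout = 100\nname = example\n"
edited = set_params(sample, {"tasks.maxtimedout": 1000, "tasks.maxfailed": 1000})
```

Lines that are commented out are left untouched, so a disabled setting is not silently re-enabled.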
Old 2021-10-21, 16:19   #116
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

3·37² Posts

I just added a client and found I did need to update my CADO-NFS. I think I had an http/https issue. Do I remember correctly that https support is more recent?

In any case, I appear to be well behind in my "up-to-datedness." It will take a bit to update all of my machines. . .

Thanks for the "make check" note. I was not aware of that, even after all my years using CADO-NFS.
Old 2021-10-22, 12:09   #117
charybdis
 
Apr 2020

5·109 Posts

Server seems to have been down for a while?
Old 2021-10-22, 12:12   #118
kruoli
 
"Oliver"
Sep 2017
Porta Westfalica, DE

2FE₁₆ Posts

It seems to have crashed. I cannot even ping the machine from my home network. I have no clue what could have caused that. Maybe we had another power outage when I was not at home.

I'll reboot it as soon as I am home (in around five hours).

Sorry.
Old 2021-10-22, 16:53   #119
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

3²·563 Posts

I think I'm happier to hear of a hardware crash than to think CADO still fails regularly.
Old 2021-10-22, 17:22   #120
kruoli
 
"Oliver"
Sep 2017
Porta Westfalica, DE

2·383 Posts

The system is up again and already receiving results.

When I came home, the machine was still in a high power state (as in: power usage was as if the machine was under full load), but there was no GPU output. The keyboard was still responding to NUM- and CAPS-lock changes, which usually does not happen if a machine really hangs. But I had no way to interact with the machine in a meaningful way.

I am anything but sure what happened. I do not think it is CADO's fault, but I cannot state that for sure. There are file changes in the system that occurred shortly before I pressed the machine's reset switch, so the kernel was working at least in part before I shut the machine down. So basically, it was running. It looks like all PCIe cards got disconnected, since GPU and network (dedicated card) both stopped working. This is only my best guess; it never happened before (XKCD 1068) and hopefully will not happen again. I will keep a closer eye on it.
Old 2021-10-22, 18:01   #121
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

1000000001011₂ Posts

Quote:
Originally Posted by VBCurtis View Post
I think I'm happier to hear of a hardware crash than to think CADO still fails regularly.
A recent run on my system using only local machines also stopped serving quite often, but the machine was always available via ssh and such. Everything was working except the serving of WUs. At the time, though, I was still using a September 2020 commit. As I remember, that is how the Team Sieve you ran behaved when it stopped serving. This does sound quite different.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.