2009-08-19, 16:02   #1189
gd_barnes
 
 
May 2007
Kansas; USA

23·5·263 Posts

Quote:
Originally Posted by mdettweiler
Ah, that would make sense. Both G8000 and G7000 crashed a couple of times after the outage, as always seems to happen after an outage. They seem to have stabilized now, though I've put them in a loop so that they'll restart if they do crash again.
They kept crashing, Max. Please don't use the phrase "couple of times" when you don't know how many times. Just look in the restart.txt file. There are multiple crashes. I looked at 3 AM CDT this morning and saw that both had crashed again within the previous few hours and had been automatically restarted by the loop.

You are bound and determined to gloss over this whole issue without taking a detailed look at the exact times and matching them up with when the rejected results were originally handed out. I took 2 hours last night to do that for you. How about looking into it this time, please?

Today, please calculate when the 26 rejected results were originally handed out. I saved them off under an obvious file name. Like I said, I only had time to look at the first 2-3, and those were handed out at 09:55-10:00 CDT on Aug. 18th. Simply take the time that the original result was returned and subtract the number of seconds that it took to return it.
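For anyone who would rather script that subtraction than do it by hand, here is a rough Python sketch. The timestamp format and the example values are assumptions for illustration only; they are not taken from the saved results file, so adjust the format string to whatever the file actually uses.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout; change it to match the real results file.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def handout_time(returned_at: str, seconds_taken: float) -> datetime:
    """Time the pair was handed out = time returned - seconds it took."""
    returned = datetime.strptime(returned_at, TIME_FORMAT)
    return returned - timedelta(seconds=seconds_taken)

# Made-up example: returned at 11:30:00 after 5400 seconds (1.5 hours).
print(handout_time("2000-01-01 11:30:00", 5400))  # 2000-01-01 10:00:00
```

Run it once per rejected result and you have the list of handout times to compare against the crashes.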

I'm not going to back off on this until we nail it down. I nailed down 10 rejected results to the original power outage. The other 46 still have no explanation. How do we know that they were the result of yet another crash? We don't. We need to match up exact crash times with the times at which the original pairs were handed out.
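The matching itself could be sketched roughly like this in Python. Everything here is an assumption for illustration: the timestamps are made up, the 10-minute window is an arbitrary choice, and the crash times would have to be pulled out of restart.txt by hand or by a parser that I'm not attempting to write here.

```python
from datetime import datetime, timedelta

def crashes_near(handed_out, crash_times, window=timedelta(minutes=10)):
    """Return every crash time within `window` of a handout time."""
    return [c for c in crash_times if abs(handed_out - c) <= window]

# Made-up timestamps, purely to show the shape of the check.
crash_times = [datetime(2000, 1, 1, 9, 58), datetime(2000, 1, 2, 1, 15)]
handouts = [datetime(2000, 1, 1, 9, 55), datetime(2000, 1, 1, 14, 0)]

for h in handouts:
    hits = crashes_near(h, crash_times)
    print(h, "->", hits if hits else "no crash nearby")
```

Any handout time that comes back with "no crash nearby" is one that still has no explanation.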

We seem to have gotten into this habit of glossing over these server problems and that habit needs to end.

I don't know if this will help but it can't hurt:

On port G8000 only, please increase the JobMaxTime to 2 days.

Please tell me how you safely stop the server to do this. If you can let me know how that is done, then I'll do it if it is needed in the future. I know how to change the JobMaxTime and how to restart the server, but I don't want to create a problem when I stop it.

Karsten, can we talk you into returning pairs normally instead of ~100 at a time about twice a day? If you need to do so many at a time, how about you write a script to do ~20 each hour for 5 hours or something like that? That may help some.
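Something along these lines might do it. This is only a Python sketch: `submit` is a hypothetical placeholder for however you actually return results (I'm not assuming any particular client command), and the batch size and pause are just the numbers from the suggestion above.

```python
import time

def batches(items, size):
    """Split items into consecutive chunks of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def return_in_batches(results, submit, size=20, pause_seconds=3600,
                      sleep=time.sleep):
    """Call submit() on each chunk, pausing between chunks (not after the last)."""
    chunks = batches(results, size)
    for n, chunk in enumerate(chunks):
        submit(chunk)  # hypothetical hook: replace with the real return step
        if n < len(chunks) - 1:
            sleep(pause_seconds)

# Dry run with a no-op sleep so it finishes immediately.
sent = []
return_in_batches(list(range(100)), sent.append, sleep=lambda s: None)
print(len(sent), "batches")  # 5 batches
```

With the default sleep left in place, 100 results would go back as 5 batches of 20 spread over about 4 hours.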


Gary

Last fiddled with by gd_barnes on 2009-08-19 at 16:06