#34 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
Quote:
Maybe my theory about the personal proxy is correct, but the server seemed back to normal again because it finally released all the sockets about the time Carlos reported that the server was back online? Anyway, restarting the server should clear out any bugs left in the system. (Why do our servers always seem to go down right before a rally is to be held on them?)
#35 | |
May 2007
Minnesota USA
110001₂ Posts |
Quote:
#36 |
Jul 2007
Tennessee
2⁵·19 Posts |
I don't know what was going on. Everything seemed OK but something had obviously happened. Does anyone have the client console log from when this started?
It seems like the client should continue to process the WUCache even if the server goes completely offline.
Last fiddled with by AES on 2008-05-23 at 15:13
#37 | |
Sep 2004
2·5·283 Posts |
Quote:
I was lucky because I was using the machine when all my clients got stuck; please see the time of my post. I don't have the log. Correct, but that doesn't happen...
Last fiddled with by em99010pepe on 2008-05-23 at 15:39
#38 |
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
4267₁₀ Posts |
I've added one core. The other will go on in about half an hour, once it finishes balancing the time remaining between my two cores.
#39 |
Dec 2005
313 Posts |
Not a good morning at all.......... :(
Twenty-nine llrnet instances locked up; one was still running. The running one had its cache set to 50 and was using the PG llrnet app. Two other PG llrnet app instances, with the cache set to 15 and refill at 5, were locked up. All the rest were using the standard sr2 llrnet app and were locked up too.
So it raises a question about the PG llrnet app: why did the small-cache instances lock up when the 50-cache instance didn't? I don't think the issue is the PG client failing to keep running when the "event" happened. I think the server or network clogged during the event, and then the clients hung because they couldn't get more work. My two bits' worth is that the load on the server system/net just got too heavy.
I'm still not running, as I haven't gone through and cleared everything.
#40 |
May 2007
Kansas; USA
11·937 Posts |
I haven't checked my main machines yet but at least 2 of my slower cores are sleeping on LLRnet right now. I'll make the rounds and see if others are having the same problem.
Unfortunately I have the cache set at 1 for all of them.
Gary
#41 |
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
It looks like the PrimeGrid client doesn't respond any differently than the stock client in regard to freezing up--whereas the "normal" Linux client will empty its workfile.txt cache before freezing up, the PrimeGrid version freezes up when it reaches its refill value--essentially, they freeze when they're supposed to get more work. (This is in contrast to the Windows client, which freezes with full cache and 99% progress.)
Going by gut feeling, I still think this might have something to do with the fact that personal proxies tend to lock up Windows LLRnet servers. After the rally, I think I'll try running a test on my local network to see if I can reproduce a situation like the freeze-up we had earlier today.
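To make the cache/refill behavior discussed above concrete, here is a toy simulation. It is only a sketch and is not taken from the LLRnet or PrimeGrid client code: the function name `simulate_client` and all of its parameters are hypothetical. It assumes a client that starts with a full cache, asks the server for more work once the cache drops to the refill value, and either hangs on that request (`blocking_refill=True`, like the freeze reported here) or keeps crunching the remaining cache (`blocking_refill=False`, the behavior several posters expected).

```python
def simulate_client(cache_size, refill_at, server_down_from, steps, blocking_refill):
    """Toy model of one client; returns the number of work units completed.

    All names here are hypothetical and chosen for illustration only.
    The server answers refill requests before step `server_down_from`
    and is unreachable from that step onward.
    """
    cached = cache_size  # assume the client starts with a full cache
    completed = 0
    for step in range(steps):
        if cached <= refill_at:
            server_up = step < server_down_from
            if server_up:
                cached = cache_size  # successful refill tops the cache back up
            elif blocking_refill:
                return completed  # client hangs waiting on the dead server
        if cached > 0:
            cached -= 1  # process one cached work unit per step
            completed += 1
    return completed


if __name__ == "__main__":
    # Server is down for the whole run (server_down_from=0).
    print(simulate_client(15, 5, 0, 100, blocking_refill=True))   # 10
    print(simulate_client(15, 5, 0, 100, blocking_refill=False))  # 15
    print(simulate_client(50, 5, 0, 100, blocking_refill=True))   # 45
```

Under these assumptions a cache of 15 with refill at 5 stalls after 10 work units, while a cache of 50 keeps going for 45, which would be consistent with the large-cache instance still running when the others had already locked up.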
#42 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
14151₈ Posts |
Quote:
As a rule of thumb, I keep a minimum cache of 5 for numbers this size on my Core 2 Duo. (That is, a cache of 5 on each core.) In fact, I usually use a cache size of 10 just to give myself some extra "padding" in case my internet connection skips out (which it does, for brief intervals, regularly; this is why I do only manual work while I'm on vacation).
#43 |
May 2007
Kansas; USA
10307₁₀ Posts |
Bad...bad...bad...
All 18 of my LLRnet port 300 cores have been sleeping since 5:45 AM EDT this morning. (Ugh, HUGE loss of CPU cycles!) I've now 'killed' the LLRnet instances on all of them and tried to restart them. No luck...
We need to take some fast action, with less than 1-1/2 hours before the rally. David (Ironbits) or Adam, we need to think about setting up another temporary server and loading it up with some work for the rally. Perhaps only Bruce and I can run on the temporary server, since we're the 2 heaviest hitters (I think) running the rally.
I'll be out for about a half hour here. If everyone can post their thoughts here, that would be great. I don't want a repeat of our 2nd rally.
Gary
#44 |
May 2007
Kansas; USA
10307₁₀ Posts |
I've sent a note to IronBits to be on standby to set up a temporary LLRnet server for the rally.
Adam, I've killed all instances of the LLRnet server on my machines. Can you restart your server? I'm hoping that taking 18 instances of it away will help.
Gary