#1 |
Bemusing Prompter
"Danny"
Dec 2002
California
100111000111₂ Posts |
For a few days, the server showed no data. The data is back, but the formatting of the page has changed.
What happened? |
#2 |
Jul 2004
Nowhere
1451₈ Posts |
The server was having issues where the rounding used to lay out the columns was affected by the newly released ranges...
|
#3 |
Bemusing Prompter
"Danny"
Dec 2002
California
2,503 Posts |
The new formatting of the columns is much harder to read. :\
|
#4 |
Jul 2004
Nowhere
809 Posts |
Yes, it is, but sacrifices need to be made. The good thing is that v5 is becoming a reality.
|
#5 |
Jul 2004
Milan, Ita
2²×3²×7 Posts |
... but apparently it is just a matter of an extra "\n" occurring after each line and not only after every 5 or so.
Nothing really important, though, just a little bit disturbing to the eye... |
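To illustrate the symptom described in this post, here is a minimal Python sketch of a report formatter that is meant to leave a blank line after every fifth row. The function name `format_report` and its `group` parameter are hypothetical and not taken from the PrimeNet code; the sketch only shows how a grouping value of 1 instead of 5 would produce an extra "\n" after every line.

```python
def format_report(rows, group=5):
    """Join rows with newlines, adding a blank line after every `group` rows.

    Hypothetical helper for illustration only; not the server's real code.
    """
    out = []
    for i, row in enumerate(rows, start=1):
        out.append(row)
        # Insert a separator line between groups, but not after the last row.
        if i % group == 0 and i != len(rows):
            out.append("")
    return "\n".join(out)


rows = ["row%d" % n for n in range(1, 8)]
intended = format_report(rows, group=5)  # one blank line, after row5
observed = format_report(rows, group=1)  # a blank line after every row
```

With `group=1` every row is followed by an empty line, which matches the "harder to read" layout reported earlier in the thread.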
#6 |
Bemusing Prompter
"Danny"
Dec 2002
California
9C7₁₆ Posts |
Someone fixed it now. :)
|
#7 | |
Dec 2002
2²·3·73 Posts |
![]() Quote:
|
|
#8 | |
Aug 2002
3·43·67 Posts |
![]() Quote:
![]() |
|
#9 |
P90 years forever!
Aug 2002
Yeehaw, FL
2040₁₆ Posts |
He was vague on the details, but SK found a race condition or deadlocking problem deep in the kernel somewhere. He added a mutex to prevent the condition a week ago. So far so good.
He also met with his ISP yesterday. It seems we were on a 10Mbps connection when we should have been on a 100Mbps one. Maybe there were other issues too - I don't know. |
#10 |
Jan 2003
Altitude>12,500 MSL
65₁₆ Posts |
Details for the curious:
The mutex was a tool for finding the deadlock. It also allows a graceful error 23 (server busy) condition to surface if, after a 10 second timeout, the client can't get into the server's transaction queue, so I left it in place.
The deadlock happened in NT kernel.dll from using the Win32 timeSetEvent() API within CriticalSection code when a timer event handler was set up, but not every time. It's not clear why this should be a problem, but there's quite a bit of online material describing similar issues with these APIs. v4 uses delay timers for a variety of utility functions, but in this particular case the timer was for the rollback of an assignment if the client failed to respond with an estimated completion date within two minutes.
v4 uses a thread from a pool for each client transaction, so by the time the deadlock condition was evident, other client threads had started and become queued at various CriticalSections, and the deadlocked call stack was too deep to debug with my tools. The mutex serialized the clients in the CGI layer instead of at the v4 service layer CriticalSections, so there were far fewer threads and relatively shallow stacks, which made it possible to find where the deadlock originated.
The actual fix drops use of the rollback timer and instead allows the automated daily cleanup to handle the unacknowledged assignment rollbacks. The cleanup is not in the same service layer, so there's no risk of a similar deadlock in the same area.
The ISP normally is really on top of things, but during a routine review of bandwidth settings their staff mixed up KB/s and Kbps, clamping bandwidth at 1/10th its nominal rate. This apparently happened to several other clients of theirs, too, but they had nowhere near the traffic we do and didn't notice it. For a while I thought something was wrong with the server because I knew from past experience it should be really fast. After a process of elimination it checked out clean, and I had the ISP check the bandwidth limit and fix it.
The summary report having extra newlines was a formatting goof: we accidentally had a -1 exponent in an unsigned int column of the database (it becomes a really big positive number), and I thought the report needed adjusting to handle a real exponent. |
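As a small illustration of why a -1 in an unsigned int column "becomes a really big positive number", the Python sketch below mimics C-style wraparound. It assumes a 32-bit column, which the post does not state, and `store_unsigned32` is a hypothetical helper rather than anything from the server or its database.

```python
import ctypes


def store_unsigned32(value):
    """Mimic storing an integer in a 32-bit unsigned field (assumed width).

    ctypes performs no overflow check, so negative values wrap modulo 2**32.
    """
    return ctypes.c_uint32(value).value


wrapped = store_unsigned32(-1)
print(wrapped)            # 4294967295
print(len(str(wrapped)))  # 10 digits, far wider than a typical exponent column
```

A ten-digit value in a column laid out for ordinary exponents would plausibly look like a report that needed wider formatting, which fits the mix-up described above.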
#11 |
Dec 2002
2²×3×73 Posts |
Great, this will have a positive impact on the whole community. One more favor to ask: do you have some time to update the stats page of the PrimeNet server? Currently the throughput graph only runs up to some point in 2004.
|
Thread | Thread Starter | Forum | Replies | Last Post |
What happened to the M47? | dabler | News | 16 | 2018-06-12 21:45 |
What happened with Dubslow? | aketilander | Lounge | 13 | 2013-08-31 00:35 |
n=500000 what happened? | cipher | Twin Prime Search | 2 | 2009-07-15 01:15 |
What happened to my stats? | BranMuffin | PrimeNet | 4 | 2008-11-19 22:33 |
What happened? | ThomRuley | Lone Mersenne Hunters | 7 | 2003-07-31 16:31 |