#100 |
"Gary"
May 2007
Overland Park, KS
3×5×7×113 Posts |
I'm having a serious concern about the most recent stress test now. With the aforementioned 5 errors fixed, there should be no problems. But something happened that has now happened 3 times in a row:

1. The server "loses" several pairs right at the very beginning. In this case, it is pairs numbered 6 thru 10. The first 5 went through OK. (BTW, my cache was set to 10 for this test.)
2. The server "loses" a large # of pairs at the very end. In this case, 42 of them, which is the fewest that it's lost in any of the 3 stress tests I've run. (Likely because I was only running 4 cores vs. 31 cores.) Checking confirmed that it was the final 42 pairs.

They are just sitting in knpairs.txt and joblist.txt as though they were handed out and never processed. Yet checking my clients confirmed that they were processed.

I don't know if this is stress-related or related to problems in the Linux client/script. Since this occurred when running just one quad, which is effectively like 1000+ clients at n=~400K and so makes a pretty decent stress test, I may need to run the Windows client to see if it has the same problem. I can simulate a similar load with 4 cores of my I7 with the same knpairs loaded in the server.

I initially thought that it might be related to the fact that all of the first few pairs are prime, but the same issue seems to be happening at the beginning of the file as at the end.

For reference, I'm attaching the final knpairs that didn't process and the joblist. See a few posts back where I posted the entire knpairs file.

The prune period was set to 15 mins and the server dried some 9 hours ago, so these are not just some straggling pairs that still need to be received by the server.

Gary

Last fiddled with by gd_barnes on 2010-02-24 at 23:11 |
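A minimal sketch of the cross-check described above, assuming both the server's leftover knpairs file and a client-side list of completed tests can be reduced to plain `k n` lines. The function names `parsePairs` and `lostPairs` are hypothetical, the sample values are made up, and the real joblist.txt and client result formats are not shown in this thread. It targets a standalone Lua 5.1+ interpreter, not the Lua embedded in LLRnet.

```lua
-- Hypothetical helper: turn text with one "k n" pair per line into a set.
-- Lines that are not exactly two integers (for example a NewPGen-style
-- header line) are skipped.
function parsePairs(text)
  local set = {}
  for line in string.gmatch(text, "[^\r\n]+") do
    local k, n = string.match(line, "^%s*(%d+)%s+(%d+)%s*$")
    if k then
      set[k .. " " .. n] = true
    end
  end
  return set
end

-- Hypothetical helper: pairs the server still lists as pending even though
-- the client side reports them as completed. Returns a sorted list of
-- "k n" strings.
function lostPairs(serverPendingText, clientDoneText)
  local pending = parsePairs(serverPendingText)
  local done = parsePairs(clientDoneText)
  local lost = {}
  for key in pairs(pending) do
    if done[key] then
      lost[#lost + 1] = key
    end
  end
  table.sort(lost)
  return lost
end

-- Made-up sample data, for illustration only.
local pending = "1000000000000:M:1:2:258\n5 400001\n7 400003\n9 400005\n"
local done = "7 400003\n9 400005\n11 400007\n"
for _, pair in ipairs(lostPairs(pending, done)) do
  print("processed by client but still pending on server: " .. pair)
end
```

Anything this reports is a pair the clients finished but the server never recorded, which is the situation described for pairs 6 thru 10 and the final 42.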
#101 | |
"Gary"
May 2007
Overland Park, KS
3·5·7·113 Posts |
Quote:
That's a good idea on displaying that info. But if you do it, we need to make sure Max agrees and that he can change the Linux client. Gary |
#102 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
1100001101010₂ Posts |
Quote:
#103 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
2·5⁵ Posts |
Quote:
#104 |
A Sunny Moo
Aug 2007
USA (GMT-5)
2·5⁵ Posts |
#105 | ||
"Gary"
May 2007
Overland Park, KS
27131₈ Posts |
Quote:
Unfortunately I've already stopped the server, saved off the applicable files, and reloaded it. I'll have to try a smaller file to retest it in < 1 hour or so instead of waiting 6-7 hours for it to dry.
Quote:
Gary Last fiddled with by gd_barnes on 2010-02-25 at 06:05 |
#106 |
A Sunny Moo
Aug 2007
USA (GMT-5)
2×5⁵ Posts |
I've now tested the do.pl script on Windows for most of today (since around 10 AM EST), and have encountered no problems except the small factor issue. BTW, I did a bit of investigating on that and found a couple of things:

- The first of the four results (which came before the small factor in the batch) was accepted fine.
- On the server end, the small-factor result was received and accepted, though with the NewPGen header (!) put in place of the residual.
- The remaining 3 results in the batch, all of which came after the small-factor one, were rejected and subsequently thrown out by the client.

I'm not positive, but I think normal LLRnet is designed to be able to handle small factors correctly (though I haven't actually tested it). At any rate, though, no properly sieved file should ever have factors in it small enough for LLR to turn up; I imagine it wouldn't be a big deal if we didn't bother to fix it, since if there are small factors in the server, then there's a much bigger problem than just a few abandoned tests. Not to mention that if LLRnet doesn't have a precedent for handling these (as I said, I'm not sure if it does), then we wouldn't be able to fix it at all without adding code for it on the server end (which we probably don't want to get into).

Other than that, though, do.pl seems to be working perfectly. Gary, have you gotten the chance to test the latest version yet on Linux and do the stress test you were planning? |
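One way to keep a single bad result from taking the rest of a batch down with it would be a sanity check on each residue before the batch is submitted. The sketch below assumes a residue is either the single digit "0" used for primes or exactly 16 hexadecimal digits. `isPlausibleResidue` and `splitBatch` are hypothetical names and are not part of LLRnet or do.pl, and where such a check would hook into the client is not addressed here. It is written for standalone Lua 5.1+.

```lua
-- Hypothetical check: accept "0" (prime) or exactly 16 hex digits.
-- Anything else, such as a NewPGen header line, is treated as suspect.
function isPlausibleResidue(residue)
  if type(residue) ~= "string" then
    return false
  end
  if residue == "0" then
    return true
  end
  return #residue == 16 and string.find(residue, "^%x+$") ~= nil
end

-- Hypothetical helper: split a batch into results that look sane and
-- results that should be held back for inspection instead of being sent.
-- Each result is assumed to be a table with a `residue` field.
function splitBatch(results)
  local ok, suspect = {}, {}
  for _, r in ipairs(results) do
    if isPlausibleResidue(r.residue) then
      ok[#ok + 1] = r
    else
      suspect[#suspect + 1] = r
    end
  end
  return ok, suspect
end
```

With a filter like this, the results that followed the small-factor one in the batch would still be eligible for submission, and only the malformed entry would be set aside.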
#107 | |
"Gary"
May 2007
Overland Park, KS
3·5·7·113 Posts |
Quote:
The version that I posted yesterday is the latest version. Correct? lol It is that latest version that I ran my big stress test on yesterday. It was about halfway through the stress test that I changed a prime residue from 16 x's to a single digit of "0" like the Windows client.

What I want to test today is the same script for the cancellation of pairs and the problem with the pairs not processed at the beginning and end of the file by the server. I just got back in after a long day and need to do a couple of things yet. But I plan to test in the wee hours here for 2-4 hours.

BTW, I also observed what you did on a pair that had a factor of 5. It put the file header in the residue. You know what? I think that might explain why the 4-5 pairs right after it were not accepted by the server even though the client processed them. Bingo!

And...if what you said about the final pruning is causing them not to be processed at the end, well...that might explain completely what happened yesterday with the pairs that weren't processed by the server. That said, the server never showed the results for the missing pairs at the end so I'm questioning how a final prune would actually be able to work.

Agreed that a small factor should never happen on a reasonably sieved file. As a programmer though, it would be nice to code around it but not at the expense of a lot of extra time/testing. I'll see what the code looks like.

Gary

Last fiddled with by gd_barnes on 2010-02-25 at 06:17 |
#108 | |
"Gary"
May 2007
Overland Park, KS
27131₈ Posts |
Quote:
Karsten, I was looking to make this change to the residue for a prime in llrnet.lua on the Linux side but it appears to already default to a "0". Here is the code:

Code:
    -- perform prime test !
    if not asynchronous then
        Logout() -- logout before performing computation
    end
    -- UpdateStatus(format("Working on : %s/%s (%s)", k, n, t))
    -- print(format("Working on : %s/%s (%s)", k, n, t))
    -- result, residue = primeTest(t, format("%s %s", k, n))
    result, residue = 0, "0"
    -- check user interruption
    if stopCheck() then
        return -- return with no error
    end
    end
    SemaWait(semaphore)

What change is needed to accomplish what you are talking about?

Edit: The code in the Windows client is the same. Please enlighten me.

Last fiddled with by gd_barnes on 2010-02-25 at 06:26 |
#109 | |
Mar 2006
Germany
5·601 Posts |
Quote:
Try this: there are functions called PrunePairs() and PruneJoblist() (called in function ProxyUpdate). Have the server print a line such as print("PrunePairs Call 1") just before every place one of those functions is called (same for the other one), and give every call its own number, so you can tell which call invoked the function. Even better, put the date/time into it:

Code:
    print(format("PrunePairs Call #1: [%s] ", date("%Y-%m-%d %H:%M:%S")))

This lets you see where and when the server pruned. Perhaps this is the issue: it only prunes when results are received.

Last fiddled with by kar_bon on 2010-02-25 at 06:36 |
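The numbered, timestamped logging suggested above could be wrapped in a small helper so each call site needs only one extra line. This is a sketch under assumptions: `callLogLine` and `logCall` are hypothetical names, and it uses `string.format` and `os.date` from standalone Lua 5.1+, whereas the snippets in this thread use the bare `format` and `date` globals, so the embedded interpreter may need those instead.

```lua
-- Hypothetical helper: build the log line for one numbered call site.
-- `when` is optional; if omitted, the current time is used.
function callLogLine(name, callId, when)
  return string.format("%s Call #%d: [%s]", name, callId,
                       os.date("%Y-%m-%d %H:%M:%S", when))
end

-- Hypothetical helper: print the log line and return it.
function logCall(name, callId)
  local line = callLogLine(name, callId)
  print(line)
  return line
end

-- Intended use inside llrserver.lua (call sites and arguments are not
-- shown in this thread, so these lines are left as comments):
--   logCall("PrunePairs", 1)    -- then the existing PrunePairs call
--   logCall("PruneJoblist", 2)  -- then the existing PruneJoblist call
```

If the log only ever shows prune calls right after results arrive, that would support the idea that pruning is tied to receiving results.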
#110 | |
"Gary"
May 2007
Overland Park, KS
3·5·7·113 Posts |
Quote:
Don't you ever sleep? lol I'm going to need some help with this. I'm not clear on where in llrserver.lua it goes. Can you post an updated llrserver.lua file with this change in it? |