mersenneforum.org  

2003-11-08, 19:50   #1
GP2
Sep 2003
5×11×47 Posts
Early double-checking to determine error-prone machines?

From examining historical data, we know that most machines have a 0% error rate, but a certain percentage have much higher rates.

We also know that some errors can be detected by examining the error code returned with each result, but sometimes a result is erroneous even if it was returned with a 00000000 error code. In fact, 54% of all verified-bad results were returned with a 00000000 error code, and that percentage rises to 60% if we include other "harmless" error codes (AB00AB00 and 00XX0000). [Note: in calculating these percentages we exclude old results that did not report error codes, as well as high-bit 8XXXXXXX error codes].
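
As a rough sketch of how such a tally could be computed (Python; the function name and input layout are hypothetical, and the matching assumes that X stands for any hex digit while AB00AB00 is taken literally, which may not be how the real data should be read):

```python
import re

# "Harmless" codes named above. Assumptions: X is any hex digit,
# and AB00AB00 is matched as a literal string.
HARMLESS = re.compile(r"^(00000000|AB00AB00|00[0-9A-F]{2}0000)$")


def harmless_share(bad_codes):
    """Return (share with 00000000, share with any harmless code).

    bad_codes: error codes of verified-bad results, as 8-hex-digit strings,
    or None for old results that reported no error code.
    Results without a code and 8XXXXXXX codes are excluded from the totals.
    """
    counted = [
        code.upper()
        for code in bad_codes
        if code is not None and not code.upper().startswith("8")
    ]
    if not counted:
        return 0.0, 0.0
    zero = sum(code == "00000000" for code in counted)
    harmless = sum(bool(HARMLESS.match(code)) for code in counted)
    return zero / len(counted), harmless / len(counted)


# Made-up sample, not real GIMPS data.
sample = ["00000000", "00000000", "00120000", "01000000", None, "80000001"]
print(harmless_share(sample))
```

Only four of the six sample entries are counted, because the missing code and the 8XXXXXXX code are dropped before the shares are taken.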

Thus, some error-prone machines may be going undetected. This problem is worsened by the fact that a large gap exists between the leading edges of first-time checking (currently at 21M) and double checking (currently at 10M). So it may be years before the necessary double checks and triple checks are done to identify error-prone machines and their exponents.

The consequences are: 1) a possible delay in finding a Mersenne prime, and 2) owners of error-prone machines may not know it until years later, after the machine might already be out of service.

The next post will discuss possible ways of addressing this issue.
2003-11-08, 20:26   #2
GP2
Sep 2003
5×11×47 Posts

First of all, any solution that requires a modification to the server code must be ruled out. The server code is currently at Build 4.0.031, and from examining old historical summary.txt files, it has been there since some time between May and September 1999 (more than four years).

So we must assume the current server code is frozen and cannot be modified. With respect to writing a new server, I think our best bet is to track the BOINC platform being developed for next-generation SETI@Home. However, the time frame for this is uncertain.


Also, any solution that involves dumping routine double checks into the first-time queue should probably be ruled out, for the following reasons:

What we call the "first-time" queue and the "double-checking" queue really should be viewed more broadly. The "first-time" queue is really the queue for all exponents that have not had at least one presumed-good result returned. Thus it's legitimate to release not only genuine first-time tests, but also double checks where we believe the first test has a fair probability of being erroneous, or even triple checks where both original tests have a fair probability of being erroneous, etc.

The "double-checking" queue is the queue for all exponents that have had at least one result returned for which we have no reason to presume that the result was not good. For all such exponents, there is no urgency in getting a double check done, since there is a 98%+ probability that the result was correct (based on an error rate of 1.5-2.0% for results with a "harmless" error code, as opposed to 3.5-4.0% error rate overall).

So exponents in the double-checking queue have a very low chance of being Mersenne prime candidates, whereas those in the first-time queue have at least a legitimate shot at finding the next prime. People who joined the project for the potential glory of finding a prime might not be happy to get assigned routine double-checks merely because the first-time queue is moving at a faster pace than the double-checking queue.


With these constraints in mind, the next post will discuss some ideas.
2003-11-08, 22:33   #3
GP2
Sep 2003
2585₁₀ Posts

To summarize, there are some double checks or triple checks that we might like to do out-of-sequence, because they would help to detect error-prone machines. However, we can't legitimately assign these to the first-time queue because there is no reason to suspect that the original result was actually erroneous.


Here are some options:


1) Do nothing.

It is quite unlikely that we've missed a Mersenne prime due to an erroneous first-time result. Eventually, with a new server based on BOINC or another platform, we would have the flexibility to release exponents on a priority basis rather than the current server's behavior of just assigning the lowest exponent in the queue.


2) Ask volunteers to do out-of-sequence double checks and triple checks.

Unfortunately, this is not practical because there are way too many exponents needing to be done. The Mersenne-aries subforum has been successful with P-1 assignments, because these have a rapid turnover and the gratifying instant feedback of factors found. However, the Team_Prime_Rib orphanage shows that there is a very slow uptake for manual LL assignments, even when these can be done for PrimeNet credit. Most people just prefer the convenience of automated assignments.


3) Find some way to automatically assign out-of-sequence double checks and triple checks despite the current server's limitations.

Currently, the leading edge of double-checking is in the 10.4M range, and the server lists 2200+ exponents as available in the 10.5M range. In the normal course of events, when the leading edge started sweeping through the 10.5M range, George would release a new block of 2000+ sequential double checks in the 10.6M range, and so forth.

Rather than doing this, he could instead release a large, scattered set of "priority" exponents between 10.6M and 20M and let automated assignments sweep through and clear these, and only afterwards go back and release the full sequential block of 2000+ exponents in 10.6M, and so forth.


4) Modify Prime95 so that the "makes most sense" default is a mix of first-time checking and double-checking.

Most users don't actually explicitly request first-time checking only. Rather they just accept the default of "whatever work makes most sense". Right now, Prime95 interprets this to mean 100% first-time checking if the machine is fast enough, and otherwise 100% double checking.

So Prime95's behavior is usually the same, regardless of whether the user really insisted on only first-time tests, or just shrugged and accepted the default. However, many users are motivated more by team or personal stats than by the very small chance of personally finding a Mersenne prime, and wouldn't mind doing a few double checks if it "made sense".

So how about if the setup screen had a horizontal slider bar that let the user adjust the percentage of first-time checking? 0% would mean exclusively double-checking and 100% would mean exclusively first-time checking, and the default would be something like 80%.

In other words, every time Prime95 needed to fetch an exponent from the server, it would randomly ask for an exponent from the double-checking queue 20% of the time and from the first-time checking queue 80% of the time (in the default case).
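
A minimal sketch of that selection step, in Python rather than Prime95's own code (the function name and the queue labels are made up for illustration, and the slider value is read as the share of first-time requests, which is what the 80/20 example implies):

```python
import random


def choose_queue(first_time_pct: float = 80.0, rng=random) -> str:
    """Pick the queue to request the next exponent from.

    first_time_pct is the hypothetical slider value:
    100 means first-time tests only, 0 means double checks only.
    """
    if not 0.0 <= first_time_pct <= 100.0:
        raise ValueError("first_time_pct must be between 0 and 100")
    # rng.random() is uniform on [0, 1), so the comparison succeeds
    # in roughly first_time_pct percent of calls.
    if rng.random() * 100.0 < first_time_pct:
        return "first-time"
    return "double-check"


# Simulate 10,000 fetches with the default setting.
rng = random.Random(2003)
fetches = [choose_queue(rng=rng) for _ in range(10_000)]
print(fetches.count("double-check") / len(fetches))  # close to 0.20
```

Since the choice is made independently on every fetch, a machine that completes only a few tests could still end up with no double checks at all; the 20% only holds on average.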

The advantages would be:
- partly reduce the large gap between double-checking and first-time checking (currently 10M <-> 21M).
- shouldn't require any modification to the server.
- users can still request 100% first-time checking if they want.
- identifies error-prone machines in a much more timely manner, because nearly all machines would do at least some double checks.

If we wanted to get fancy, instead of fixing the default at 80%, the default could be intelligently set based on machine speed. The current version of Prime95 sets it to 100% if above a current threshold machine speed, and 0% if below that machine speed -- perhaps it could be adjusted more finely. Alternatively, Prime95 could check today's date and compare it with its own internal version timestamp, and if enough time had elapsed it would deduce that it was an old version running on an old machine, and do more double-checking accordingly. Or Prime95 could even ensure that the first two or three exponents assigned were guaranteed to be from the double-checking queue, as a way of quickly testing any new machine. But all those refinements shouldn't really be necessary.
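
For what it's worth, two of those refinements could be sketched like this (Python, purely illustrative: the function name, the two-year staleness cutoff and the 50% figure are invented placeholders, not anything Prime95 does):

```python
from datetime import date


def default_first_time_pct(
    completed_results: int,
    build_date: date,
    today: date,
    warmup_results: int = 2,      # hypothetical: double checks only at first
    stale_after_days: int = 730,  # hypothetical: "old version" cutoff
) -> float:
    """Suggest a default share of first-time work for one machine."""
    # A new machine starts with double checks so it gets tested quickly.
    if completed_results < warmup_results:
        return 0.0
    # An old build is probably running on an old machine: more double checks.
    if (today - build_date).days > stale_after_days:
        return 50.0
    return 80.0


print(default_first_time_pct(0, date(2003, 1, 1), date(2003, 11, 8)))  # 0.0
print(default_first_time_pct(5, date(2003, 1, 1), date(2003, 11, 8)))  # 80.0
print(default_first_time_pct(5, date(2000, 1, 1), date(2003, 11, 8)))  # 50.0
```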

The only disadvantage of a client that encourages more double-checking is that the leading edge of first-time checking would move more slowly, given the same resources. But that shouldn't be much of an issue: the leading edge of first-time testing would still move at around 80% of the current rate, not really a noticeable difference.

Also, in the early days of GIMPS it was important to move very quickly because a rival project (like David Slowinski and his Cray) could find a prime first (which in fact is exactly what happened with M1257787). But today, there is no rival project looking for Mersenne primes, and given the open-ended nature of searching for Mersenne primes, there is not the same urgency to keep pushing the leading edge of first-time testing at the absolute maximum possible speed.



So to summarize, I'd argue in favor of 4), as long as it's not too much trouble for George to modify Prime95.

Any opinions?
2003-11-09, 12:21   #4
ET_
Banned
"Luigi"
Aug 2002
Team Italia
2·2,417 Posts

We all know that the current behaviour of the server is a bit outdated with respect to GIMPS and its needs.

Point 4) looks like a workaround; I usually prefer a clean solution, even if it needs more work.

In my opinion the "prime question" should be: is the server going to be updated? If so, then let's wait for it; if not, I'd agree with your point 4).

Luigi
2003-11-09, 12:23   #5
only_human
"Gang aft agley"
Sep 2002
2×1,877 Posts

I am very much in favor of option 4 (and think option 3 works best synergistically when option 4 is already in place).

Quote:
The advantages would be:
- partly reduce the large gap between double-checking and first-time checking (currently 10M <-> 21M).
- shouldn't require any modification to the server.
- users can still request 100% first-time checking if they want.
- identifies error-prone machines in a much more timely manner, because nearly all machines would do at least some double checks.

I would explicitly add some advantages that are implicit in the above list:
- helps advance all GIMPS milestones.
- improves perception and participation in quality of results.
- helps retain users in the project because of all of the above.

Here is a snapshot of the other milestones in process:
  • All exponents below 10,412,700 have been tested at least once.
  • Countdown to testing all exponents below M(13466917) once: 63
  • Countdown to proving M(13466917) is the 39th Mersenne Prime: 76,018
2003-11-10, 15:11   #6
only_human
"Gang aft agley"
Sep 2002
111010101010₂ Posts

I'd like to retract the possible advantages I mentioned in the above message. Looking at them fresh today, they look speculative and off the topic of the thread.

That said, I am still in favor of option 4 and believe it would be useful even if the server changed or improved.
2003-11-11, 02:15   #7
Complex33
Aug 2002
Texas
5×31 Posts

I'm all for anything that shrinks the gap between first time tests and DC's. I try to keep a balance myself and think an option in the program would be great. Just my thoughts.
2003-11-11, 11:02   #8
lycorn
"GIMFS"
Sep 2002
Oeiras, Portugal
17×89 Posts

As an alternative, in case option 4 is too much trouble for George to implement right now (I reckon he's busy working on the Athlon64 optimizations...), I think a reasonable approach would be to assign DCs to every new machine joining the project (maybe the first two assignments). This would serve the purpose of detecting error-prone machines earlier, and would also give people a much faster first result, which I think is more motivating than starting the project by waiting a long time for a first-time LL. As many newcomers in fact accept the default type of work, they wouldn't mind starting with a DC. This should be explained in Prime95's setup screen (and/or somewhere in the help file). I'm sure that the shorter time waiting for the first result would help reduce the percentage of first-time assignments never completed, which I'm afraid has been rather high (GP2, any figures on this?...). After two completed DC assignments, the machine would qualify for first-time work, if so desired by the user, and if the machine meets Prime95's criterion for first-time LL testing.

Last fiddled with by lycorn on 2003-11-11 at 11:04
2003-11-12, 18:57   #9
GP2
Sep 2003
5·11·47 Posts

If we want to try case 3), I can supply about 2500 "interesting" exponents between 10.6M and 12M that can be released to the double-checking queue:

About 800 triple checks.

About 400 exponents from "only slightly error prone" machines satisfying bad / (bad+good) >= 0.10

About 1300 exponents from machines where bad <= 1 and good <= 2 and uv2 >= 10, with no more than 2 exponents per machine (in other words, machines where we don't have good stats on whether they're error-prone or not, and with at least 10 exponents that would be released for a 2nd check if it turns out the machine is indeed error-prone).


A normal, routine release of double-checking exponents would release about 2200 exponents in the range 10.6M-10.7M, so this would be roughly the same number of exponents, but a more "interesting" selection over a broader range (10.6M - 12M).
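
The two machine-based criteria above can be sketched as follows (Python; the class, function names and sample numbers are hypothetical, `uv2` is read as the count of a machine's results still awaiting a second check, and since the criteria overlap, checking the error-rate rule first is my assumption; the triple-check group is not modeled):

```python
from dataclasses import dataclass


@dataclass
class MachineStats:
    machine_id: str
    bad: int   # verified-bad results
    good: int  # verified-good results
    uv2: int   # assumed meaning: results still awaiting a second check


def category(m: MachineStats):
    """Classify a machine by the two criteria; None if neither applies."""
    verified = m.bad + m.good
    if verified > 0 and m.bad / verified >= 0.10:
        return "slightly-error-prone"
    if m.bad <= 1 and m.good <= 2 and m.uv2 >= 10:
        return "insufficient-stats"
    return None


def pick_exponents(m: MachineStats, unverified, per_machine: int = 2):
    """For machines with too few stats, pick at most per_machine exponents.

    Taking the lowest exponents first is an arbitrary choice for this sketch.
    """
    if category(m) != "insufficient-stats":
        return []
    return sorted(unverified)[:per_machine]


# Made-up machines and exponents, not real GIMPS data.
a = MachineStats("a", bad=2, good=10, uv2=5)
b = MachineStats("b", bad=0, good=1, uv2=12)
print(category(a))  # slightly-error-prone
print(pick_exponents(b, [11000003, 10700009, 11500001]))  # [10700009, 11000003]
```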
2003-11-13, 00:13   #10
GP2
Sep 2003
5·11·47 Posts

Quote:
Originally posted by lycorn
I'm sure that the shorter time waiting for the first result would help reduce the percentage of first-time assignments never completed, which I'm afraid has been rather high (GP2, any figures on this?...).

I haven't tried calculating this yet. Once I gather a complete set of old status.txt and cleared.txt files, I'll probably try to calculate it. It's complicated a bit by the fact that people can change their user names in mid-test.
2003-11-13, 02:16   #11
outlnder
Aug 2002
2×3×53 Posts

Does the changing of computer ID cause you much of a problem?
E.g., outlnder02 to outlnder52, or similar.
All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.