2003-09-11, 22:02 #1 GP2     Sep 2003 13×199 Posts
New "Data" forum might be needed

The Data forum would be used to discuss the actual numerical results of the GIMPS project, specifically the contents of the files BAD, LUCAS_V.TXT, HRF3.TXT, FACTORS.CMP, NOFACTOR.CMP, PMINUS1.TXT (generally updated weekly), or the status.txt and cleared.txt files (assigned exponents and cleared exponents, updated hourly).

This would be used to point out oddities, anomalies, or interesting facts about the data, not from a math-theoretical point of view, but purely from the point of view of data consistency.

Some clarification:

The existing Server (or Software) forums are used to discuss:
- The server is down
- What does Error 678 mean?
- How do you extend the check-in deadline?
- How do you check in manual tests?
- Why is Entropia's redirector not working?
- When will there be another database sync?
- How do I change my user name or team?
- When will the server be rewritten?
- How can we prevent poaching?
- Etc. etc. etc.

The existing Math forum is used to discuss:
- I have an idea for an algorithm
- I have a proof of the Riemann hypothesis but this margin is too small
- Other theoretical math topics

The Data forum would simply discuss the data that has been collected by the GIMPS project, not in the sense of math conjectures or math theory, but in the sense of verifying its accuracy or consistency, identifying exponents that might need an extra double-check, trying to find suspicious or bad data, or simply trivia about the data (which exponents were double-checked the most times, etc).

Topics such as:
- Hey, how come exponent 6522911 was tested 305 separate times and 8893783 was tested 244 separate times?
- Are we really sure there are no false positive factors in the factor data? [Answer: yes, we're sure]
- Why do there seem to be some non-matching residues in LUCAS_V.TXT, which should contain only verified-good results? Example:
  Code:
  6759449,simenh,arne,WS1,B814D809ABC9E3DA,,
  6759449,TempleU-DI,C031EBA9B,WW5,C1FE092BC7D5FDD4,2426613,00000000
  6759449,jshowalter,showalter2a,WW5,C1FE092BC7D5FDD4,552567,00000000
- What's the best estimate of the error rate, and how can we calculate it? [Answer: I believe it's just under 4%]
- Why do some lines in cleared.txt contain a residue of only "0x" followed by blanks? (this might be a server issue, but could also be discussed here)
- Are there any exponents for which all the LL tests (original and double-check(s)) were performed only by the same user, and should they be triple-checked by an independent user? [Answer: Brian Beesley seems to be doing it]
- Exponent 7021433 required 6 tests (2 good results and 4 bad results) before obtaining two matching results, and 27 other exponents required 5 tests (2 good results and 3 bad results). Is this within statistical expectations? [Answer: apparently yes, if the error rate is indeed 3.5-4.0 %]
- Scripts and bits of code could also be posted that can be run against the data files (for instance, a script for calculating the data error rate, for verifying that there are no false-positive results in the factors database, etc).

And so forth.

Once again, anything of a theoretical math nature would belong in the Math forum, whereas server operation and errors, rewriting the server, and so forth would belong in the Server forum. The sample topics mentioned above don't really fit in either of those two forums.
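As an illustration of the kind of script such a forum might host, here is a minimal sketch that covers two of the sample topics: flagging exponents with non-matching residues, and checking whether "6 tests before two matching results" is plausible at a given error rate. Several things here are assumptions and not taken from any GIMPS documentation: the field layout (exponent in the first column, residue in the fifth) is inferred only from the three sample lines quoted above, the function names are made up for this example, and the probability model assumes each test fails independently and that two bad residues never match each other.

```python
from collections import defaultdict

# The three lines quoted in the post above, used as sample input.
SAMPLE = """\
6759449,simenh,arne,WS1,B814D809ABC9E3DA,,
6759449,TempleU-DI,C031EBA9B,WW5,C1FE092BC7D5FDD4,2426613,00000000
6759449,jshowalter,showalter2a,WW5,C1FE092BC7D5FDD4,552567,00000000
"""


def residues_by_exponent(lines):
    """Group residues by exponent.

    Assumed layout (inferred from the sample only):
    field 0 = exponent, field 4 = 64-bit residue in hex.
    Lines that do not fit that shape are skipped.
    """
    groups = defaultdict(list)
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        groups[int(fields[0])].append(fields[4].upper())
    return groups


def mismatched(groups):
    """Return only the exponents whose listed residues are not all equal."""
    return {p: r for p, r in groups.items() if len(set(r)) > 1}


def prob_needs_n_tests(n, error_rate):
    """Probability that the second good result arrives on test number n.

    Assumes independent tests and that bad residues never coincide, so
    the first n-1 tests hold exactly one good result and test n is good:
    (n - 1) * (1 - e)^2 * e^(n - 2).
    """
    if n < 2:
        return 0.0
    good = 1.0 - error_rate
    return (n - 1) * good ** 2 * error_rate ** (n - 2)


if __name__ == "__main__":
    print(mismatched(residues_by_exponent(SAMPLE.splitlines())))
    for n in range(2, 7):
        print(n, prob_needs_n_tests(n, 0.04))
```

On the three sample lines, `mismatched` flags 6759449 because the first residue differs from the other two. At an error rate of 4%, `prob_needs_n_tests(6, 0.04)` comes out to roughly 1.2e-5 per exponent; multiplying by the number of double-checked exponents gives the expected count under these simplified assumptions.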
2003-09-12, 03:43 #2 Xyzzy     "Mike" Aug 2002 17540₈ Posts
Get a few people to chime in for it and we'll make it happen... I'm assuming you want to volunteer to moderate it? :)
2003-09-12, 04:47 #3 GP2     Sep 2003 13×199 Posts
It should be relatively low volume, so moderation shouldn't be a problem. Perhaps I could do it. Any "chimers" or contrary opinions?
2003-09-12, 14:36 #4 roy1942   Aug 2002 47 Posts
chime (where's a bell emoticon when you need it!)
2003-09-12, 17:13 #5 garo     Aug 2002 Termonfeckin, IE 3²×307 Posts
chime
2003-09-12, 18:57 #6 Prime95 P90 years forever!     Aug 2002 Yeehaw, FL 2²·3·617 Posts
You can populate it with your recent interesting threads and some of the older ones regarding error rates.
2003-09-12, 20:32 #7 richs     "Rich" Aug 2002 Benicia, California 2²×307 Posts
This area interests me so I like the idea.
2003-09-12, 21:14 #8 mephisto     Feb 2003 Norway 2³×7 Posts
It's been missing. Chime.
2003-09-13, 02:34 #9 Xyzzy     "Mike" Aug 2002 1111101100000₂ Posts
Look for the new forum sometime over the next two days...

