mersenneforum.org (https://www.mersenneforum.org/index.php)
-   PrimeNet (https://www.mersenneforum.org/forumdisplay.php?f=11)
-   -   OFFICIAL "SERVER PROBLEMS" THREAD (https://www.mersenneforum.org/showthread.php?t=5758)

Uncwilly 2021-09-17 13:46

Found an oops in the system.
[M]102436709[/M], [M]102437149[/M], & [M]102982241[/M] all have mismatched LL tests. Jan S. got a PRP assignment on each and turned in a P-1 test before the PRP was run. That dropped the assignment on PrimeNet. If a person is assigned a PRP or LL test, shouldn't that assignment be retained if [U][I]they[/I][/U] turn in a NF TF or P-1?

Viliam Furik 2021-09-17 13:48

[QUOTE=Uncwilly;588032]Found an oops in the system.
[M]102436709[/M], [M]102437149[/M], & [M]102982241[/M] all have mismatched LL tests. Jan S. got a PRP assignment on each and turned in a P-1 test before the PRP was run. That dropped the assignment on PrimeNet. If a person is assigned a PRP or LL test, shouldn't that assignment be retained if [U][I]they[/I][/U] turn in a NF TF or P-1?[/QUOTE]

If the result is returned with mismatching AID, the server is confused and nullifies the assignment. It shouldn't IMO.

If the result is returned without AID, it shouldn't do this at all.

kriesel 2021-09-17 15:35

[QUOTE=Uncwilly;588032]Found an oops in the system.
[M]102436709[/M], [M]102437149[/M], & [M]102982241[/M] all have mismatched LL tests. Jan S. got a PRP assignment on each and turned in a P-1 test before the PRP was run. That dropped the assignment on PrimeNet. If a person is assigned a PRP or LL test, shouldn't that assignment be retained if [U][I]they[/I][/U] turn in a NF TF or P-1?[/QUOTE]I've seen it where holding a P-1 assignment for an exponent, and turning in a P-1 result on it, results in the P-1 assignment getting marked expired, even though it accepts the result also. (An issue reported previously.)

Uncwilly 2021-09-17 19:17

Sure, but that is the same work type. Only a Factor Found should cancel a PRP or LL assignment.

kriesel 2021-09-17 22:04

[QUOTE=Uncwilly;588060]Sure, but that is the same work type. Only a Factor Found should cancel a PRP or LL assignment.[/QUOTE]The common ground is processing a P-1 no-factor result, causes a valid assignment held by the submitter to get marked expired when it should not. In the one subcase the P-1 assignment should be considered completed not expired; in the other the primality test should persist as a valid and pending assignment.
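The behaviour being asked for in the posts above can be written down as a small decision function. This is only a sketch of the requested rule, not PrimeNet's actual server logic; `WorkType`, `Status`, `Assignment`, and `assignment_after_result` are names made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class WorkType(Enum):
    TF = "TF"
    PM1 = "P-1"
    PRP = "PRP"
    LL = "LL"


class Status(Enum):
    ACTIVE = "active"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


@dataclass
class Assignment:
    exponent: int
    work_type: WorkType


def assignment_after_result(assignment, result_type, factor_found):
    """Status the submitter's own assignment should end up in.

    Hypothetical rule set taken from the discussion, not from server code.
    """
    if result_type == assignment.work_type:
        # Same work type: the assignment was carried out, so it is completed.
        return Status.COMPLETED
    if factor_found:
        # Only a found factor should cancel a pending PRP or LL assignment.
        return Status.CANCELLED
    # A no-factor TF or P-1 result leaves the primality test pending.
    return Status.ACTIVE
```

Note that no branch returns anything like "expired": under this rule a no-factor P-1 result never expires an assignment held by the same user.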

kriesel 2021-09-20 20:52

A detailed example is at [URL]https://mersenneforum.org/showpost.php?p=588256&postcount=89[/URL] from a February occurrence I just discovered today, of P-1 no factor found reporting causing the PRP assignment to disappear.
If not addressed, this may create trouble more frequently, after prime95 or mprime begin to support the same low cost stage 1 P-1 by using PRP generated powers of 3. Or Mlucas.

kriesel 2021-09-20 21:02

Moebius reports a [URL="https://mersenneforum.org/showpost.php?p=588258&postcount=91"]case[/URL] where two PRPs, two proof generations, two certs were done on same day same exponent different users.

kriesel 2021-09-20 21:22

[QUOTE=kriesel;588259]A detailed example is at [URL]https://mersenneforum.org/showpost.php?p=588256&postcount=89[/URL] from a February occurrence[/QUOTE]That one is relating to manual assignment and reporting re gpuowl V7.2-21. I think it likely the issue is more widespread.
Gpuowl v6.11-380 and others split an assignment
PRP=<AID>,blah,blah,2
into
PFactor=<AID>...
and
PRP=<AID>...
Same AID, different work, different results, that will get reported at different times/dates by the same user or the primenet.py script.
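As a rough illustration of the split described above, here is a sketch that treats everything between the AID and the last field as opaque text. It assumes the last field is the tests-saved count and that the line carries no quoted factor list; `split_prp_line` is a hypothetical helper, not Gpuowl code.

```python
def split_prp_line(line):
    """Split 'PRP=<AID>,...,<tests_saved>' into a PFactor line and a PRP line.

    Both output lines keep the same AID. The middle fields are passed through
    untouched because this sketch does not try to interpret them.
    """
    work_type, _, rest = line.partition("=")
    if work_type != "PRP":
        raise ValueError("expected a PRP= line")
    fields = rest.split(",")
    if int(fields[-1]) == 0:
        # Nothing to split: no P-1 is requested ahead of the PRP.
        return [line]
    pfactor = "PFactor=" + ",".join(fields)
    prp = "PRP=" + ",".join(fields[:-1] + ["0"])
    return [pfactor, prp]
```

For example, `split_prp_line("PRP=AID,blah,blah,2")` gives a `PFactor=AID,blah,blah,2` line followed by a `PRP=AID,blah,blah,0` line, so two different results end up being reported under one AID.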

chalsall 2021-09-20 21:38

[QUOTE=kriesel;588262]Same AID, different work, different results, that will get reported at different times/dates by the same user or the primenet.py script.[/QUOTE]

Non-conformant to the API specs.

James Heinrich 2021-09-20 21:50

[QUOTE=chalsall;588263]Non-conformant to the API specs.[/QUOTE]I'm not sure that it is -- Prime95 does the same thing. Picking a random example [m]106223153[/m], the work was assigned as PRP, a NF-PM1 was reported but the PRP assignment is still active.

chalsall 2021-09-20 21:59

[QUOTE=James Heinrich;588264]I'm not sure that it is -- Prime95 does the same thing. Picking a random example [m]106223153[/m], the work was assigned as PRP, a NF-PM1 was reported but the PRP assignment is still active.[/QUOTE]

OK... I was thinking about splitting the AID into different work types to run in parallel, and then not having IPC between the workers to ensure the first to report doesn't set the DONE flag.

Prime95 / mprime will always do the P-1'ing work first in the case you've described. And, clearly, it understands the API.

kriesel 2021-09-20 23:50

[QUOTE=kriesel;588262]into
PFactor=<AID>...[B],2[/B]
and
PRP=<AID>...[B],0[/B]
Same AID, different work, different results, that will get reported at different times/dates by the same user or the primenet.py script.[/QUOTE]And those go sequentially at the end of the same worktodo.txt for a single Gpuowl instance, so get done sequentially. Given that manually assigned 106M wavefront PRP take ~27 hours now on my power-reduced Radeon VIIs, and default periodic reporting is daily, the P-1 result will report ~1 day before the PRP it precedes, or occasionally ~2 days (when the P-1 just makes it before a daily reporting time, and the next day the PRP just misses).
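The "~1 day, occasionally ~2 days" gap can be checked with a little arithmetic. The sketch below assumes results go out at the first reporting time at or after completion, that reporting happens exactly every 24 hours, and that the PRP starts the moment the P-1 finishes; `report_gap_days` is a made-up name.

```python
import math


def report_gap_days(pm1_done_hour, prp_hours=27.0, report_every=24.0):
    """Reporting cycles between the P-1 report and the PRP report.

    pm1_done_hour is the number of hours after a reporting time at which the
    P-1 finishes (0 < pm1_done_hour <= report_every).
    """
    pm1_report = math.ceil(pm1_done_hour / report_every)
    prp_report = math.ceil((pm1_done_hour + prp_hours) / report_every)
    return prp_report - pm1_report


# How many of the 24 whole-hour finishing times give a two-day gap.
two_day_cases = sum(report_gap_days(h) == 2 for h in range(1, 25))
```

With a 27-hour PRP, only a P-1 that finishes in the last three hours before a reporting time produces the two-day gap, which matches "occasionally".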

chalsall 2021-09-20 23:59

[QUOTE=kriesel;588268]And those go sequentially at the end of the same worktodo.txt for a single Gpuowl instance, so get done sequentially.[/QUOTE]

OK. We're just trying to figure out what isn't working in the various workflows. As has been reported here.

Are the "humans getting into the loop when they shouldn't" the problem? Manually submitting results, for example.

Few appreciate just how tricky software is. Putting humans into the equation just adds a few extra dimensions of uncertainty (read: "fun"). :wink:

kriesel 2021-09-21 00:06

5 Attachment(s)
Attempted a couple manually assigned test wavefront PRPs which both needed P-1 first.
V6.11-380 gpuowl manually. With lots of notes and screen captures along the way.

On the [URL="https://www.mersenne.org/report_exponent/?exp_lo=106303147&exp_hi=&full=1#"]first one[/URL], I did some progress updating using CURL which sort of converts an assignment from manual.
Used curl to report 99% s2 progress.
Then manually reported the completed P-1 NF ~5 minutes later.
A check of the exponent status showed a P-1 result report in the history, and a 99% complete S2, a contradiction.
Then used curl to report its brief PRP progress to correct the status.

[URL="https://www.mersenne.org/report_exponent/?exp_lo=106304603&exp_hi=&full=1"]Second one[/URL], no curl progress reporting ever, completed and reported the P-1 NF for the PRP assignment.
The PRP assignment remained.
Assignment status shows it as PRP, no stage, no %. It would seem reasonable to assume it at stage PRP 0% after getting P-1 NF. And reasonable to take the stance the server should assume nothing.

So, was unable to reproduce the PRP-assignment-disappearance, but found something new, a contradictory status creation method I guess. Server seems not prepared for a mix of manual and primenet activity on the same assignment. Not surprising really. I would probably not have gone looking for that kind of trouble either, while coding or debugging server scripts.

And maybe that failure to reproduce the issue is because [URL="https://mersenneforum.org/showpost.php?p=588267&postcount=93"]George already attempted a fix[/URL]. (Dueling threads, for more fun!)

chalsall 2021-09-21 01:39

[QUOTE=kriesel;588271](Dueling threads, for more fun!)[/QUOTE]

Please forgive me for this. But some call it Agile Development...

Jan S 2021-09-21 18:05

Sorry, I didn't notice this discussion.

If I remember correctly, after I uploaded the P-1 results, the server wrote: "Original assigment not deleted".

I tried to register these exponents again ([URL]https://www.mersenneforum.org/showpost.php?p=587795&postcount=598[/URL]), but I was unsuccessful (error text: No assignment available meeting CPU, program code and work preference requirements...).

I'm still working on them ([URL="https://www.mersenne.org/report_exponent/?exp_lo=102436709&full=1"]102436709[/URL] - 56.03; [URL="https://www.mersenne.org/report_exponent/?exp_lo=104000179&full=1"]104000179[/URL] - 28.84).

Uncwilly 2021-09-21 18:33

[QUOTE=Jan S;588344]I'm still working on them([URL="https://www.mersenne.org/report_exponent/?exp_lo=102436709&full=1"]102436709[/URL] - 56.03; [URL="https://www.mersenne.org/report_exponent/?exp_lo=104000179&full=1"]104000179[/URL] - 28.84).[/QUOTE]I noticed the ones that you turned in P-1 from the Strategic thread. I have kept them off the list (figuring that you are working on them.)

LaurV 2021-10-04 12:09

Some time ago I was assigned M3427211 for a PRPCF, and I found that it was close to expiring and had not dropped off the assignment list, although it was not in the worktodo. Checking the log files to see whether it was really assigned, whether the work was done, and why it was not reported, I found that the assignment was indeed taken but the work was never done: it ended with a "does not divide" error every time I added the assignment to worktodo. I first suspected that one of the factors was wrong, but looking more carefully at the line, the last factor is duplicated. It seems that we found a "non square free mersenne", solving a 300 years old dilemma... :razz:

[CODE]PRP=3F18_KEY_KEY_KEY_KEY_AE98,1,2,3427211,-1,99,2,3,1,"6854423,5867740296406049161,168285690558111904601,168285690558111904601"
[/CODE]I fixed the assignment line, by deleting the last factor, and now the test is ongoing. The question still remains why this happened (it is clear that the last factor was reported two times, but this should not influence the assignment) and how many exponents are still in this situation (work not done, skipped because assignment line is not generated correctly).
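A sketch of the check and the manual fix described above. The test that the product of the listed factors divides 2^p - 1 is standard modular arithmetic; whether the client performs its "does not divide" check exactly this way is an assumption, and both function names are hypothetical.

```python
def factors_divide_mersenne(p, factors):
    """True if the product of `factors` divides 2**p - 1."""
    product = 1
    for f in factors:
        product *= f
    # product divides 2**p - 1 exactly when 2**p leaves remainder 1.
    return product > 1 and pow(2, p, product) == 1


def drop_repeated_factors(line):
    """Remove repeated entries from the quoted factor list of a worktodo line."""
    head, quote, tail = line.partition('"')
    if not quote:
        return line  # no quoted factor list present
    factors = tail.rstrip().rstrip('"').split(",")
    unique = list(dict.fromkeys(factors))  # keeps first occurrences, in order
    return f'{head}"{",".join(unique)}"'
```

A small case shows the failure mode: M11 = 2047 = 23 × 89, so `[23, 89]` passes the check while `[23, 89, 89]` fails it, just as a doubled last factor would. Blindly removing repeats would be wrong if a factor really did divide twice, so the divisibility check is what should decide.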

ric 2021-10-04 12:52

[QUOTE=LaurV;589375]The question still remains why this happened[/QUOTE]

It happened before, very occasionally, and had already been reported (ATH, myself, maybe someone else). IIRC, a transient glitch, maybe not worth further investigation (not my words, GW's).

LaurV 2021-10-04 15:12

Well, the real issue is that the exponent continues to be assigned again and again (ending in error), after the job is done, because the PRPCF for the Mx/3_factors is not the same as the PRPCF of the Mx/4_factors (including the duplicate), so in its mind (server's, that is), the work is not done. So, this MUST be investigated, as it bothers the clients and slows them down. For now, the job is done, so I removed it manually from the worktodo file, but I will keep it (re-re-)assigned on the server, so it won't be assigned to somebody else.

Viliam Furik 2021-10-09 13:46

I was looking at the P+1 successful efforts list on mersenne.ca, when I noticed exponent 40927 seems to have a factor found by P+1, but in November 2018... AFAIK, P+1 was not possible back then. I suspect some mishandling of server records.

[URL="https://www.mersenne.org/report_exponent/?exp_lo=40927&full=1"]link to the exponent page[/URL]

James Heinrich 2021-10-09 14:25

[QUOTE=Viliam Furik;590011]I was looking at the P+1 successful efforts list on mersenne.ca, when I noticed exponent 40927 seems to have a factor found by P+1, but in November 2018... AFAIK, P+1 was not possible back then. I suspect some mishandling of server records.[/QUOTE]As you can see on the [url=https://www.mersenne.ca/pplus1.php]P+1 factors page[/url] the general finding of P+1 factors starts shortly after the release of Prime95 with P+1 capabilities in April 2021. However, once P+1 became a result type on PrimeNet user [url=https://www.mersenne.ca/userfactors/pp1/63514/bits]YarBer[/url] emailed George and myself indicating that his factors for [m]M40927[/m] and [m]M193873[/m] were found with GMP-ECM using P+1. Those factors had been recorded by PrimeNet but were assumed at the time to be ECM. George and I manually updated the mersenne.org and mersenne.ca databases respectively to reflect the P+1 discovery method.

kriesel 2021-10-10 21:36

Cert estimated completion mismatch
 
1 Attachment(s)
The prime95 client shows me Oct 14 (~3.5 days) one way, and ~30 days another, for completing the Cert on 843112609. After using Advanced, Manual communication, checking the box to send new expected completion dates to the server, clicking OK, and then refreshing [URL]https://www.mersenne.org/workload/[/URL] in the web browser, the server still shows [B]1[/B] day estimated completion. It's always 0 or 1. Seven percent completed in ~2 days is consistent with the worker window's ~30 days to complete, but the server is still indicating 1.
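The consistency claim is a straight linear extrapolation, sketched here under the assumption that progress continues at the same rate; `estimate_days` is an illustrative helper, not anything the client or server runs.

```python
def estimate_days(elapsed_days, fraction_done):
    """Linear extrapolation: return (total_days, remaining_days)."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total = elapsed_days / fraction_done
    return total, total - elapsed_days


total_days, remaining_days = estimate_days(2.0, 0.07)
```

Seven percent in two days extrapolates to roughly 28.6 days in total, which is in line with the ~30 day figure and nowhere near 1 day.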

slandrum 2021-10-26 03:05

Server accounting is off
 
The following line from the exponent status distribution:
[CODE]105000000 54071 | 35011 14284 4769 7 | 8 3 | 221 4542 |[/CODE]
Shows 7 exponents left to clear (for FTC) in this range, but 8 being worked on, and if you examine all 8, they do all need to be cleared.
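The row can be checked mechanically. The column meanings used below are assumptions inferred from this post: the second group is taken to be status counts ending in the untested count, and the first number of the third group is taken to be first-time tests currently assigned. `parse_status_row` is a hypothetical helper.

```python
def parse_status_row(row):
    """Split one row of the distribution table into its '|'-separated groups."""
    groups = [[int(tok) for tok in part.split()]
              for part in row.split("|") if part.strip()]
    range_start, total = groups[0]
    status_counts = groups[1]    # assumed: last entry is the untested count
    assigned_counts = groups[2]  # assumed: first entry is first-time tests assigned
    return {
        "range_start": range_start,
        "total": total,
        "status_sum": sum(status_counts),
        "untested": status_counts[-1],
        "first_time_assigned": assigned_counts[0],
    }


row = "105000000 54071 | 35011 14284 4769 7 | 8 3 | 221 4542 |"
info = parse_status_row(row)
totals_consistent = info["status_sum"] == info["total"]
over_assigned = info["first_time_assigned"] > info["untested"]
```

On this row the four status counts do sum to the 54071 total, so the total is fine, but 8 assigned against 7 untested is the mismatch being reported.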

techn1ciaN 2021-11-04 00:55

If you check out exponents for PRP-CF on the manual assignment page (or re-load any of your automatically fetched PRP-CF work from the replacement lines in [URL]https://www.mersenne.org/workload/[/URL]), the work lines given look like:

[CODE]PRP=[AID],1,2,[exponent],-1,99,2,3,1,"[known factor(s)]"[/CODE]If loaded without editing, this sets PRP residue type 1, which disables Gerbicz error checking and proof generation in the case of a PRP-CF test. These tools can be used for PRP-CF if residue type 5 is selected instead, so lines for those assignments should look like:

[CODE]PRP=[AID],1,2,[exponent],-1,99,2,3,5,"[known factor(s)]"[/CODE](Incidentally, these lines also set 2 primality tests saved in the case of a found P-1 factor. This doesn't make any sense for CF work since we want to see if the current cofactor is prime before we put more work into factoring, by definition. This shouldn't mean anything practical with what the completed TF bit level is set to; just something else I noticed.)

James Heinrich 2021-11-04 01:08

[QUOTE=techn1ciaN;592377]If loaded without editing, this sets PRP residue type 1, which disables Gerbicz error checking and proof generation in the case of a PRP-CF test. These tools can be used for PRP-CF if residue type 5 is selected instead[/QUOTE]I think this is something George will need to look into, as best I can tell it's set to 1 in the table of available assignments.

techn1ciaN 2021-11-05 22:17

[QUOTE=techn1ciaN;592377]If you check out exponents for PRP-CF on the manual assignment page (or re-load any of your automatically fetched PRP-CF work from the replacement lines in [URL]https://www.mersenne.org/workload/[/URL]), the work lines given look like ...[/QUOTE]

I have apparently misstated the issue. What I said initially is only true for the replacement work lines in your Assignments. The lines given by the manual assignment page actually look like:

[CODE]PRP=[AID],1,2,[exponent],-1,99,0,"[known factor(s)]"[/CODE]If you load this, Prime95 sets residue type 5 automatically. So, there is no problem (not even with the P-1 tests_saved value) unless you are specifically trying to use the replacement work lines tool.

James Heinrich 2021-11-05 22:55

[QUOTE=techn1ciaN;592575]only true for the replacement work lines in your Assignments[/QUOTE]OK, that I can fix. I can either hardcode them to show [c]base,type[/c] as [c]3,5[/c], or just leave those two fields out entirely and let Prime95 act on its default behavior. I have opted for the latter (do not include these fields), same as manual assignment.
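For anyone who still has old replacement lines saved, the same change can be applied by hand. This sketch assumes the field order is AID, k, b, n, c, how_far_factored, tests_saved, base, residue type, then the quoted factor list, as the example lines in this thread suggest; `drop_base_and_type` is a made-up helper, not the code behind the web page.

```python
def drop_base_and_type(line):
    """Remove the base and residue-type fields from a PRP-CF worktodo line.

    Lines that do not have exactly nine fields before the quoted factor list
    are returned unchanged.
    """
    head, quote, factors = line.partition('"')
    fields = head.rstrip(",").split(",")
    if len(fields) == 9:
        # Keep everything up to and including tests_saved.
        fields = fields[:7]
    rebuilt = ",".join(fields)
    return f"{rebuilt},{quote}{factors}" if quote else rebuilt
```

Leaving the two fields out, rather than rewriting the type to 5, lets Prime95 fall back on its own default as described above.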

S485122 2021-11-10 09:24

www.mersenne.org down ?
 
It worked until about 09:10 UTC. The server responds to pings and FTP but not to HTTP(S) requests.

S485122 2021-11-10 10:05

[QUOTE=S485122;592846]It worked until about 09:10 UTC. The server responds to pings and FTP but not to HTTP(S) requests.[/QUOTE]OK again.
Thanks to whoever restored the service.

techn1ciaN 2021-11-10 17:29

[QUOTE=James Heinrich;592579]OK, that I can fix. I can either hardcode them to show [c]base,type[/c] as [c]3,5[/c], or just leave those two fields out entirely and let Prime95 act on its default behavior. I have opted for the latter (do not include these fields), same as manual assignment.[/QUOTE]

Just tested by checking out a PRP-CF assignment and loading it from my replacement work lines. Starts as intended now — thanks. :thumbs-up:

Might you also change [c]tests_saved[/c] in these lines to [c]0[/c], for full consistency with what the manual assignment page outputs?

James Heinrich 2021-11-11 00:41

[QUOTE=techn1ciaN;592861]Might you also change [c]tests_saved[/c] in these lines to [c]0[/c], for full consistency with what the manual assignment page outputs?[/QUOTE]As best I can see what the page is trying to do, it shows [c]0[/c] if a decent amount of P-1 has already been done; if it hasn't then it shows [c]1[/c] for PRP-DC and [c]2[/c] for PRP. Whether this makes sense or not I don't know, I didn't write the logic and I'm not that knowledgeable on PRP assignments.

slandrum 2021-11-11 05:54

[QUOTE=James Heinrich;592894]As best I can see what the page is trying to do, it shows [c]0[/c] if a decent amount of P-1 has already been done; if it hasn't then it shows [c]1[/c] for PRP-DC and [c]2[/c] for PRP. Whether this makes sense or not I don't know, I didn't write the logic and I'm not that knowledgeable on PRP assignments.[/QUOTE]

I think what was being asked for - if the exponent has already been proven composite, then tests saved cannot be more than 0.

techn1ciaN 2021-11-11 11:13

[QUOTE=slandrum;592909]I think what was being asked for - if the exponent has already been proven composite, then tests saved cannot be more than 0.[/QUOTE]

This is essentially the idea. If someone is checking out an exponent for PRP-CF, then a full-length primality test will be run, by definition.

Of course, with [c]how_far_factored=99[/c], the issue should be purely cosmetic in all practical cases.

S485122 2021-11-11 11:37

Exponent Status Distribution errors
 
The [url=https://www.mersenne.org/primenet/]Exponent Status Distribution[/url] (the menu Item "Current Progress / Work Distribution Map") has some wrong totals, for instance in the PrimeNet Activity Summary dated 2021-11-11 10:00 UTC.

The 10M range has a spurious exponent counted as having only one erroneous test. That is wrong since all exponents of that range have long been verified or factored.

There is some logic error in the counting; I will illustrate it with the 105M-106M range.
- The number of untested exponents is 4, it is indeed the number of assigned first time tests. But the NO-LL count in the table is 3, one too low!
- The number of factored Mersennes is correct in the table.
- The table has the correct total number of exponents (sum of the number of exponents for which the corresponding Mersenne number is prime, factored, verified composite, tested once, got an erroneous result and untested.)
- I counted 4775 exponents with unverified test(s) (some with a mix of LL and PRP without Cert). The table has only 4769, 6 are missing.
- I counted 14281 verified exponents (LL or PRP double checked or certified PRP.) The table counts 14288, 7 too many.
The differences do add up to 0 in this range.
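The bookkeeping in the list above can be laid out explicitly. The numbers are the ones quoted in this post; the dictionary keys are just labels chosen for the sketch.

```python
# Counts shown in the table versus counts obtained by examining the exponents.
table = {"untested": 3, "unverified": 4769, "verified": 14288}
counted = {"untested": 4, "unverified": 4775, "verified": 14281}

# Positive means the table shows too many, negative means too few.
diff = {key: table[key] - counted[key] for key in table}
net = sum(diff.values())
```

The three differences cancel, so exponents appear to be filed under the wrong status rather than lost from the range total.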

Other ranges have too few assigned and available exponents : 3M is missing 1, 23M : 1, 30M : 1, 59M : 3, 60M : 2, 61M : 16, 62M : 1, 63M : 1, 64M : 1, 104M : 1, 106M : 13, 107M : 33, 108M : 30, 109M : 47, 110M : 27, 111M : 11, 112M : 13, 113M : 12, 114M : 7, 115M : 1, 122M : 1, 124M : 1, 126M : 29, 149M : 1, 150M : 1, 160M : 3, 164M : 1, 165M : 2, 166M : 5, 172M : 1, 177M : 1, 184M : 2, 185M : 2, 188M : 1, 190M : 1, 332M : 2, 333M : 1, 371M : 1, 385M : 1, 623M : 1 and 800M : 1. Some of those differences are quite persistent over months (others, more transient, might be due to cut-off issues.)

There is at the moment no range with more assigned than available exponents (the ECM range is a special case), there have been some in the past.

sdbardwick 2021-11-16 12:00

DB server might be having problems. Homepage opens, but anything involving DB seems to stall, including login.
EDIT: Appears to be back online.


All times are UTC.