mersenneforum.org Things that make you go hmm, concerning gpuowl runs

2020-09-07, 11:28   #23
preda

"Mihai Preda"
Apr 2015

2×691 Posts

Quote:
 Originally Posted by kriesel 2020-09-04 05:23:39 asr2/radeonvii0 checksum de9da1a4 (expected 75062f10) in '.\63000061\proof\14519546'
Interesting. It looks as if that file was not written correctly, or was corrupted on disk. The checksum mismatch was discovered when the file was used for proof generation. On restart, gpuowl first tries the requested proof power (8); if that doesn't work, it tries every power from highest to lowest to see whether any is feasible given the residues that are present. This order is indeed not minimal (e.g. 8 is checked twice), but it represents an exceptional case, so it's not worth "fixing" IMO.
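
The retry order described above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not gpuowl's code: the function name is made up, and the bounds `highest=9` and `lowest=6` are assumptions that are not stated in this post.

```python
def proof_power_attempts(requested, highest=9, lowest=6):
    """Order in which proof powers would be tried on restart.

    The requested power goes first, then every power from highest to
    lowest. The bounds are assumptions made for this illustration.
    """
    return [requested] + list(range(highest, lowest - 1, -1))


# The requested power shows up twice, as noted in the post.
print(proof_power_attempts(8))  # [8, 9, 8, 7, 6]
```

Skipping the duplicate would only save one validation pass in a case that should be rare.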

2020-09-07, 23:26   #24
storm5510
Random Account

Aug 2009

3·701 Posts

@preda:

Quote:
 Originally Posted by Prime95 Storm your gpuowl proof was no good. You might want to contact Mihai with the details.
George asked me to contact you with more detail about this. It is for M10496897. I do not have many details to provide. My log file does not go back that far. I keep all my results for such instances:

Quote:
 {"status":"C", "exponent":"10496897", "worktype":"PRP-3", "res64":"c25cf3c76e1acf6f", "residue-type":"1", "errors":{"gerbicz":"0"}, "fft-length":"655360", "proof":{"version":"1", "power":"8", "hashsize":"64", "md5":"7e64c95c13fcad24f5d666ca0e366e27"}, "program":{"name":"gpuowl", "version":"v6.11-364-g36f4e2a"}, "user":"storm5510", "computer":"7700_Kaby_Lake", "aid":"766017F5351C87AEFDFB0D475734D58A", "timestamp":"2020-08-26 15:29:16 UTC"}
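
For anyone wanting to pull the relevant fields out of a result line like the one quoted, here is a minimal Python sketch using only the standard `json` module. The `summary` layout is my own choice for illustration; the field names come from the quoted line.

```python
import json

# Result line copied from the post above.
line = (
    '{"status":"C", "exponent":"10496897", "worktype":"PRP-3", '
    '"res64":"c25cf3c76e1acf6f", "residue-type":"1", '
    '"errors":{"gerbicz":"0"}, "fft-length":"655360", '
    '"proof":{"version":"1", "power":"8", "hashsize":"64", '
    '"md5":"7e64c95c13fcad24f5d666ca0e366e27"}, '
    '"program":{"name":"gpuowl", "version":"v6.11-364-g36f4e2a"}, '
    '"user":"storm5510", "computer":"7700_Kaby_Lake", '
    '"aid":"766017F5351C87AEFDFB0D475734D58A", '
    '"timestamp":"2020-08-26 15:29:16 UTC"}'
)

result = json.loads(line)
proof = result.get("proof", {})  # stays empty if the line has no "proof" entry

summary = {
    "exponent": int(result["exponent"]),
    "res64": result["res64"],
    "residue_type": int(result["residue-type"]),
    "proof_power": int(proof["power"]) if proof else None,
    "proof_md5": proof.get("md5"),
    "version": result["program"]["version"],
}
print(summary)
```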

2020-09-08, 00:04   #25
preda

"Mihai Preda"
Apr 2015

1382₁₀ Posts

Quote:
 Originally Posted by storm5510 @preda: George asked me to contact you with more detail about this. It is for M10496897. I do not have many details to provide. My log file does not go back that far. I keep all my results for such instances:
Do you have the proof file? It would be in a folder named "uploaded", inside pool/ if you use -pool or inside the gpuowl run directory otherwise, and it is named something like 10496897-8.proof. If you have it, please upload it somewhere (Drive, Box, etc.) and send me the link, thanks.
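
As a rough Python sketch of where to look, assuming the file name really is `<exponent>-<power>.proof` (the post only says "something like"). The function name and its `run_dir` / `pool_dir` parameters are hypothetical helpers, not anything gpuowl provides.

```python
from pathlib import Path


def proof_path(exponent, power, run_dir=".", pool_dir=None):
    """Expected location of an already uploaded proof file.

    Uses pool_dir when one is given (the -pool case), otherwise the
    run directory. The naming pattern is an assumption based on the
    example in the post.
    """
    base = Path(pool_dir) if pool_dir else Path(run_dir)
    return base / "uploaded" / f"{exponent}-{power}.proof"


path = proof_path(10496897, 8)
print(path, "exists" if path.exists() else "not found")
```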

2020-09-08, 00:43   #26
storm5510
Random Account

Aug 2009

3·701 Posts

Quote:
 Originally Posted by preda Do you have the proof file? Would be in a folder named "uploaded", in pool/ if you use -pool or in the run directory of gpuowl otherwise, and is named something like 10496897-8.proof . If you have it, please upload it somewhere (Drive, Box, etc) and send me the link, thanks.

Edit: The proof is here: https://www.adrive.com/public/9Uj3T4/10496897-8.proof

Last fiddled with by storm5510 on 2020-09-08 at 00:51

2020-09-08, 01:42   #27
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²·1,619 Posts

Improving fallback to lower proof power etc.

This was a power 8 proof PRP run on Radeon 5700XT and Windows 10 that went awry. Aside from the reported checksum error, I see a few additional issues with this run sequence.

Code:
2020-09-07 11:48:16 asr2/5700xt 99356317 OK 98400000 99.04%; 2212 us/it; ETA 0d 00:35; ed3540b5aa993e56 (check 1.11s)
2020-09-07 11:55:39 asr2/5700xt 99356317 OK 98600000 99.24%; 2213 us/it; ETA 0d 00:28; 4569ca1b42f97bf5 (check 1.11s)
2020-09-07 12:03:03 asr2/5700xt 99356317 OK 98800000 99.44%; 2212 us/it; ETA 0d 00:21; 52d0f07278cb3ea0 (check 1.10s)
2020-09-07 12:10:27 asr2/5700xt 99356317 OK 99000000 99.64%; 2213 us/it; ETA 0d 00:13; 214b5e72adcb0097 (check 1.10s)
2020-09-07 12:17:50 asr2/5700xt 99356317 OK 99200000 99.84%; 2212 us/it; ETA 0d 00:06; 8dc4afa02db98b6e (check 1.10s)
2020-09-07 12:23:36 asr2/5700xt CC 99356317 / 99356317, af767eb4030a____
2020-09-07 12:23:38 asr2/5700xt 99356317 OK 99356800 100.00%; 2215 us/it; ETA 0d 00:00; 5a424b4dc57d3ccf (check 1.07s)
2020-09-07 12:23:39 asr2/5700xt proof: building level 1, hash dc19c1ed5074bfed
2020-09-07 12:23:39 asr2/5700xt proof: building level 2, hash e1c39c39ef8fec8c
2020-09-07 12:23:40 asr2/5700xt proof: building level 3, hash 4dff4687239f51cc
2020-09-07 12:23:42 asr2/5700xt proof: building level 4, hash 7803518131602fc6
2020-09-07 12:23:45 asr2/5700xt proof: building level 5, hash e6e93fce0591589a
2020-09-07 12:23:51 asr2/5700xt proof: building level 6, hash 486bd862e2a3633f
2020-09-07 12:23:53 asr2/5700xt checksum 78d6fc30 (expected 9be1d4ca) in '.\99356317\proof\23286660'
2020-09-07 12:23:53 asr2/5700xt Exception NSt10filesystem7__cxx1116filesystem_errorE: filesystem error: checksum mismatch: No error
2020-09-07 12:23:53 asr2/5700xt Bye

>gpuowl-win
2020-09-07 17:44:04 gpuowl v6.11-364-g36f4e2a
2020-09-07 17:44:04 config: -user kriesel -cpu asr2/5700xt -d 2 -use NO_ASM -maxAlloc 7500
2020-09-07 17:44:04 device 2, unique id ''
2020-09-07 17:44:04 asr2/5700xt 99356317 FFT: 5.50M 1K:11:256 (17.23 bpw)
2020-09-07 17:44:04 asr2/5700xt Expected maximum carry32: 293D0000
2020-09-07 17:44:05 asr2/5700xt OpenCL args "-DEXP=99356317u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DPM1=0 -DAMDGPU=1 -DWEIGHT_STEP_MINUS_1=0xb.52db15a632b98p-4 -DIWEIGHT_STEP_MINUS_1=-0xd.42fc054606498p-5 -DNO_ASM=1 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-09-07 17:44:13 asr2/5700xt OpenCL compilation in 8.11 s
2020-09-07 17:44:15 asr2/5700xt 99356317 OK 99200000 loaded: blockSize 400, 8dc4afa02db98b6e
2020-09-07 17:44:15 asr2/5700xt validating proof residues for power 8
2020-09-07 17:44:22 asr2/5700xt checksum 78d6fc30 (expected 9be1d4ca) in '.\99356317\proof\23286660'
2020-09-07 17:44:22 asr2/5700xt validating proof residues for power 9
2020-09-07 17:44:22 asr2/5700xt Can't open '.\99356317\proof\194056' (mode 'rb')
2020-09-07 17:44:22 asr2/5700xt validating proof residues for power 8
2020-09-07 17:44:27 asr2/5700xt checksum 78d6fc30 (expected 9be1d4ca) in '.\99356317\proof\23286660'
2020-09-07 17:44:27 asr2/5700xt validating proof residues for power 7
2020-09-07 17:44:30 asr2/5700xt checksum 78d6fc30 (expected 9be1d4ca) in '.\99356317\proof\23286660'
2020-09-07 17:44:30 asr2/5700xt validating proof residues for power 6
2020-09-07 17:44:30 asr2/5700xt Can't open '.\99356317\proof\1552443' (mode 'rb')
2020-09-07 17:44:30 asr2/5700xt Proof disabled because of missing checkpoints
2020-09-07 17:44:33 asr2/5700xt 99356317 OK 99200800 99.84%; 2199 us/it; ETA 0d 00:06; 924e6946b4f9fde2 (check 1.37s)
2020-09-07 17:50:17 asr2/5700xt CC 99356317 / 99356317, af767eb4030a5338
2020-09-07 17:50:18 asr2/5700xt 99356317 OK 99356400 100.00%; 2212 us/it; ETA 0d 00:00; abffad0e796314d6 (check 1.08s)
2020-09-07 17:50:18 asr2/5700xt {"status":"C", "exponent":"99356317", "worktype":"PRP-3", "res64":"af767eb4030a____", "residue-type":"1", "errors":{"gerbicz":"0"}, "fft-length":"5767168", "program":{"name":"gpuowl", "version":"v6.11-364-g36f4e2a"}, "user":"kriesel", "computer":"asr2/5700xt", "aid":"(redacted)", "timestamp":"2020-09-07 22:50:18 UTC"}

1) Gpuowl gives up, abandoning the run. It could skip to the next worktodo entry instead, putting hours or days of gpu time to productive use rather than leaving it idle until the user finds gpuowl halted.
2) There is a 1552444 iteration residue file, while in the restart it's looking for 1552443 at power 6. It seems there was a slight difference in computing how many iterations between the initial run and the restart, or between the original power and the fallback power.
3) It had already computed to 100% in the first run. And it recomputes from an indicated 99.84% to 100% in the restart. This is a minor production loss at 5 minutes 44 seconds.
4) The off-by-1, 1552444 vs. 1552443, prevents a power 6 proof from being generated in the restart.
5) Power 5, which would still save ~96% of a PRP DC, is not attempted in the restart, or supported. (It might have the off by one, or more, issue too.) Admittedly this should be a rare case. Even power 4 would represent an occasional substantial savings over a complete DC as a result of error.
6) For power 8, topk would be the next multiple of 256 above p, which is 99356416 for p~99356317. Topk/256 for power 8 would be 388111. Saved residues would be at iterations that are multiples of that. Four times 388111 = 1552444, the first saved for power 6. The initial run goes past 99356416 to 99356800, presumably because of block size 400. But the restart computes only to 99356400, one less block for some reason. 99356400/256 = 388110.9375. Four times that is 1552443.75, which apparently got truncated to 1552443 for the power 6 restart. Or the restart proof attempts compute the iteration count independently for each power, ignoring the history of the exponent's run, or any need to ensure powers of 2 between iterations for different power proofs. If the restart omits the ceiling function, 99356317/256 = 388110.61328125; 4 times that is 1552442.453125, unlikely to produce 1552443 for power 6.

So I suspect there's no way currently to save the proof. I still have all the files generated, and have not yet reported the PRP result.

If, in a future version, gpuowl computed topk for its maximum supported power (currently 9), then derived the specified power's iteration count for residues saved from multiples of that, some iteration multiples would be more reliably interchangeable among powers, improving fallback to lower powers upon an error.

topk/2^power for p=99356317:

power   first residue save (as is)   proposed   nearest multiple from current power 8 default
9       194056                       194056     na
8       388111                       388112     388111
7       776222                       776224     776222
6       1552443                      1552448    1552444
5       3104885                      3104896    3104888
4       6209770                      6209792    6209776

Last fiddled with by kriesel on 2020-09-08 at 01:54

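
The three columns of the table above, and the two end-of-run iteration counts seen in the log, can be reproduced with a short Python sketch. It is a reconstruction from the numbers in this post, not gpuowl's code. In particular, the "as is" column matches rounding p/2^power up separately for each power, and the end points match rounding up to a multiple of block size 400; both are inferences from the figures, and all function names are mine.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def round_up(n, multiple):
    """Smallest multiple of `multiple` that is >= n."""
    return ceil_div(n, multiple) * multiple


P = 99356317
MAX_POWER = 9       # stated in the post as the current maximum
DEFAULT_POWER = 8   # the power used for this run


def as_is(p, power):
    """First residue iteration when each power rounds up on its own."""
    return ceil_div(p, 2 ** power)


def proposed(p, power, max_power=MAX_POWER):
    """First residue iteration derived from the max-power spacing."""
    return ceil_div(p, 2 ** max_power) * 2 ** (max_power - power)


def from_default(p, power, default=DEFAULT_POWER):
    """Nearest usable multiple of the spacing actually saved at power 8."""
    if power > default:
        return None  # "na": power 9 residues were never saved
    return ceil_div(p, 2 ** default) * 2 ** (default - power)


for power in range(9, 3, -1):
    print(power, as_is(P, power), proposed(P, power), from_default(P, power))

topk8 = round_up(P, 256)
print(topk8)                  # 99356416
print(round_up(topk8, 400))   # 99356800, where the first run ended
print(round_up(P, 400))       # 99356400, where the restart ended
```

Under that reading, the restart's end point is simply p rounded up to the block size once the proof is disabled, while the first run had to reach topk first.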
2020-09-08, 12:10   #28
storm5510
Random Account

Aug 2009

2103₁₀ Posts

It seems I have run into another oddity. gpuOwl is refusing to run PRP-CF tests:

Quote:
 1 FFT: 128K 256:1:256 (0.00 bpw) FFT size too large for exponent (0.00 bits/word) Exiting because "FFT size is too large"
This is on version 6.11-380-g79ea0cc. I never had a problem before, so something has changed. My GPU is an Nvidia GTX 1080. There are two lines in the "use-flags" text file which sort of resemble the first line in my quoted section. They mention 256,4,1. I do not know if this is relevant or not.
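
A quick Python check of the "bpw" (bits per word) figure may help read that message. It assumes bpw is simply the exponent divided by the FFT length in words, which agrees with the "17.23 bpw" that gpuowl printed for 99356317 at FFT length 5767168 earlier in this thread. Reading the leading "1" in the quoted line as the exponent is my interpretation, not something the message states.

```python
def bits_per_word(exponent, fft_length):
    """Average number of bits stored per FFT word (assumed definition)."""
    return exponent / fft_length


# Matches "99356317 FFT: 5.50M ... (17.23 bpw)" from the earlier log.
print(f"{bits_per_word(99356317, 5767168):.2f}")

# An exponent of 1 on a 128K FFT rounds to the reported 0.00 bpw.
print(f"{bits_per_word(1, 128 * 1024):.2f}")
```

If that reading is right, the program believed it was being asked to test exponent 1, which points at how the worktodo line was read rather than at the FFT code.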

2020-09-08, 12:41   #29
ATH
Einyen

Dec 2003
Denmark

2⁴×3²×23 Posts

Gpuowl can only do PRP tests, not PRP-CF tests, where one or more factors are known and the remaining cofactor is tested.

2020-09-08, 15:28   #30
storm5510
Random Account

Aug 2009

3×701 Posts

Quote:
 Originally Posted by ATH Gpuowl can only do PRP tests, not PRP-CF tests, where one or more factors are known and the remaining cofactor is tested.
The assignment lines in worktodo begin with "PRP" only. It does not know the difference, big or small. I have been running CF's, with known factor(s), since George began the certification process with Prime95 v30.x. The problem must be something else...

2020-09-08, 17:00   #31
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²×1,619 Posts

Gpuowl not running PRP-CF, which it does not support at all, is not an oddity. Especially when the worktodo line says PRP. PRP-CF is not an implemented work type in gpuowl. It's likely that giving it known factors in the worktodo line is confusing the input parser, which was not written to handle such data. Other types of input error will also give the error message reported or similar, such as omitting AID or a placeholder 0 instead of AID; it shifts k,b,n,c, etc. into unintended non-matching variables. Worktodo entry formats are described in https://www.mersenneforum.org/showpo...8&postcount=22

There is no utility to running a PRP test in gpuowl for something with a known factor. Nor in attempting to run a computation type in gpuowl that gpuowl does not implement, such as PRP-CF.

Last fiddled with by kriesel on 2020-09-08 at 17:04

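
To illustrate the field-shifting point, here is a deliberately naive positional parser in Python. It is a hypothetical sketch, not gpuowl's parser: the field order AID,k,b,n,c is taken from the description above, the AID value in the first example is a made-up placeholder, and real gpuowl may treat empty or missing fields differently.

```python
import csv

FIELDS = ["aid", "k", "b", "n", "c"]  # positional order assumed from the post


def parse_prp_line(line):
    """Split 'PRP=...' into named fields purely by position."""
    kind, _, rest = line.partition("=")
    values = next(csv.reader([rest]))  # csv keeps a quoted factor list whole
    parsed = dict(zip(FIELDS, values))
    parsed["extra"] = values[len(FIELDS):]
    return kind, parsed


# With an AID (placeholder value), every field lands where intended.
_, good = parse_prp_line("PRP=0123456789ABCDEF0123456789ABCDEF,1,2,7877777,-1")
print(good["n"])  # 7877777

# With the AID left out entirely, everything shifts one slot to the left.
_, shifted = parse_prp_line("PRP=1,2,7877777,-1")
print(shifted["n"])  # -1

# A line with trailing PRP-CF style parameters leaves unconsumed extras.
_, cf = parse_prp_line(
    'PRP=,1,2,7877777,-1,99,0,3,5,'
    '"47266663,9172685364795810287,125872567825872611377"'
)
print(cf["extra"])
```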
2020-09-08, 17:49   #32
storm5510
Random Account

Aug 2009

3·701 Posts

Quote:
 Originally Posted by kriesel Gpuowl not running PRP-CF, which it does not support at all, is not an oddity. Especially when the worktodo line says PRP. PRP-CF is not an implemented work type in gpuowl. It's likely that giving it known factors in the worktodo line is confusing the input parser, which was not written to handle such data. Other types of input error will also give the error message reported or similar, such as omitting AID or a placeholder 0 instead of AID; it shifts k,b,n,c, etc. into unintended non-matching variables. Worktodo entry formats are described in https://www.mersenneforum.org/showpo...8&postcount=22 There is no utility to running a PRP test in gpuowl for something with a known factor. Nor in attempting to run a computation type it does not implement.
With all due respect, you are telling me I cannot run something which I am running now.

Quote:
 PRP=,1,2,7877777,-1,99,0,3,5,"47266663,9172685364795810287,125872567825872611377"
In the attached image, I have highlighted the version number at the top. This is the predecessor to the current version, which would not run CF's. I believe this is the same version that George sent me a PM about, regarding a proof which was no good. I will only run this one assignment, and have George check it. If this proof is also bad, then I will stop. I have 5 other proofs from runs with gpuOwl. I received no notifications about problems with those proofs. I looked at my results on mersenne.org. I have 8 "verified" results and 1 "suspect." The suspect is the one I received the message about. preda requested the proof, so I put it where he could download it.

In Prime95, I have my work type set to "First time PRP on Mersenne cofactors." If these are first time tests, then why do all the assignments I get have factors? Manual reservations are the same.
Attached Thumbnails

2020-09-08, 21:17   #33
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

6476₁₀ Posts

Quote:
 Originally Posted by storm5510 With all due respect, you are telling me I cannot run something which I am running now. ...
PRP proofs getting certified do not prove a completed PRP test is a valid PRP-CF test. Check your gpuowl result lines for your PRP-CF attempts via PRP= worktodo lines with extra ignored parameters tacked on the end (residue-type,base,"factor1,factor2"). Residue type 1 result is a PRP test in current gpuowl. Check the PrimeNet status pages of exponents you've run PRP tests on gpuowl (including the ones you intended as PRP-CF); PRP test is what the software did, and PrimeNet displays, residue type 1.

A first time PRP-CF test of Mwhatever/"previously known factors, new-factor" would tell whether the cofactor is still composite, or a prime. PRP of composite Mwhatever will return composite no matter how many factors are known or unknown. And a type 1 PRP test's res64 will match a type 5 PRP-CF run if both are correct, independently of how many factors are input into the PRP-CF run done by software that supports it, because that's a property of type 5 PRP residue tests.
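
A toy example in Python, using a plain base-3 Fermat test on M11 = 2047 = 23 × 89, shows the distinction. This is not how gpuowl or Prime95 compute their residue types; it only illustrates that the full Mersenne number stays composite however many factors are known, while the cofactor has to be tested as its own number.

```python
def fermat_prp(n, base=3):
    """Plain Fermat probable-prime test: is base^(n-1) == 1 (mod n)?"""
    return pow(base, n - 1, n) == 1


M11 = 2**11 - 1        # 2047 = 23 * 89
cofactor = M11 // 23   # 89

print(fermat_prp(M11))       # False: M11 is composite, known factors or not
print(fermat_prp(cofactor))  # True: the cofactor itself is a probable prime
```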

Last fiddled with by kriesel on 2020-09-08 at 21:47


All times are UTC.