
2009-06-03, 22:17   #45
thome

May 2009

2×11 Posts

Quote:
 Originally Posted by fivemack
 I issue the command
 Code:
 nfsslave2@cow:/scratch/fib1039/with-cado$ /home/nfsslave2/cado/cado-nfs-20090603-r2189/build/cow/linalg/bwc/u128_bench -t --impl bucket snfs.small
 and it produces a lot of output at the 'large' level

normal behaviour (admittedly way too verbose).

Quote:
 Originally Posted by fivemack
 before failing with
 Code:
 Lsl 56 cols 3634827..3699734 w=778884, avg dj=7.2, max dj=34365, bucket hit=1/1834.7 -> too sparse
 Switching to huge slices.
 Lsl 56 to be redone
 Flushing 56 large slices
 Hsl 0 cols 3634827..5582056 (30*64908) .............................. w=16383453, avg dj=0.3, max dj=29376, bucket block hit=1/10.2
 u128_bench: /home/nfsslave2/cado/cado-nfs-20090603-r2189/linalg/bwc/matmul-bucket.cpp:610: void split_huge_slice_in_vblocks(builder*, huge_slice_t*, huge_slice_raw_t*, unsigned int): Assertion `(n+np)*2 == (size_t) (spc - sp0)' failed.
 Aborted

If you could put your failing snfs.small file somewhere where I can grab it, it would be great.

Quote:
 Originally Posted by fivemack
 The enormous filtering run got terminated by something that kills SSH sessions that have produced no output for ages, will try that again.

You mean, the cado filtering programs got killed prematurely? That would have a tendency to truncate the input to the bwc executables, but I doubt this is the cause, since the balancing program would have choked first.

Thanks for your patient investigations...

E.

2009-06-03, 22:55   #46
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

1100100100110₂ Posts

Quote:
 Originally Posted by thome
 Quote:
  Originally Posted by fivemack
  The enormous filtering run got terminated by something that kills SSH sessions that have produced no output for ages, will try that again.
 You mean, the cado filtering programs got killed prematurely? That would have a tendency to truncate the input to the bwc executables, but I doubt this is the cause, since the balancing program would have choked first.
I wasn't using the script, just running
Code:
~/cado/cado-nfs-20090603-r2189/build/cow/merge/purge -poly snfs.poly -nrels "$( zcat snfs.nodup.gz | wc -l)" -out snfs.purged snfs.nodup.gz > purge.aus 2> purge.err &
on a file with half a billion relations without using nohup, and the ssh connection from which I'd started it died.
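A detached invocation avoids losing a run like this to a dropped SSH session. A minimal sketch (the placeholder job below stands in for the purge command above; `nohup` and the redirections are the only moving parts):

```shell
# Prefix the long-running command with nohup and detach stdin, so the job
# no longer dies with the terminal. For the purge run above this would be
#   nohup .../merge/purge ... > purge.aus 2> purge.err < /dev/null &
# Self-contained demonstration with a placeholder job:
nohup sh -c 'sleep 1; echo survived' > demo.log 2>&1 < /dev/null &
wait            # wait for the background job so the log is complete
cat demo.log
```

Running the whole session inside screen or tmux is the other common fix.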

I'm rerunning it, but the second pass is using 25G of vsize and the machine is swapping terribly, so I'm not expecting much progress.
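When vsize outruns physical RAM like this, comparing virtual and resident size shows how far the process has tipped into swap. A small sketch (sampling the current shell, `$$`, as a stand-in for the purge PID):

```shell
# VSZ is the 25G-style virtual size; RSS is what actually fits in RAM.
# A large VSZ/RSS gap on a loaded machine usually means heavy swapping.
ps -o vsz=,rss= -p $$
# Per-process swap usage, where the kernel exposes it:
grep VmSwap /proc/$$/status 2>/dev/null || true
```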

2009-06-03, 23:00   #47
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

2·3·29·37 Posts

Quote:
 Originally Posted by thome
 If you could put your failing snfs.small file somewhere where I can grab it, it would be great.
Anonymous ftp to fivemack.dyndns.org and collect snfs.small.bz2 (710MB) and snfs.poly. My upload is quite slow, so it may take a little while. I don't know whether my ftp server supports resuming partial transfers; if it gets frustrating, tell me and I'll put the file somewhere more accessible.

Last fiddled with by fivemack on 2009-06-03 at 23:15

2009-06-03, 23:14   #48
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

2×3×29×37 Posts

Tiny command-line bug for transpose tool

Since 'balance' doesn't appear to have a --transpose command-line option:
Code:
nfsslave2@cow:/scratch/fib1039/with-cado$ /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/balance --transpose --in snfs.small --out cabbage --nslices 1x4 --ramlimit 1G
Unknown option: snfs.small
Usage: ./bw-balance
Typical options:
  --in                 input matrix filename
  --out                output matrix filename
  --nslices [x]        optimize for x strips
  --square             pad matrix with zeroes to obtain square size
More advanced:
  --remove-input       remove the input file as soon as possible
  --ram-limit [kmgKMG] fix maximum memory usage
  --keep-temps         keep all temporary files
  --subdir             chdir to beforehand (mkdir if not found)
  --legacy             produce only one jumbo matrix file

I ran
Code:
/home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/transpose --in snfs.small --out snfs.small.T
I killed the job after two hours; it was stuck in the argument-parsing loop! Using only one minus sign before 'in' and 'out' made it work, though it's now too late to start the balancing and bench jobs this evening. More later.

Last fiddled with by fivemack on 2009-06-03 at 23:15

2009-06-03, 23:57   #49
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

2·3·29·37 Posts

what shape to use for decomposition?

I think 100 seconds is too short for statistically significant comparisons for matrices this big, but (with four threads at each size):

- 1x4 decomposition: 19 iterations in 104s, 5.47/1, 21.07 ns/coeff
- 2x2 decomposition: 19 iterations in 102s, 5.34/1, 20.59 ns/coeff
- 4x1 decomposition: 20 iterations in 104s, 5.18/1, 19.95 ns/coeff

20 ns/coeff still feels a bit too long.

To my limited surprise, explicitly transposing the matrix and running
Code:
/home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench snfs.small.T -impl bucket
gave exactly the same error as having u128_bench do the transposition.
However, if I run balance on the transposed matrix, then u128_bench works with a 4x1 decomposition; 2x2 fails with the same error message as before.

- 1x4, 2x2 decomposition: fails Assertion `(n+np)*2 == (size_t) (spc - sp0)'
- 4x1 decomposition: 21 iterations in 105s, 4.99/1, 19.22 ns/coeff

If I give too few parameters to a threaded call to u128_bench, it seems to read off the end of argv and into the environment:

Code:
nfsslave2@cow:/scratch/fib1039/with-cado$ taskset 0f /home/nfsslave2/cado/cado-nfs-20090528-r2167/build/cow/linalg/bwc/u128_bench -impl bucket -nthreads 4 -- butterfly14T.h*
4 threads requested, but 1 files given on the command line.
Using implementation "bucket"
no cache file butterfly14T.h*-bucket.bin
T0 Building cache file for butterfly14T.h*
no cache file (null)-bucket.bin
T1 Building cache file for (null)
no cache file TERM=xterm-color-bucket.bin
T2 Building cache file for TERM=xterm-color
no cache file SHELL=/bin/bash-bucket.bin
fopen(butterfly14T.h*): No such file or directory
fopen((null)): Bad address
fopen(TERM=xterm-color): No such file or directory
2009-06-04, 15:40   #50
joral

Mar 2008

5·11 Posts

OK, apparently it is random. I've had it fail at iteration 100, 1000, 1900, and 19500 (out of 29300).
2009-06-04, 20:38   #51
thome

May 2009

2·11 Posts

arg loop: fixed, thanks. (This program is in fact unused -- it does not really belong to the set of distributed programs, but it can be handy because it does a lot out of core.)

running off argv: this has been fixed in one of the updated tarballs that I had posted.

failing assert: the assert was wrong (sigh). It should have been (n+2*np)*2. A tiny example is the matrix which, once piped through ``uniq -c'', gives the following output (one must also set HUGE_MPLEX_MIN to zero in matmul-bucket.cpp):

Code:
   1 5000 5000
4365 0
   1 1
1353 634 0

disappointing performance: I'm working on it.

Thanks,
E.
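Expanded back out of `uniq -c` form, that listing can be regenerated as a test file. A sketch (the line-oriented layout is read off the post rather than verified against the matrix format, and tiny-matrix.txt is a made-up name):

```shell
# Reverse the uniq -c counts from the post: each "N line" entry
# becomes N consecutive copies of "line".
{
  echo "5000 5000"            #    1x "5000 5000"
  yes 0       | head -n 4365  # 4365x "0"
  echo 1                      #    1x "1"
  yes "634 0" | head -n 1353  # 1353x "634 0"
} > tiny-matrix.txt
uniq -c tiny-matrix.txt       # should match the post's four-line listing
```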
2009-06-04, 20:40   #52
thome

May 2009

2×11 Posts

Quote:
 Originally Posted by joral
 OK, apparently it is random. I've had it fail at iteration 100, 1000, 1900, and 19500 (out of 29300).
OK, perhaps you could give it a try on a different machine?

The good thing is that if you've got a DIMM stick at fault, you now have a handy way to pinpoint the culprit ;-).

E.

2009-06-04, 21:25   #53
frmky

Jul 2003
So Cal

100010001100₂ Posts

Quote:
 Originally Posted by thome
 This warning no longer appears (yes, there's a new tarball).
I tried compiling the new source in Linux x86_64 using pthreads, but it ends with the error

CMake Error in linalg/bwc/CMakeLists.txt:
Cannot find source file "matmul-sub-large-fbi.S".

Sure enough, this file is referenced in the CMakeLists.txt and a corresponding .h file is #include'd in matmul-bucket.cpp, but it's not in the directory.

2009-06-05, 00:30   #54
joral

Mar 2008

5×11 Posts

I haven't been able to get it to build on my dual P3-700 yet, and I don't want to compare against an Athlon64 X2 running at about 2 GHz. I may pull out my Ubuntu CD later and run memtest against it to see what happens. I do find it interesting that the examples run without incident.

Last fiddled with by joral on 2009-06-05 at 00:31
2009-06-05, 12:37   #55
joral

Mar 2008

37₁₆ Posts

OK... either I take my computer back to 1 GB or I go buy a new memory stick. I ran memtest overnight, and it picked up about 808 bit errors right around the 1.5 GB mark over 6 passes. Good call, though bad for me...

