Advice for large SNFS jobs?
Hi all,
I'm working on factoring (2801^831)/2800 using a modified factmsieve.py  and yes, I have a cluster available to me... :) factmsieve tells me at the getgo: Fri Jan 18 09:58:41 2013 > Estimated minimum relations needed: 5.53168e+08 I'm able to make it up to about 200M relations with the default parameters (starting at rational q from 238450000, FAMAX = 476900000) before hitting GGNFS' limit: it can't handle special q >= 2^30  1. Does anyone have any advice for what to do with jobs this big? Try to change the sieving window somehow? (And if so  to what)? Here's the polynomial I'm using: [code]n: 4784427753962229503583191777575386925462640502543527013793934480234680863804447852383959785408791045459809147067083157248015897910382151758867576620242257524246139326208569043470479714282260046673050230392057658284742406595942226610043596316622243579005395853667131475327572196568483 m: 1829715316371090533839726975772594414416841479201 deg: 6 skew: 0 type: snfs c6: 1 c0: 2801[/code] which gives the factor base: [code]N 4784427753962229503583191777575386925462640502543527013793934480234680863804447852383959785408791045459809147067083157248015897910382151758867576620242257524246139326208569043470479714282260046673050230392057658284742406595942226610043596316622243579005395853667131475327572196568483 SKEW 3.75 A6 1 A0 2801 R1 1 R0 1829715316371090533839726975772594414416841479201 FAMAX 476900000 FRMAX 476900000 SALPMAX 4294967296 SRLPMAX 4294967296[/code] 
I'm impressed by the scale of your cluster, but factmsieve is not designed for jobs this big.
The polynomial is right, and the alim and lp look reasonable, but you're clearly using the wrong sieving binary since you're getting 0.25 relations per Q. I think you should be using 16e, and you should be using three large primes on the rational side (lpbr=32 mfbr=96 rlambda=3.6) For things this large I tend to start from small Q (eg Q=1e7) rather than Q=Qmax/2. 
What siever did factmsieve.py choose for such a job ? NFS@Home would probably choose ggnfslasieve4I16e, if not the corresponding lasieve5.

debrouxl: It's using gnfslasieve4I16e, as I expected.
fivemack: Thanks, I'll try starting with Q=1e7 and the mfbr/rlambda values you suggested... With those values, do you think there's a shot that GGNFS/msieve will be able to finish this thing? :) 
[QUOTE=ryanp;325395]debrouxl: It's using gnfslasieve4I16e, as I expected.
fivemack: Thanks, I'll try starting with Q=1e7 and the mfbr/rlambda values you suggested... With those values, do you think there's a shot that GGNFS/msieve will be able to finish this thing? :)[/QUOTE] Even if it doesn't assuming you are on linux you should be able to run the later version of the siever that will sieve higher Qs. 
[QUOTE=henryzz;325396]Even if it doesn't assuming you are on linux you should be able to run the later version of the siever that will sieve higher Qs.[/QUOTE]
Is that in the latest released version of GGNFS, or the version from head in SVN? 
Here is a link to the newer siever. There shouldn't be much speed difference unless you can get ecm working helpfully.
[url]http://mersenneforum.org/showpost.php?p=308178&postcount=15[/url] I don't think the source is in the svn. 
It's definitely been [URL="http://ggnfs.svn.sourceforge.net/viewvc/ggnfs/trunk/src/experimental/lasieve4_64/"]there[/URL] for two years and is patched for many found problems. [URL="http://ggnfs.svn.sourceforge.net/viewvc/ggnfs/trunk/contrib/remdups/"]remdups[/URL] is contrib/ directory.
I thought that the lasieve5 source was also there but it is not maintained; you'd have to figure everything for yourself. If it is not there, you can find a zip [URL="http://www.mersenneforum.org/showthread.php?p=298833&highlight=lasieve5#post298833"]somewhere[/URL] on this forum. 
[QUOTE=henryzz;325408]Here is a link to the newer siever. There shouldn't be much speed difference unless you can get ecm working helpfully.
[url]http://mersenneforum.org/showpost.php?p=308178&postcount=15[/url] I don't think the source is in the svn.[/QUOTE] Hmm, after downloading/untarring that version I still get: [code]./gnfslasieve4I16e k o spairs.out.test v n0 r input.job.test gnfslasieve4I16e (with asm64): L1_BITS=15, SVN $Revision: 399 $ Cannot handle special q >= 1073741823[/code] Same with building from source (in the src/experimental/lasieve4_64 tree, using the instructions in INSTALL). Am I doing something wrong, or is there a newer version somewhere that can handle large q? 
Erm, perhaps I missed this in INSTALL:
[code]NOTE for Phenom/K8 users: replace in athlon64/lsdefs.asm define(l1_bits,15)dnl => define(l1_bits,16)dnl and in athlon64/sieverconfig.h #define L1_BITS 15 => #define L1_BITS 16[/code] though that's not my CPU type... Is it still "safe"/advised? 
No, this is only for the CPUs that have 64Kb L1 cache, i.e. AMD CPUs. (log[SUB]2[/SUB]64Kb = 16) Don't change L1 bits for Intel CPUs.
How much of the q area have you already sieved? What side have you sieved on? There's no need really for a project of this size to go over q>2^30. Try to cover the area from q=10^7 to your current lower limit (where you started, 238450000). Even if you go over 2^30, the yield will be less and less. You may get a better yield by repeating some of the most productive (lower q) areas with the parameters that Tom (fivemack) suggested earlier. Have you used 3LP? Like  [code]lpbr: 33 lpba: 33 mfba: 66 mfbr: 96 alambda: 2.55 rlambda: 3.7 [/code] Have you tried to filter your existing set of relations? Last but not the least, do you have a computer (set of computers) to solve the resulting >40M matrix? (As the saying goes, take no offense,  it's not the size (of the cluster), it's how you use it that matters. Have you done a snfs~270280 before doing this snfs290?) If you really want to go to very high q values, use the link to lasieve5 message. 
I can tell it was late at night when I posted that. I posted a link to the 64bit asm siever.

[QUOTE=Batalov;325457]No, this is only for the CPUs that have 64Kb L1 cache, i.e. AMD CPUs. (log[SUB]2[/SUB]64Kb = 16) Don't change L1 bits for Intel CPUs.
How much of the q area have you already sieved? What side have you sieved on? There's no need really for a project of this size to go over q>2^30. Try to cover the area from q=10^7 to your current lower limit (where you started, 238450000). Even if you go over 2^30, the yield will be less and less. You may get a better yield by repeating some of the most productive (lower q) areas with the parameters that Tom (fivemack) suggested earlier. Have you used 3LP? Like  [code]lpbr: 33 lpba: 33 mfba: 66 mfbr: 96 alambda: 2.55 rlambda: 3.7 [/code][/QUOTE] Yeah, that's my current approach  resieving starting from q=10^7 with the parameters Tom suggested... [QUOTE=Batalov;325457]Have you tried to filter your existing set of relations? Last but not the least, do you have a computer (set of computers) to solve the resulting >40M matrix? (As the saying goes, take no offense,  it's not the size (of the cluster), it's how you use it that matters. Have you done a snfs~270280 before doing this snfs290?)[/QUOTE] Yes, I'm aware that the matrixsolving will be challenging and slow... I have a C250 (not sure exactly what the SNFS difficulty is) going on a multicore machine that's estimated to take about a month of matrix solving. My cluster isn't really easy to set up with MPI, but it's an option I'm considering... 
[COLOR=black][FONT=Verdana]Well, it is going to be a hard project. You can find the snfs difficulty level in the logs of that other project, and you can roughly project that every 30 digits of snfs difficulty translate into 10 times more time for the project. The matrix of this size will definitely need MPI (and for MPI, you will need very fast interconnect or else the cluster's productivity will be less than that of a single strongest node). Here's the size of the 6,373 matrix (this is a similarly sized project, run by B.Dodson on an MPI cluster; the parameters above were from it; it was a 33bit 3LP project):[/FONT][/COLOR]
[CODE]Sat Jul 21 12:36:44 2012 matrix is 39095900 x 39096100 (11555.1 MB) with weight 3371381266 (86.23/col) Sat Jul 21 12:36:44 2012 sparse part has weight 2638144630 (67.48/col) [/CODE] If you will run from low q's with the 33bit lpbr/a and mfba: 66 mfbr: 96, then you may get enough relations before reaching 300M. Bits were increased exactly to avoid running away too far on the q axis, definitely not to 1000M. The relations should be combinable. 
[QUOTE=Batalov;325478]Well, it is going to be a hard project. You can find the snfs difficulty level in the logs of that other project, and you can roughly project that every 30 digits of snfs difficulty translate into 10 times more time for the project. The matrix of this size will definitely need MPI (and for MPI, you will need very fast interconnect or else the cluster's productivity will be less than that of a single strongest node).[/QUOTE]
So, here's the good news: msieve was able to generate a matrix for (2801^831)/2800 with about 461M input relations (368M unique). Here's a snippet of the log, in case anyone is interested (the forum won't let me post more than this...) [code]Tue Jan 22 14:32:07 2013 begin with 179606350 relations and 160161527 unique ideals Tue Jan 22 14:34:22 2013 reduce to 179416277 relations and 154697630 ideals in 7 passes Tue Jan 22 14:34:22 2013 max relations containing the same ideal: 174 Tue Jan 22 14:35:31 2013 removing 7091785 relations and 5091785 ideals in 2000000 cliques Tue Jan 22 14:35:36 2013 commencing inmemory singleton removal Tue Jan 22 14:35:49 2013 begin with 172324492 relations and 154697630 unique ideals Tue Jan 22 14:37:10 2013 reduce to 172138029 relations and 149417492 ideals in 6 passes Tue Jan 22 14:37:10 2013 max relations containing the same ideal: 168 Tue Jan 22 14:38:16 2013 removing 6922412 relations and 4922412 ideals in 2000000 cliques Tue Jan 22 14:38:20 2013 commencing inmemory singleton removal Tue Jan 22 14:38:32 2013 begin with 165215617 relations and 149417492 unique ideals Tue Jan 22 14:40:03 2013 reduce to 165016535 relations and 144293839 ideals in 6 passes Tue Jan 22 14:40:03 2013 max relations containing the same ideal: 165 Tue Jan 22 14:41:15 2013 removing 6862286 relations and 4862286 ideals in 2000000 cliques Tue Jan 22 14:41:20 2013 commencing inmemory singleton removal Tue Jan 22 14:41:31 2013 begin with 158154249 relations and 144293839 unique ideals Tue Jan 22 14:42:58 2013 reduce to 157961804 relations and 139237031 ideals in 7 passes Tue Jan 22 14:42:58 2013 max relations containing the same ideal: 160 Tue Jan 22 14:44:02 2013 removing 6724377 relations and 4724377 ideals in 2000000 cliques Tue Jan 22 14:44:06 2013 commencing inmemory singleton removal Tue Jan 22 14:44:17 2013 begin with 151237427 relations and 139237031 unique ideals Tue Jan 22 14:45:35 2013 reduce to 151031091 relations and 134303757 ideals in 6 passes Tue Jan 22 14:45:35 2013 max relations containing the same ideal: 156 Tue Jan 22 14:46:38 2013 removing 6670225 relations and 4670225 ideals in 2000000 cliques Tue Jan 22 14:46:42 2013 commencing inmemory singleton removal Tue Jan 22 14:46:52 2013 begin with 144360866 relations and 134303757 unique ideals Tue Jan 22 14:48:03 2013 reduce to 144148234 relations and 129418117 ideals in 6 passes Tue Jan 22 14:48:03 2013 max relations containing the same ideal: 151 Tue Jan 22 14:49:04 2013 removing 6603405 relations and 4603405 ideals in 2000000 cliques Tue Jan 22 14:49:08 2013 commencing inmemory singleton removal Tue Jan 22 14:49:18 2013 begin with 137544829 relations and 129418117 unique ideals Tue Jan 22 14:50:20 2013 reduce to 137319456 relations and 124586283 ideals in 6 passes Tue Jan 22 14:50:20 2013 max relations containing the same ideal: 147 Tue Jan 22 14:51:14 2013 removing 6568551 relations and 4568551 ideals in 2000000 cliques Tue Jan 22 14:51:18 2013 commencing inmemory singleton removal Tue Jan 22 14:51:31 2013 begin with 130750905 relations and 124586283 unique ideals Tue Jan 22 14:52:43 2013 reduce to 130514605 relations and 119778057 ideals in 6 passes Tue Jan 22 14:52:43 2013 max relations containing the same ideal: 140 Tue Jan 22 14:53:36 2013 removing 6536343 relations and 4536343 ideals in 2000000 cliques Tue Jan 22 14:53:39 2013 commencing inmemory singleton removal Tue Jan 22 14:53:49 2013 begin with 123978262 relations and 119778057 unique ideals Tue Jan 22 14:55:10 2013 reduce to 123722752 relations and 114982284 ideals in 6 passes Tue Jan 22 14:55:10 2013 max relations containing the same ideal: 136 Tue Jan 22 14:56:06 2013 removing 6560987 relations and 4560987 ideals in 2000000 cliques Tue Jan 22 14:56:10 2013 commencing inmemory singleton removal Tue Jan 22 14:56:19 2013 begin with 117161765 relations and 114982284 unique ideals Tue Jan 22 14:57:27 2013 reduce to 116893302 relations and 110148493 ideals in 6 passes Tue Jan 22 14:57:27 2013 max relations containing the same ideal: 132 Tue Jan 22 14:58:12 2013 removing 6556656 relations and 4556656 ideals in 2000000 cliques Tue Jan 22 14:58:16 2013 commencing inmemory singleton removal Tue Jan 22 14:58:26 2013 begin with 110336646 relations and 110148493 unique ideals Tue Jan 22 14:59:35 2013 reduce to 110043331 relations and 105293451 ideals in 7 passes Tue Jan 22 14:59:35 2013 max relations containing the same ideal: 125 Tue Jan 22 15:00:18 2013 removing 6618713 relations and 4618713 ideals in 2000000 cliques Tue Jan 22 15:00:22 2013 commencing inmemory singleton removal Tue Jan 22 15:00:29 2013 begin with 103424618 relations and 105293451 unique ideals Tue Jan 22 15:01:26 2013 reduce to 103106567 relations and 100351046 ideals in 7 passes Tue Jan 22 15:01:26 2013 max relations containing the same ideal: 120 Tue Jan 22 15:02:15 2013 removing 4488221 relations and 3261291 ideals in 1226930 cliques Tue Jan 22 15:02:17 2013 commencing inmemory singleton removal Tue Jan 22 15:02:24 2013 begin with 98618346 relations and 100351046 unique ideals Tue Jan 22 15:03:13 2013 reduce to 98473321 relations and 96943205 ideals in 5 passes Tue Jan 22 15:03:13 2013 max relations containing the same ideal: 117 Tue Jan 22 15:04:15 2013 relations with 0 large ideals: 50400 Tue Jan 22 15:04:15 2013 relations with 1 large ideals: 10132 Tue Jan 22 15:04:15 2013 relations with 2 large ideals: 133132 Tue Jan 22 15:04:15 2013 relations with 3 large ideals: 995810 Tue Jan 22 15:04:15 2013 relations with 4 large ideals: 4302185 Tue Jan 22 15:04:15 2013 relations with 5 large ideals: 11705716 Tue Jan 22 15:04:15 2013 relations with 6 large ideals: 20984820 Tue Jan 22 15:04:15 2013 relations with 7+ large ideals: 60291126 Tue Jan 22 15:04:15 2013 commencing 2way merge Tue Jan 22 15:05:23 2013 reduce to 71534639 relation sets and 70004523 unique ideals Tue Jan 22 15:05:23 2013 commencing full merge Tue Jan 22 15:24:09 2013 memory use: 8280.4 MB Tue Jan 22 15:24:19 2013 found 38628551 cycles, need 38416723 Tue Jan 22 15:24:39 2013 weight of 38416723 cycles is about 2689414920 (70.01/cycle) Tue Jan 22 15:24:39 2013 distribution of cycle lengths: Tue Jan 22 15:24:39 2013 1 relations: 4501188 Tue Jan 22 15:24:39 2013 2 relations: 5120135 Tue Jan 22 15:24:39 2013 3 relations: 5404626 Tue Jan 22 15:24:39 2013 4 relations: 5022230 Tue Jan 22 15:24:39 2013 5 relations: 4474123 Tue Jan 22 15:24:39 2013 6 relations: 3769913 Tue Jan 22 15:24:39 2013 7 relations: 3034947 Tue Jan 22 15:24:39 2013 8 relations: 2310051 Tue Jan 22 15:24:39 2013 9 relations: 1670837 Tue Jan 22 15:24:39 2013 10+ relations: 3108673 Tue Jan 22 15:24:39 2013 heaviest cycle: 20 relations Tue Jan 22 15:24:53 2013 commencing cycle optimization Tue Jan 22 15:25:43 2013 start with 186046555 relations Tue Jan 22 15:29:51 2013 pruned 5360265 relations Tue Jan 22 15:29:52 2013 memory use: 6028.8 MB Tue Jan 22 15:29:52 2013 distribution of cycle lengths: Tue Jan 22 15:29:52 2013 1 relations: 4501188 Tue Jan 22 15:29:52 2013 2 relations: 5249050 Tue Jan 22 15:29:52 2013 3 relations: 5634416 Tue Jan 22 15:29:52 2013 4 relations: 5175687 Tue Jan 22 15:29:52 2013 5 relations: 4610943 Tue Jan 22 15:29:52 2013 6 relations: 3822774 Tue Jan 22 15:29:52 2013 7 relations: 3034347 Tue Jan 22 15:29:52 2013 8 relations: 2242055 Tue Jan 22 15:29:52 2013 9 relations: 1574214 Tue Jan 22 15:29:52 2013 10+ relations: 2572049 Tue Jan 22 15:29:52 2013 heaviest cycle: 18 relations Tue Jan 22 15:31:08 2013 RelProcTime: 14912 Tue Jan 22 15:31:08 2013 elapsed time 04:08:35[/code] The bad news: as you, I, and everyone else expected, it's huge, and it's going to take about 2 months to solve without MPI. :) [code]Tue Jan 22 16:09:42 2013 building initial matrix Tue Jan 22 16:26:47 2013 memory use: 13546.6 MB Tue Jan 22 16:29:10 2013 read 38416723 cycles Tue Jan 22 16:29:18 2013 matrix is 38416545 x 38416723 (11588.0 MB) with weight 3463583491 (90.16/col) Tue Jan 22 16:29:18 2013 sparse part has weight 2615145298 (68.07/col) Tue Jan 22 16:35:02 2013 filtering completed in 2 passes Tue Jan 22 16:35:11 2013 matrix is 38411484 x 38411662 (11587.7 MB) with weight 3463450692 (90.17/col) Tue Jan 22 16:35:11 2013 sparse part has weight 2615116589 (68.08/col) Tue Jan 22 16:37:29 2013 matrix starts at (0, 0) Tue Jan 22 16:37:36 2013 matrix is 38411484 x 38411662 (11587.7 MB) with weight 3463450692 (90.17/col) Tue Jan 22 16:37:36 2013 sparse part has weight 2615116589 (68.08/col) Tue Jan 22 16:37:36 2013 saving the first 48 matrix rows for later Tue Jan 22 16:37:41 2013 matrix includes 64 packed rows Tue Jan 22 16:37:46 2013 matrix is 38411436 x 38411662 (11154.4 MB) with weight 2753602949 (71.69/col) Tue Jan 22 16:37:46 2013 sparse part has weight 2539934136 (66.12/col) Tue Jan 22 16:37:46 2013 using block size 65536 for processor cache size 20480 kB Tue Jan 22 16:39:57 2013 commencing Lanczos iteration (32 threads) Tue Jan 22 16:39:57 2013 memory use: 18349.5 MB Tue Jan 22 16:43:56 2013 linear algebra at 0.0%, ETA 1600h46m Tue Jan 22 16:45:07 2013 checkpointing every 30000 dimensions[/code] So I guess I'll be checking back here in about 2 months unless I get MPI set up or find a kind soul who has an MPI cluster they can lend for a bit... hehe. 
Well, it [I]is[/I] good news; you ran it as a 32bit project so indeed, 368M unique relations would do. Two months LA is not a death sentence now with orthogonality checks. Do backup the whole dataset somewhere safe once, and then backup the .chk file every few days.

[QUOTE=Batalov;325539]Well, it [I]is[/I] good news; you ran it as a 32bit project so indeed, 368M unique relations would do. Two months LA is not a death sentence now with orthogonality checks. Do backup the whole dataset somewhere safe once, and then backup the .chk file every few days.[/QUOTE]
Managed to improve things a bit: with a bunch more sieving and running the msieve filtering step with 'D 100', I now have this matrix: [code]Wed Jan 23 17:35:02 2013 matrix is 31290418 x 31290595 (12587.7 MB) with weight 3718456345 (118.84/col) Wed Jan 23 17:35:02 2013 sparse part has weight 2955591831 (94.46/col) Wed Jan 23 17:41:36 2013 filtering completed in 2 passes Wed Jan 23 17:41:45 2013 matrix is 31289277 x 31289453 (12587.6 MB) with weight 3718418753 (118.84/col) Wed Jan 23 17:41:45 2013 sparse part has weight 2955581343 (94.46/col) Wed Jan 23 17:44:46 2013 matrix starts at (0, 0) Wed Jan 23 17:44:53 2013 matrix is 31289277 x 31289453 (12587.6 MB) with weight 3718418753 (118.84/col) Wed Jan 23 17:44:53 2013 sparse part has weight 2955581343 (94.46/col) Wed Jan 23 17:44:53 2013 saving the first 48 matrix rows for later Wed Jan 23 17:44:59 2013 matrix includes 64 packed rows Wed Jan 23 17:45:05 2013 matrix is 31289229 x 31289453 (12151.2 MB) with weight 3104613918 (99.22/col) Wed Jan 23 17:45:05 2013 sparse part has weight 2872469612 (91.80/col) Wed Jan 23 17:45:05 2013 using block size 65536 for processor cache size 20480 kB Wed Jan 23 17:47:19 2013 commencing Lanczos iteration (32 threads) Wed Jan 23 17:47:19 2013 memory use: 17684.1 MB Wed Jan 23 17:50:58 2013 linear algebra at 0.0%, ETA 1199h45m Wed Jan 23 17:52:09 2013 checkpointing every 30000 dimensions[/code] So, 50 days' work... not great, but not terrible either. 
When you have a .chk file, save it and you can carefully kill* and try to [B][SIZE="7"]ncr[/SIZE][/B] with different number of threads. (avoid accidentally repeating the last command line from shell history, i.e. nc2. Do ncr.)
On different systems, different t # (numbers of threads) turn out to be best. (Smaller number of threads may run each longer on an atomic portion, but then sync may happen faster. On a 2 x 6core Xeon workstation, I've tried many times and the best # of threads was 8 or 9. And setting affinities to threads only made things worse.) _________ [SIZE=1]*"carefully" involves some seemingly strange rites. Don't kill the job at dimension that is ~ within +1000 of a number divisible by 5000. This is because an orthogonality check too close to a save file will report failure (even though everything is within norm). There's a detailed explanation in the msieve threads, but just follow a rule of thumb: press ^C only when dimension's 4th digit from right (resposible for thousands) is 1,2,3 or 6,7,8.[/SIZE] 
I see that an invisible HWMNBN got a bit emotional and [COLOR=black][FONT=Verdana]beautified[/FONT][/COLOR] my message. Ah, memories, memories... [STRIKE]We all[/STRIKE] HWMNBN apparently has got a few. :razz:

[QUOTE=Batalov;325736]I see that an invisible HWMNBN got a bit emotional and [COLOR=black][FONT=Verdana]beautified[/FONT][/COLOR] my message. Ah, memories, memories... [STRIKE]We all[/STRIKE] HWMNBN apparently has got a few. :razz:[/QUOTE]
Maybe nc[B][SIZE="7"]r[/SIZE][/B] would have been even better. :smile: 
BTW, can you clarify what you meant when you said "you ran it as a 32bit project"? Are you referring to the fact that I used the 32bit gnfslasieve* binaries, or something else?
What would have been different if it had been run as a 64bit project? Would I have needed more or fewer relations, would I have gotten better yields from the lattice sievers, etc.? 
I'm fairly certain he's referring to the S*LPMAX values; they are both 2^32. In the ggnfs notation, they would be written as "lpb*: 32" ("the large prime bound is 2^32"). The relations required is asymptotically pi(lpb) (or equivalently pi(log(slpmax)) ) so if you had used a large(er) prime bound of 2^33 you would have needed more relations, while otoh getting more rels per q due to the easier factoring.
On Windows, the 32 bit binary is slightly faster, while on Linux, there is inline assembly that makes its 64 bit version significantly faster than either 32 bit version. (It seems like in the last few months Brian Gladman has been trying to port the asm to Windows, but I don't know how successful he's been.) 
Correct.
Your reported SALPMAX 4294967296, SRLPMAX 4294967296 were the indication that 2^32 large prime bounds were used. Sometimes referred to as a 32bit project (e.g. NFS @ Home has a whole spectrum from 29 to 33bit, and they, too, [URL="http://escatter11.fullerton.edu/nfs/crunching.php"]use this lingo[/URL]). For 6,373 and for 3,613, we were using 2^33 large prime bounds but ~2 times lower FAMAX/FRMAX of 2^28 (vs your 476900000) and starting from 20000000. I've seen similar settings work well in other hands too, so I didn't improvise too much. I remember some threefour year old trailblazing projects from Greg where he reached the limit, too (only 15e siever was available at the time), so they sieved extra on the other side. It turned out to be quite redundant, but it was the best what could be done for that project at that time. You'd need to page down for quite a while in the Factoring (or possibly in Cunninghams') forum to find notes from that project. So sieving on both sides was deemed to be generally avoided. 
In general, what's the strategy most people take for really large GNFS/SNFS projects? Sieve the same q values (e.g. q = 1e7 through q = 2^311) with different large prime bounds, e.g. 29 <= lpbr <= 33? Something else?
Is this described in any documentation/papers anywhere? Thanks! 
Sieving the same region with different largeprime bounds is a complete waste of time.
There is a limit to the size of jobs that can be efficiently done with the current versions of the current tools; that limit is rather smaller than the size of some jobs than have been done (that is, those jobs were done inefficiently). Fixing the tools to allow larger largeprime bounds is not trivial but is what would need doing; fixing them to allow larger sieving areas is much less trivial but would also be useful. 
[QUOTE=fivemack;330307]Sieving the same region with different largeprime bounds is a complete waste of time.
There is a limit to the size of jobs that can be efficiently done with the current versions of the current tools; that limit is rather smaller than the size of some jobs than have been done (that is, those jobs were done inefficiently). Fixing the tools to allow larger largeprime bounds is not trivial but is what would need doing; fixing them to allow larger sieving areas is much less trivial but would also be useful.[/QUOTE] Increasing the max large primes is a trial change in the siever. You just need to comment out the check. Did you mean the factor base size? I just compiled the siever with the change and sieved with 34 bit large primes. I ran the results through a filtering run in msieve. There were no errored relations. To prove there were 34bit large primes in there, I sieved again with 33 bit large primes and there was less relations. I repeated with a bound of 40 and same results. More relations and no errors. 
The concern is that every extra bit in large primes means you need about twice as many relations in order to converge to a matrix. The current postprocessing code we use has a hard limit of 4G relations, and we're 25% of the way there right now with 33bit LPs.
Msieve can handle up to 48bit large primes, an arbitrary number of them per relation, but cannot handle a dataset with 10^13 relations... 
[QUOTE=henryzz;330343]Increasing the max large primes is a trial change in the siever. You just need to comment out the check.[/quote]
No, I meant max large primes  the MPQS at present only works with 96bit cofactors and you need larger cofactors if you want sensibly to use three larger primes. (though this is a matter for experiment; maybe two 35bit large primes give useful yield) Doing the uniqifying and easysingletonremoval passes before giving the data to msieve gets the size down enough to allow maybe one more bit of large primes. 
[QUOTE=jasonp;330347]The concern is that every extra bit in large primes means you need about twice as many relations in order to converge to a matrix. The current postprocessing code we use has a hard limit of 4G relations, and we're 25% of the way there right now with 33bit LPs.
Msieve can handle up to 48bit large primes, an arbitrary number of them per relation, but cannot handle a dataset with 10^13 relations...[/QUOTE] If we are 25% there now with 33bit, we should be able to hopefully do 35bit. What causes this limit is it extendable? 
For the record, I wasn't asking whether using different large prime bounds was the most efficient way to do large jobs. I was only asking what's *possible*. :)
Currently once you hit q=2^301, that's too high for GGNFS to handle, and you need a different strategy. If your polynomial isn't good enough to get all the needed relations by q=2^301, what strategies are possible to get to the linear algebra stages? Or are factorizations of these large GNFS/SNFS jobs simply impossible at this point with the tools available? 
[QUOTE=henryzz;330357]If we are 25% there now with 33bit, we should be able to hopefully do 35bit.
What causes this limit is it extendable?[/QUOTE] Msieve refers to a relation during the postprocessing by its line number in the save file, and this is currently a 32bit integer. It's no big deal to make it a larger integer, but the assumption of 32bit relation numbers is everywhere in a complex codebase. It's in file formats, hashtables, etc. 
[QUOTE=fivemack;330355]No, I meant max large primes  the MPQS at present only works with 96bit cofactors and you need larger cofactors if you want sensibly to use three larger primes.
(though this is a matter for experiment; maybe two 35bit large primes give useful yield) Doing the uniqifying and easysingletonremoval passes before giving the data to msieve gets the size down enough to allow maybe one more bit of large primes.[/QUOTE] That is in the v4 siever. v5 has a second mpqs routine(mpqs3.c) that looks like it does upto 148 bits. There is also a built in ecm and p1 routine that is used if you provide a parameter file. I worked out the parameter file format but didn't make much headway against getting optimal parameters(I imagine we could improve the speed of numbers we are currently factoring by at least 10% with this if we work out parameters. Potentially we could gain much more). I now see there is a program ecmstat that might help with this. I think the v5 siever can do more than we are asking of it currently. If we run out of prime special Qs for a number we can always sieve some composite ones with the f version of the siever as well(from memory 1/3 of the speed). 
The CADO code also includes a tremendously fast ECM/P+1 implementation for up to 126 bit operands (two 64bit words max) that Alex reports is even faster than the MPQS code in the Franke/Kleinjung sievers.

[QUOTE=jasonp;330401]The CADO code also includes a tremendously fast ECM/P+1 implementation for up to 126 bit operands (two 64bit words max) that Alex reports is even faster than the MPQS code in the Franke/Kleinjung sievers.[/QUOTE]
How does the rest of the siever compare? It shouldn't be that hard to replace the large prime factoring code in the ggnfs siever if necessary. 
[QUOTE=henryzz;330450]How does the rest of the siever compare? It shouldn't be that hard to replace the large prime factoring code in the ggnfs siever if necessary.[/QUOTE]
A worthwhile endeavor!!!!! If NFS@Home has reached the limits of GGNFS, what about trying the CADO tools instead? Let's try to improve capabilities! 
[QUOTE=jasonp;330401]The CADO code also includes a tremendously fast ECM/P+1 implementation for up to 126 bit operands (two 64bit words max) that Alex reports is even faster than the MPQS code in the Franke/Kleinjung sievers.[/QUOTE]
It was a bit faster when I visited EPFL a few years ago. However, I have not worked on that code since, while Thorsten probably worked on his, and I have no idea how they compare with newer lasieve versions. 
If you read above we are working out ways of extending those limits. Correct me if I am wrong but I don't think cado has something capable of replacing msieve for this size job. It looks like above that msieve will soon be our limitation. Putting the cado ecm into ggnfs is purely an easy speedup not an extension of capability.
Eventually I imagine the ggnfs siever will be a problem because of its limitted sieve area but not yet. 
[QUOTE=henryzz;330510]If you read above we are working out ways of extending those limits. Correct me if I am wrong but I don't think cado has something capable of replacing msieve for this size job. [/QUOTE]
Hmm... as I recall it was CADO tools which postprocessed M1061. 
[QUOTE=henryzz;330510] It looks like above that msieve will soon be our limitation. Putting the cado ecm into ggnfs is purely an easy speedup not an extension of capability.
.[/QUOTE] I disagree. It should, in theory, result in an extension of capability. If one keeps the size of the sieve area fixed, but increases the sizes of the factor bases, then the yield per specialq over the fixed sieve region will increase. The problem is: [b]so will the number of false hits[/b]. You will get more cofactors that do not split into primes less than the large prime bounds. It will take extra time to dispose of these false hits. A faster method of splitting the cofactors will alleviate this difficulty. 
M1061 was postprocessed with Msieve.
The CADO postprocessing code is perfectly capable of handling large jobs; it was used to factor RSA704, which generated a matrix almost as large as what Greg needed for M1061. In fact the dataset for RSA704 was undersieved and Msieve couldn't handle the lack of excess relations. Each package has different strengths. Whether the CADO tools can handle worldrecordsize jobs is an open question; I can tell you that Msieve surely cannot. 
[QUOTE=jasonp;330517]
Whether the CADO tools can handle worldrecordsize jobs is an open question;[/QUOTE] This is work in progress. 
[QUOTE=ryanp;325647]So, 50 days' work... not great, but not terrible either.[/QUOTE]
50 days are over  is the job done successfull? :question: 
Almost there!
linear algebra completed 30676793 of 31289453 dimensions (98.0%, ETA 22h29m) 
2801^831 factored
And, here it is. (2801^831)/2800 factors into a 93 digit prime and a 191 digit prime:
[code]Fri Mar 15 09:30:52 2013 lanczos halted after 494801 iterations (dim = 31289229) Fri Mar 15 09:31:36 2013 recovered 34 nontrivial dependencies Fri Mar 15 09:31:36 2013 BLanczosTime: 59328 Fri Mar 15 09:31:36 2013 elapsed time 16:28:51 Fri Mar 15 09:45:42 2013 Fri Mar 15 09:45:42 2013 Fri Mar 15 09:45:42 2013 Msieve v. 1.50 (SVN exported) Fri Mar 15 09:45:42 2013 random seeds: e392a192 464a7410 Fri Mar 15 09:45:42 2013 factoring 4784427753962229503583191777575386925462640502543527013793934480234680863804447852383959785408791045459809147067083157248015897910382151758867576620242257524246139326208569043470479714282260046673050230392057658284742406595942226610043596316622243579005395853667131475327572196568483 (283 digits) Fri Mar 15 09:45:44 2013 searching for 15digit factors Fri Mar 15 09:45:45 2013 commencing number field sieve (283digit input) Fri Mar 15 09:45:45 2013 R0: 1829715316371090533839726975772594414416841479201 Fri Mar 15 09:45:45 2013 R1: 1 Fri Mar 15 09:45:45 2013 A0: 2801 Fri Mar 15 09:45:45 2013 A1: 0 Fri Mar 15 09:45:45 2013 A2: 0 Fri Mar 15 09:45:45 2013 A3: 0 Fri Mar 15 09:45:45 2013 A4: 0 Fri Mar 15 09:45:45 2013 A5: 0 Fri Mar 15 09:45:45 2013 A6: 1 Fri Mar 15 09:45:45 2013 skew 3.75, size 4.527e14, alpha 1.144, combined = 1.075e14 rroots = 2 Fri Mar 15 09:45:45 2013 Fri Mar 15 09:45:45 2013 commencing square root phase Fri Mar 15 09:45:45 2013 reading relations for dependency 1 Fri Mar 15 09:46:27 2013 read 15644347 cycles Fri Mar 15 09:46:53 2013 cycles contain 43719780 unique relations Fri Mar 15 10:00:12 2013 read 43719780 relations Fri Mar 15 10:06:03 2013 multiplying 43719780 relations Fri Mar 15 11:00:53 2013 multiply complete, coefficients have about 1248.28 million bits Fri Mar 15 11:00:58 2013 initial square root is modulo 397633 Fri Mar 15 12:40:46 2013 sqrtTime: 10501 Fri Mar 15 12:40:46 2013 prp93 factor: 320275002207928618516974240639287722386404036901633806415927872193124566002965261773254007319 Fri Mar 15 12:40:46 2013 prp191 factor: 14938498855605621302344771462931389549628225945089317981474820437041872729327482659006891215676069205302760273883150980375684105951144826837014488372171976593391874894148865313908766630812757 Fri Mar 15 12:40:46 2013 elapsed time 02:55:04[/code] 
Congrats! Nice sizeable job.
Wouldn't you want to get your name on the board with A.Cunnigham, The Cabal, A.K. Lenstra, Wagstaff, Dodson, Leyland, and many others? You could reserve [URL="http://mersenneforum.org/showthread.php?p=330015#post330015"]quite a few numbers[/URL] there with your throughput. (Especially those that are marked with MWanted?) 
Welcome to the big leagues ryanp :)

Congratulation! :bounce wave:

OK, help needed, part 2. :)
I'm working on the remaining 247digit cofactor of np_135 = 135^135 + 136^136, [url]http://factordb.com/index.php?id=1100000000295975717[/url]. So far I've sieved over a billion input relations, which leads to about 740M unique relations in filtering: [code]Sun Mar 24 08:28:31 2013 commencing relation filtering Sun Mar 24 08:28:31 2013 estimated available RAM is 64442.4 MB Sun Mar 24 08:28:31 2013 commencing duplicate removal, pass 1 Sun Mar 24 10:54:01 2013 error 9 reading relation 609926392 Sun Mar 24 10:54:36 2013 error 15 reading relation 613889282 Sun Mar 24 12:46:39 2013 skipped 985917 relations with b > 2^32 Sun Mar 24 12:46:39 2013 found 334458904 hash collisions in 1090446680 relations Sun Mar 24 12:47:39 2013 added 348416 free relations Sun Mar 24 12:47:39 2013 commencing duplicate removal, pass 2 Sun Mar 24 13:10:52 2013 found 347797003 duplicates and 742998093 unique relations Sun Mar 24 13:10:52 2013 memory use: 8780.0 MB Sun Mar 24 13:10:53 2013 reading ideals above 917307392 Sun Mar 24 13:10:53 2013 commencing singleton removal, initial pass[/code] I'm able to generate a matrix with "D 130" (target density; I think the flag was renamed after msieve 1.50). But once it actually gets to linear algebra: [code]Mon Mar 25 00:08:23 2013 commencing linear algebra Mon Mar 25 00:08:56 2013 read 47704177 cycles Mon Mar 25 00:10:49 2013 cycles contain 153106684 unique relations Mon Mar 25 01:03:03 2013 read 153106684 relations Mon Mar 25 01:09:43 2013 using 20 quadratic characters above 4294917296 Mon Mar 25 01:21:41 2013 building initial matrix Mon Mar 25 02:00:59 2013 memory use: 20663.6 MB Mon Mar 25 02:07:59 2013 read 47704177 cycles Mon Mar 25 02:08:08 2013 matrix is 47712511 x 47704177 (22480.6 MB) with weight 6449966745 (135.21/col) Mon Mar 25 02:08:08 2013 sparse part has weight 5368409052 (112.54/col) Mon Mar 25 02:26:05 2013 filtering completed in 2 passes Mon Mar 25 02:26:17 2013 matrix is 47703563 x 47698304 (22478.7 MB) with weight 6449395982 (135.21/col) Mon Mar 25 02:26:17 2013 sparse part has weight 5367974708 (112.54/col) Mon Mar 25 02:41:36 2013 matrix starts at (0, 0) Mon Mar 25 02:41:45 2013 matrix is 47703563 x 47698304 (22478.7 MB) with weight 6449395982 (135.21/col) Mon Mar 25 02:41:45 2013 sparse part has weight 5367974708 (112.54/col) Mon Mar 25 02:41:45 2013 matrix needs more columns than rows; try adding 23% more relations[/code] Is further sieving really required here? Should I try lowering the target density? (That seems like it will produce a matrix that will take even longer to solve, though). Or something else? 
Lowering the target density a little should do it.

Lowering density may help, but sieving some more will help more.
A 47M matrix shows that you are at the "cusp of convergence". :rolleyes: 
Can you post the full log from the filtering and LA? This is exactly what happened when Paul Zimmermann used Msieve on a dataset for RSA704, that was slightly undersieved. My concern is there's a bug in the code somewhere that is making the model of the matrix used in the filtering diverge from the actual underlying relations. The LA counts the rows and columns of the matrix exactly, starting only from the original relations, and it's suspicious that even the initial result of the matrix build clearly does not have any excess columns.

[QUOTE=jasonp;334926]Can you post the full log from the filtering and LA? This is exactly what happened when Paul Zimmermann used Msieve on a dataset for RSA704, that was slightly undersieved. My concern is there's a bug in the code somewhere that is making the model of the matrix used in the filtering diverge from the actual underlying relations. The LA counts the rows and columns of the matrix exactly, starting only from the original relations, and it's suspicious that even the initial result of the matrix build clearly does not have any excess columns.[/QUOTE]
Seems like it's too big to fit on a post here, so I've pasted it here: [url]http://pastebin.com/SaZwK3iq[/url] 
[QUOTE=ryanp;334898]OK, help needed, part 2. :)
I'm working on the remaining 247digit cofactor of np_135 = 135^135 + 136^136, [url]http://factordb.com/index.php?id=1100000000295975717[/url]. So far I've sieved over a billion input relations, which leads to about 740M unique relations in filtering: [code]Sun Mar 24 08:28:31 2013 commencing relation filtering Sun Mar 24 08:28:31 2013 estimated available RAM is 64442.4 MB Sun Mar 24 08:28:31 2013 commencing duplicate removal, pass 1 Sun Mar 24 10:54:01 2013 error 9 reading relation 609926392 Sun Mar 24 10:54:36 2013 error 15 reading relation 613889282 Sun Mar 24 12:46:39 2013 skipped 985917 relations with b > 2^32 Sun Mar 24 12:46:39 2013 found 334458904 hash collisions in 1090446680 relations Sun Mar 24 12:47:39 2013 added 348416 free relations Sun Mar 24 12:47:39 2013 commencing duplicate removal, pass 2 Sun Mar 24 13:10:52 2013 found 347797003 duplicates and 742998093 unique relations Sun Mar 24 13:10:52 2013 memory use: 8780.0 MB Sun Mar 24 13:10:53 2013 reading ideals above 917307392 Sun Mar 24 13:10:53 2013 commencing singleton removal, initial pass[/code] I'm able to generate a matrix with "D 130" (target density; I think the flag was renamed after msieve 1.50). But once it actually gets to linear algebra: [code]Mon Mar 25 00:08:23 2013 commencing linear algebra Mon Mar 25 00:08:56 2013 read 47704177 cycles Mon Mar 25 00:10:49 2013 cycles contain 153106684 unique relations Mon Mar 25 01:03:03 2013 read 153106684 relations Mon Mar 25 01:09:43 2013 using 20 quadratic characters above 4294917296 Mon Mar 25 01:21:41 2013 building initial matrix Mon Mar 25 02:00:59 2013 memory use: 20663.6 MB Mon Mar 25 02:07:59 2013 read 47704177 cycles Mon Mar 25 02:08:08 2013 matrix is 47712511 x 47704177 (22480.6 MB) with weight 6449966745 (135.21/col) Mon Mar 25 02:08:08 2013 sparse part has weight 5368409052 (112.54/col) Mon Mar 25 02:26:05 2013 filtering completed in 2 passes Mon Mar 25 02:26:17 2013 matrix is 47703563 x 47698304 (22478.7 MB) with weight 6449395982 (135.21/col) Mon Mar 25 02:26:17 2013 sparse part has weight 5367974708 (112.54/col) Mon Mar 25 02:41:36 2013 matrix starts at (0, 0) Mon Mar 25 02:41:45 2013 matrix is 47703563 x 47698304 (22478.7 MB) with weight 6449395982 (135.21/col) Mon Mar 25 02:41:45 2013 sparse part has weight 5367974708 (112.54/col) Mon Mar 25 02:41:45 2013 matrix needs more columns than rows; try adding 23% more relations[/code] Is further sieving really required here? Should I try lowering the target density? (That seems like it will produce a matrix that will take even longer to solve, though). Or something else?[/QUOTE] A theoretical tidbit for large NFS jobs: A calculus of variations analysis (similar to the one in my Optimal Paramterization of SNFS paper) performed on GNFS reveals the following. If one wants to optimize the SIZE of the sieving regions as a function of the special q, then the size should be proportional to the probability of finding a smooth relation with that particular q. This means that since smaller q's have higher yields that one should use much LARGER sieve regions for the smaller q, then gradually reduce the size of the region as q increases. This in intuitively obvious: one should want to do the most sieving when the probability of success is highest. How large should the first region be? It depends on the size of the factor base, and the number and size of the large primes that get accepted in relations. A question of how quickly one should shrink the sieve area gets a similar response. In practice, one does trial sieving to answer these questions. Note that using this strategy is problematic in practice for the Franke/ Kleinjung approach for building the subsieve region. Their construction requires that the size be a power of two (for efficiency regions). Making it a nonpoweroftwo would kill any gain one might get from the optimal sizing strategy that I suggest above. My siever, on the other hand can use an arbitrarily sized sieve region with no loss of efficiency. I have tried the suggested strategy and it does reduce total sieving time for numbers in the range that I can do. 
[QUOTE=Batalov;334922]Lowering density may help, but sieving some more will help more.
A 47M matrix shows that you are at the "cusp of convergence". :rolleyes:[/QUOTE] FYI, even lowering to D 100 produces the same result: [code]Tue Mar 26 02:08:53 2013 commencing linear algebra Tue Mar 26 02:09:11 2013 read 50682177 cycles Tue Mar 26 02:10:51 2013 cycles contain 153108686 unique relations Tue Mar 26 02:54:57 2013 read 153108686 relations Tue Mar 26 02:59:39 2013 using 20 quadratic characters above 4294917296 Tue Mar 26 03:11:41 2013 building initial matrix Tue Mar 26 03:48:02 2013 memory use: 20613.6 MB Tue Mar 26 03:53:34 2013 read 50682177 cycles Tue Mar 26 03:53:45 2013 matrix is 50690200 x 50682177 (20516.5 MB) with weight 5915593165 (116.72/col) Tue Mar 26 03:53:45 2013 sparse part has weight 4820773930 (95.12/col) Tue Mar 26 04:06:31 2013 filtering completed in 2 passes Tue Mar 26 04:06:46 2013 matrix is 50678722 x 50673715 (20514.6 MB) with weight 5915011768 (116.73/col) Tue Mar 26 04:06:46 2013 sparse part has weight 4820374768 (95.13/col) Tue Mar 26 04:17:50 2013 matrix starts at (0, 0) Tue Mar 26 04:18:02 2013 matrix is 50678722 x 50673715 (20514.6 MB) with weight 5915011768 (116.73/col) Tue Mar 26 04:18:02 2013 sparse part has weight 4820374768 (95.13/col) Tue Mar 26 04:18:02 2013 matrix needs more columns than rows; try adding 23% more relations[/code] I guess I'll try sieving more. But I'm worried that I've hit a bug somewhere... 
Jason may be on to something with his post. The filtering log looks as if it should converge much better than it did; the initial excess looked quite nice...
This is a degree 7 project. Could degree 7 throw off something in the filtering process? 
I doubt the filtering cares about degree 7.
Ryanp, could you apply the following patch to the latest SVN and see if the filtering produces a matrix that works? This is a workaround for a known limit in the maximum size of an array used internally in the filtering; it will make one of the filtering steps slightly less efficient but that's better than making it fail silently and randomly. [code]  common/filter/clique.c (revision 871) +++ common/filter/clique.c (working copy) @@ 328,6 +328,13 @@ curr_relation = relation_array; for (i = 0; i < num_relations; i++) { + uint64 relation_array_word = + ((uint32 *)curr_relation  + (uint32 *)relation_array); + + if (relation_array_word > (uint32)(1)) + break; + /* for each ideal in the relation */ for (j = 0; j < curr_relation>ideal_count; j++) { @@ 347,8 +354,7 @@ sizeof(ideal_relation_t)); } reverse_array[num_reverse].relation_array_word =  (uint32)((uint32 *)curr_relation   (uint32 *)relation_array); + (uint32)relation_array_word; reverse_array[num_reverse].next = ideal_map[ideal].payload; ideal_map[ideal].payload = num_reverse++; [/code] 
So I'm trying np_135 with a degree6 poly. I have 1 billion *unique* input relations (after some prefiltering with remdups4: [url]https://github.com/radii/ggnfs/blob/master/contrib/remdups/remdups4.c[/url]).
Now, the results are even stranger. msieve in the nc1 stage seems to be progressing fine, but at the "commencing full merge" step it just spontaneously quits: [code]Sat Apr 6 14:46:43 2013 found 488933189 duplicates and 1045319666 unique relations Sat Apr 6 14:46:43 2013 memory use: 8780.0 MB Sat Apr 6 14:46:45 2013 reading ideals above 1180368896 Sat Apr 6 14:46:45 2013 commencing singleton removal, initial pass Sat Apr 6 20:39:22 2013 memory use: 11024.0 MB Sat Apr 6 20:39:23 2013 reading all ideals from disk Sat Apr 6 20:45:46 2013 memory use: 17418.8 MB Sat Apr 6 20:46:44 2013 commencing inmemory singleton removal Sat Apr 6 20:47:44 2013 begin with 1045319666 relations and 606983981 unique ideals Sat Apr 6 20:58:04 2013 reduce to 897226224 relations and 452105348 ideals in 9 passes Sat Apr 6 20:58:04 2013 max relations containing the same ideal: 43 Sat Apr 6 20:59:14 2013 reading ideals above 720000 Sat Apr 6 20:59:16 2013 commencing singleton removal, initial pass Sun Apr 7 04:57:11 2013 memory use: 11024.0 MB Sun Apr 7 04:57:14 2013 reading all ideals from disk Sun Apr 7 05:09:45 2013 memory use: 41279.9 MB Sun Apr 7 05:13:55 2013 keeping 565821661 ideals with weight <= 200, target excess is 5301508 Sun Apr 7 05:18:12 2013 commencing inmemory singleton removal Sun Apr 7 05:20:34 2013 begin with 897226224 relations and 565821661 unique ideals Sun Apr 7 05:38:04 2013 reduce to 897145674 relations and 565741109 ideals in 7 passes Sun Apr 7 05:38:04 2013 max relations containing the same ideal: 200 Sun Apr 7 05:46:41 2013 removing 7725870 relations and 6066094 ideals in 2000000 cliques Sun Apr 7 05:46:59 2013 commencing inmemory singleton removal Sun Apr 7 05:49:16 2013 begin with 889419804 relations and 565741109 unique ideals Sun Apr 7 06:10:48 2013 reduce to 886565127 relations and 562866789 ideals in 9 passes ...<snip>... Sun Apr 7 22:40:28 2013 reduce to 181718136 relations and 169856367 ideals in 6 passes Sun Apr 7 22:40:28 2013 max relations containing the same ideal: 75 Sun Apr 7 22:42:23 2013 removing 6493084 relations and 4493084 ideals in 2000000 cliques Sun Apr 7 22:42:28 2013 commencing inmemory singleton removal Sun Apr 7 22:42:48 2013 begin with 175225052 relations and 169856367 unique ideals Sun Apr 7 22:44:58 2013 reduce to 175037719 relations and 165173835 ideals in 6 passes Sun Apr 7 22:44:58 2013 max relations containing the same ideal: 71 Sun Apr 7 22:46:49 2013 removing 6527275 relations and 4527275 ideals in 2000000 cliques Sun Apr 7 22:46:54 2013 commencing inmemory singleton removal Sun Apr 7 22:47:14 2013 begin with 168510444 relations and 165173835 unique ideals Sun Apr 7 22:49:23 2013 reduce to 168303311 relations and 160436917 ideals in 6 passes Sun Apr 7 22:49:23 2013 max relations containing the same ideal: 70 Sun Apr 7 22:51:09 2013 removing 5851606 relations and 4134962 ideals in 1716644 cliques Sun Apr 7 22:51:13 2013 commencing inmemory singleton removal Sun Apr 7 22:51:33 2013 begin with 162451705 relations and 160436917 unique ideals Sun Apr 7 22:53:37 2013 reduce to 162293013 relations and 156141699 ideals in 6 passes Sun Apr 7 22:53:37 2013 max relations containing the same ideal: 69 Sun Apr 7 22:55:44 2013 relations with 0 large ideals: 0 Sun Apr 7 22:55:44 2013 relations with 1 large ideals: 173655 Sun Apr 7 22:55:44 2013 relations with 2 large ideals: 1978706 Sun Apr 7 22:55:44 2013 relations with 3 large ideals: 9536170 Sun Apr 7 22:55:44 2013 relations with 4 large ideals: 25297029 Sun Apr 7 22:55:44 2013 relations with 5 large ideals: 40738077 Sun Apr 7 22:55:44 2013 relations with 6 large ideals: 41638132 Sun Apr 7 22:55:44 2013 relations with 7+ large ideals: 42931244 Sun Apr 7 22:55:44 2013 commencing 2way merge Sun Apr 7 22:58:09 2013 reduce to 117355986 relation sets and 111204670 unique ideals Sun Apr 7 22:58:09 2013 commencing full merge[/code] I'm not sure what to do at this point. Is this another bug? Is msieve able to handle jobs this big (1B unique relations, target density 120)? 
Did this run have the source patched to include the changes two posts back?
You should have plenty of relations (even your first run should have at least had enough to build a matrix) 
[QUOTE=jasonp;336459]Did this run have the source patched to include the changes two posts back?
You should have plenty of relations (even your first run should have at least had enough to build a matrix)[/QUOTE] Yes, I'm running an msieve build from SVN (revision 879) with the clique.c patch from several posts back. It still just seems to choke during the filtering stage and exits with no output, despite the fact that I have tons of input relations... Here's another example: [url]http://www.factordb.com/index.php?id=1100000000516059656[/url] with 900M input relations and target_density=120. [code]Sat Apr 20 13:55:54 2013 max relations containing the same ideal: 113 Sat Apr 20 13:58:34 2013 relations with 0 large ideals: 92549 Sat Apr 20 13:58:34 2013 relations with 1 large ideals: 37921 Sat Apr 20 13:58:34 2013 relations with 2 large ideals: 410319 Sat Apr 20 13:58:34 2013 relations with 3 large ideals: 2680359 Sat Apr 20 13:58:34 2013 relations with 4 large ideals: 10361767 Sat Apr 20 13:58:34 2013 relations with 5 large ideals: 25184023 Sat Apr 20 13:58:34 2013 relations with 6 large ideals: 39791171 Sat Apr 20 13:58:34 2013 relations with 7+ large ideals: 86618563 Sat Apr 20 13:58:34 2013 commencing 2way merge Sat Apr 20 14:01:40 2013 reduce to 105502139 relation sets and 102713290 unique ideals Sat Apr 20 14:01:40 2013 commencing full merge[/code] Maybe I just have too few relations for this target density? But it's strange that msieve just quits during the full merge without any sort of useful output. 
Just recently, an msieve binary that was built with pgcc (instead of gcc) had similar problems of sudden death. Was your build a gcc build?
Another thing could be the platform/arch mismatch (in the SVN Makefile, there's a rigid line OPT_FLAGS = O3 fomitframepointer march=core2 \) Would you try march=native or a specific value? 
If this is in linux, could it be the OOMkiller?

So this doesn't seem to be an OOM or architecture issue, as far as I can tell. Running some of my recent jobs with target_density=90 (again, msieve built from revision 879 in SVN) seems to be working.
With target_density=120, it will just exit silently without writing anything to the log. Is this a bug, or do I just need to sieve for more relations? 
To be clear, the filtering completes with target density 90 but then the LA destroys the result on startup?

No, what seems to be happening is:
target_density=90 works fine, produces a valid matrix, which can then be solved successfully in nc2 and nc3. target_density=120 just exits silently in the full merge (if not earlier). This is what seems to be happening. I need to do a few more test runs to confirm it (and with projects this large each one takes a few days...) 
What OS is it running under? If it's Linux or some other UNIX variant check for:
core files in the working directory, messages in /var/log/messages (or anywhere else syslogd writes to), sar reports of memory use. Also what CPUs has it got (on Linux /proc/cpuinfo tells you)? How much memory has it got (see /proc/meminfo)? Chris 
Tried this job on another system with the default target_density. Here's the output:
[code]found 188589196 hash collisions in 1045319671 relations commencing duplicate removal, pass 2 found 5 duplicates and 1045319666 unique relations memory use: 8268.0 MB reading ideals above 619380736 commencing singleton removal, initial pass failed to reallocate 1895825408 bytes[/code] And, earlier: [code]Sun Apr 28 09:28:42 2013 estimated available RAM is 32177.6 MB[/code] This is an 8core Opteron 2384 system (2.7 GHz) and /proc/meminfo confirms msieve's estimate of available RAM. [code]MemTotal: 32949884 kB[/code] 
The first singleton removal pass has the worst memory use of all the filtering steps, though it's odd that such a small reallocation failed on a machine with so much memory.

[QUOTE=jasonp;338750]The first singleton removal pass has the worst memory use of all the filtering steps, though it's odd that such a small reallocation failed on a machine with so much memory.[/QUOTE]
Probably caused by memory fragmentation and lack of a garbage collector. Realloc requires contiguous memory. 
[QUOTE=R.D. Silverman;338764]Probably caused by memory fragmentation and lack of a garbage collector.
Realloc requires contiguous memory.[/QUOTE]Yup, BTDT. 32bit Windoze was notorious for it. 
Is this the latest SVN? Paul Zimmermann found a bug in v1.50 that caused crashes for memory allocations that crossed a 2GB boundary...

All times are UTC. The time now is 08:21. 
Powered by vBulletin® Version 3.8.11
Copyright ©2000  2020, Jelsoft Enterprises Ltd.