mersenneforum.org (https://www.mersenneforum.org/index.php)

 EdH 2020-04-14 12:44

After too many restarts to kick the server into issuing WUs (I really thought I was using an earlier version that never did this before), I'm closing in on the target relations.

@Curtis, What was the suggested density for my msieve matrix?

 VBCurtis 2020-04-14 16:22

For C175ish, and your ratio of sieving to matrix resources, I'd go for 116 or 120.

For a normal person (say, using one or two machines), I'd go for 105-110.

 VBCurtis 2020-04-14 16:34

[QUOTE=charybdis;542616]
Yes, even if I=15 turns out to be faster I guess 32/32 will give a better data point. I suppose the double large prime bounds should go up too?
[/QUOTE]

Yep! Mfb should be 60 on both sides; lambda0 and lambda1 are floating-point controls of mfb, in effect: multiply lambda by LP and you get the effective mfb that CADO is using. I round that up to choose mfb, but the choice isn't doing much, because lambda is the tighter control.

1.85 * 32 = 59.2, so I used 60. With that lambda setting, only cofactors that split into one factor smaller than 27.2 bits (and thus the other bigger than 32 bits) are wasted effort. 2^27.2 is about 154M, so with lim set to that value there is no wasted cofactorization effort: every split yields a relation.

This also shows how I compute lambda: if lim1 is raised to 180M, that's 27.43 bits. 27.43/32 is 0.857, so lambda1 should be raised to at least 1.86, since no cofactor can split off a prime smaller than lim. Adding 0.01 raises yield and doesn't seem to require many more relations, so perhaps try 1.87 for lambda1, and 1.85 or 1.855 for lambda0 if you use 32/32 and 140/180M for the lims.
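The arithmetic above can be sketched in a few lines of Python; `min_lambda` is a hypothetical helper for illustration, not a CADO-NFS function:

```python
import math

def min_lambda(lim, lp_bits):
    """Smallest lambda such that no surviving cofactor can split off a
    prime below lim (hypothetical helper, not part of CADO-NFS)."""
    return 1 + math.log2(lim) / lp_bits

# VBCurtis's example: lim1 raised to 180M with 32-bit large primes.
lam = min_lambda(180_000_000, 32)
print(round(lam, 3))  # 1.857 -> round up to 1.86, then add ~0.01 for yield
```

With lim at 154M the same formula gives about 1.85, matching the lambda0 suggestion for a 140M lim side being close behind.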

 charybdis 2020-04-14 17:01

Done a few more filtering runs for the first c177.
250M relations gave:
[code]matrix is 13947683 x 13947907 (5100.4 MB) with weight 1347451665 (96.61/col)
[/code]
245M gave:
[code]matrix is 14494203 x 14494428 (5303.1 MB) with weight 1399857738 (96.58/col)
[/code]
240M wasn't enough to build a matrix. These were all at target density 90; higher would presumably be better but we're probably talking savings of less than a day (the 14.5M matrix had an ETA of ~81h).

What's optimal will depend heavily on the individual setup, and on whether the aim is to factor a single c17x as quickly as possible (in which case more relations buys a faster matrix) or to factor many in succession, since then the matrix can be left running while the next number sieves.

[QUOTE=VBCurtis;542646]Yep! Mfb should be 60 on both sides; lambda0 and lambda1 are floating-point controls of mfb, in effect: multiply lambda by LP and you get the effective mfb that CADO is using. I round that up to choose mfb, but the choice isn't doing much, because lambda is the tighter control.

1.85 * 32 = 59.2, so I used 60. With that lambda setting, only cofactors that split into one factor smaller than 27.2 bits (and thus the other bigger than 32 bits) are wasted effort. 2^27.2 is about 154M, so with lim set to that value there is no wasted cofactorization effort: every split yields a relation.

This also shows how I compute lambda: if lim1 is raised to 180M, that's 27.43 bits. 27.43/32 is 0.857, so lambda1 should be raised to at least 1.86, since no cofactor can split off a prime smaller than lim. Adding 0.01 raises yield and doesn't seem to require many more relations, so perhaps try 1.87 for lambda1, and 1.85 or 1.855 for lambda0 if you use 32/32 and 140/180M for the lims.[/QUOTE]

Thanks again! I've started the next c177.

 EdH 2020-04-16 14:06

@Curtis:

Sieving is complete and I am now trying to run msieve for the LA. Here is a portion of the log:
[code]
PID15335 2020-04-16 09:03. . .
Debug:Lattice Sieving: stderr is: b"# redoing q=93240031, rho=4478888 because 1s buckets are full\n# Fullest level-1s bucket #1090, wrote 3135/3072\n# Average J=15601 for 558 special-q's, max bucket fill -bkmult 1,1s:1.07153\n# Discarded 0 special-q's out of 558 pushed\n# Wasted cpu time due to 1 bkmult adjustments: 8.39\n# Total cpu time 8830.48s [norm 7.92+19.5, sieving 8481.8 (4614.5 + 259.7 + 3607.7), factor 321.2 (321.0 + 0.2)] (not incl wasted time)\n# Total elapsed time 1299.00s, per special-q 2.32796s, per relation 0.0650575s\n# PeakMemusage (MB) = 3094 \n# Total 19967 reports [0.442s/r, 35.8r/sq] in 1.3e+03 elapsed s [679.8% CPU]\n"
Debug:Lattice Sieving: Newly arrived stats: {'stats_avg_J': '15601.0 558', 'stats_total_time': '1299.0', 'stats_total_cpu_time': '8830.48', 'stats_max_bucket_fill': '1,1s:1.07153'}
Debug:Lattice Sieving: Combined stats: {'stats_avg_J': '16023.856554636326 5345903', 'stats_total_time': '21977179.10000008', 'stats_total_cpu_time': '93963886.25000069', 'stats_max_bucket_fill': '1.0,1s:1.416720'}
. . .
Info:Lattice Sieving: Aggregate statistics:
Info:Lattice Sieving: Total number of relations: 270013333
Info:Lattice Sieving: Average J: 16023.9 for 5345903 special-q, max bucket fill -bkmult 1.0,1s:1.416720
Info:Lattice Sieving: Total time: 2.19772e+07s
Info:Filtering - Duplicate Removal, splitting pass: Stopping at duplicates1
[/code]Is there something else of value I should seek for you?

 EdH 2020-04-16 16:04

t_d=120 didn't build:
[code]
Thu Apr 16 10:03:48 2020
Thu Apr 16 10:03:48 2020
Thu Apr 16 10:03:48 2020 Msieve v. 1.54 (SVN 1018)
Thu Apr 16 10:03:48 2020 random seeds: f85ef96f 7298ed39
Thu Apr 16 10:03:48 2020 factoring 76552370139504036674890813564032281493867343366619508594816489005834882856199128873928842970710045044111574726594936894404957063604759585302342441093226844531070349677623657609 (176 digits)
Thu Apr 16 10:03:49 2020 searching for 15-digit factors
Thu Apr 16 10:03:50 2020 commencing number field sieve (176-digit input)
Thu Apr 16 10:03:50 2020 R0: -10749206376460432970317818596117873
Thu Apr 16 10:03:50 2020 R1: 4023609444811856477743
Thu Apr 16 10:03:50 2020 A0: 91389778824609164214454779424151524400880
Thu Apr 16 10:03:50 2020 A1: 16573333756774205759678902993899502
Thu Apr 16 10:03:50 2020 A2: -8753197000583595457254903663
Thu Apr 16 10:03:50 2020 A3: -1186820920867031701728
Thu Apr 16 10:03:50 2020 A4: 77519198521772
Thu Apr 16 10:03:50 2020 A5: 533400
Thu Apr 16 10:03:50 2020 skew 1.00, size 3.822e-17, alpha -6.645, combined = 8.280e-16 rroots = 5
Thu Apr 16 10:03:50 2020
Thu Apr 16 10:03:50 2020 commencing relation filtering
Thu Apr 16 10:03:50 2020 setting target matrix density to 120.0
Thu Apr 16 10:03:50 2020 estimated available RAM is 15926.6 MB
Thu Apr 16 10:03:50 2020 commencing duplicate removal, pass 1
Thu Apr 16 10:03:51 2020 error -1 reading relation 189590
. . .
Thu Apr 16 10:38:29 2020 error -1 reading relation 267865222
Thu Apr 16 10:38:58 2020 found 88770665 hash collisions in 271761268 relations
Thu Apr 16 10:39:31 2020 added 122298 free relations
Thu Apr 16 10:39:31 2020 commencing duplicate removal, pass 2
Thu Apr 16 10:46:02 2020 found 125086447 duplicates and 146797119 unique relations
Thu Apr 16 10:46:02 2020 memory use: 1449.5 MB
Thu Apr 16 10:46:03 2020 reading ideals above 139919360
Thu Apr 16 10:46:03 2020 commencing singleton removal, initial pass
Thu Apr 16 11:01:31 2020 memory use: 3012.0 MB
Thu Apr 16 11:01:31 2020 reading all ideals from disk
Thu Apr 16 11:01:49 2020 memory use: 2357.7 MB
Thu Apr 16 11:01:53 2020 commencing in-memory singleton removal
Thu Apr 16 11:01:57 2020 begin with 146797119 relations and 142617272 unique ideals
Thu Apr 16 11:02:39 2020 reduce to 56506171 relations and 38799310 ideals in 18 passes
Thu Apr 16 11:02:39 2020 max relations containing the same ideal: 21
Thu Apr 16 11:02:42 2020 reading ideals above 720000
Thu Apr 16 11:02:42 2020 commencing singleton removal, initial pass
Thu Apr 16 11:13:13 2020 memory use: 1506.0 MB
Thu Apr 16 11:13:13 2020 reading all ideals from disk
Thu Apr 16 11:13:30 2020 memory use: 2241.3 MB
Thu Apr 16 11:13:36 2020 keeping 54258109 ideals with weight <= 200, target excess is 313347
Thu Apr 16 11:13:42 2020 commencing in-memory singleton removal
Thu Apr 16 11:13:47 2020 begin with 56506171 relations and 54258109 unique ideals
Thu Apr 16 11:14:50 2020 reduce to 56030216 relations and 53781539 ideals in 13 passes
Thu Apr 16 11:14:50 2020 max relations containing the same ideal: 200
Thu Apr 16 11:15:17 2020 removing 3684525 relations and 3284525 ideals in 400000 cliques
Thu Apr 16 11:15:18 2020 commencing in-memory singleton removal
Thu Apr 16 11:15:23 2020 begin with 52345691 relations and 53781539 unique ideals
Thu Apr 16 11:16:08 2020 reduce to 52174610 relations and 50324306 ideals in 10 passes
Thu Apr 16 11:16:08 2020 max relations containing the same ideal: 197
Thu Apr 16 11:16:33 2020 removing 2772195 relations and 2372195 ideals in 400000 cliques
Thu Apr 16 11:16:34 2020 commencing in-memory singleton removal
Thu Apr 16 11:16:38 2020 begin with 49402415 relations and 50324306 unique ideals
Thu Apr 16 11:17:17 2020 reduce to 49291975 relations and 47840809 ideals in 9 passes
Thu Apr 16 11:17:17 2020 max relations containing the same ideal: 190
Thu Apr 16 11:17:40 2020 removing 2488158 relations and 2088158 ideals in 400000 cliques
Thu Apr 16 11:17:41 2020 commencing in-memory singleton removal
Thu Apr 16 11:17:45 2020 begin with 46803817 relations and 47840809 unique ideals
Thu Apr 16 11:18:22 2020 reduce to 46708746 relations and 45656840 ideals in 9 passes
Thu Apr 16 11:18:22 2020 max relations containing the same ideal: 185
Thu Apr 16 11:18:43 2020 removing 2334687 relations and 1934687 ideals in 400000 cliques
Thu Apr 16 11:18:44 2020 commencing in-memory singleton removal
Thu Apr 16 11:18:49 2020 begin with 44374059 relations and 45656840 unique ideals
Thu Apr 16 11:19:19 2020 reduce to 44283467 relations and 43630836 ideals in 8 passes
Thu Apr 16 11:19:19 2020 max relations containing the same ideal: 182
Thu Apr 16 11:19:40 2020 removing 1701806 relations and 1412658 ideals in 289148 cliques
Thu Apr 16 11:19:41 2020 commencing in-memory singleton removal
Thu Apr 16 11:19:44 2020 begin with 42581661 relations and 43630836 unique ideals
Thu Apr 16 11:20:14 2020 reduce to 42532132 relations and 42168363 ideals in 8 passes
Thu Apr 16 11:20:14 2020 max relations containing the same ideal: 176
Thu Apr 16 11:20:41 2020 relations with 0 large ideals: 1038
Thu Apr 16 11:20:41 2020 relations with 1 large ideals: 1550
Thu Apr 16 11:20:41 2020 relations with 2 large ideals: 21593
Thu Apr 16 11:20:41 2020 relations with 3 large ideals: 198057
Thu Apr 16 11:20:41 2020 relations with 4 large ideals: 1072004
Thu Apr 16 11:20:41 2020 relations with 5 large ideals: 3623143
Thu Apr 16 11:20:41 2020 relations with 6 large ideals: 7969344
Thu Apr 16 11:20:41 2020 relations with 7+ large ideals: 29645403
Thu Apr 16 11:20:41 2020 commencing 2-way merge
Thu Apr 16 11:21:10 2020 reduce to 25695897 relation sets and 25332128 unique ideals
Thu Apr 16 11:21:10 2020 commencing full merge
Thu Apr 16 11:40:43 2020 memory use: 1167.7 MB
Thu Apr 16 11:40:44 2020 found 84937 cycles, need 5310047
Thu Apr 16 11:40:44 2020 too few cycles, matrix probably cannot build
[/code]More relations, or a lower density? I'll try both while waiting for your reply.

 VBCurtis 2020-04-16 16:39

The C177 from above had 156M unique relations and built a nice matrix. You have 146M, so your poly happened to generate more duplicate relations. I'd aim for that 155M unique number since it worked well on the C177; your duplicate rate is quite high, so something like 20M more raw relations might get you there. 25M wouldn't be bad.
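A rough sanity check of the 20-25M figure, using the counts from Ed's msieve log above (treating the marginal duplicate rate as unknown is an assumption on my part):

```python
raw = 271_761_268        # relations msieve read, incl. free relations (from Ed's log)
unique = 146_797_119     # unique relations after duplicate removal (from Ed's log)
target_unique = 155_000_000

avg_unique_frac = unique / raw              # ~0.54 unique yield overall
extra_unique = target_unique - unique       # ~8.2M more uniques needed
naive_raw = extra_unique / avg_unique_frac  # ~15M raw at the *average* rate

# The marginal duplicate rate climbs as sieving progresses, so padding the
# naive ~15M estimate up toward 20-25M raw relations is prudent.
print(f"naive extra raw relations: {naive_raw / 1e6:.1f}M")
```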

 VBCurtis 2020-04-16 16:47

[QUOTE=EdH;542856]@Curtis:

Sieving is complete and I am now trying to run msieve for the LA. Here is a portion of the log:
Info:Lattice Sieving: Aggregate statistics:
Info:Lattice Sieving: Total number of relations: 270013333
Info:Lattice Sieving: Average J: 16023.9 for 5345903 special-q, max bucket fill -bkmult 1.0,1s:1.416720
Info:Lattice Sieving: Total time: 2.19772e+07s
Info:Filtering - Duplicate Removal, splitting pass: Stopping at duplicates1
[/code]Is there something else of value I should seek for you?[/QUOTE]

This is great! We see that on your farm, 22 million thread-seconds produced 270M raw relations. If we refine params for 170-180 digit problems in the future, we have that time to compare to. In this case, you need another 8% relations or so, so we can add 10% to time and say 24M thread-seconds of sieving for this C176.
Comparison points, using my own params:
RichD did a C150 in 0.7M core-sec of not-HT i5 (I forget what speed).
Only the C150 params had multiple test/refine cycles; the rest were first-guesses like this C176.
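The extrapolation in the post above works out as follows (a trivial sketch; the time unit is simply whatever the log's "Total time" line reports, which is questioned later in the thread):

```python
raw_relations = 270_013_333  # "Total number of relations" from Ed's log
sieve_time = 2.19772e7       # "Total time" from Ed's log, in seconds

target_raw = raw_relations * 1.08  # "another 8% relations or so"
est_time = sieve_time * 1.10       # pad by 10% for the extra sieving
print(f"target: {target_raw / 1e6:.0f}M relations, "
      f"~{est_time / 1e6:.0f}M s of sieving")  # ~292M relations, ~24M s
```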

 EdH 2020-04-16 17:17

[QUOTE=VBCurtis;542871]The C177 from above had 156M unique relations and built a nice matrix. You have 146M, so your poly happened to generate more duplicate relations. I'd aim for that 155M unique number since it worked well on the C177; your duplicate rate is quite high, so something like 20M more raw relations might get you there. 25M wouldn't be bad.[/QUOTE]
I've told CADO-NFS to run to 300M raw relations, but I can do an msieve test prior to that without disturbing CADO-NFS.

 charybdis 2020-04-16 17:30

Isn't that client-seconds, rather than thread-seconds? Just above it in the log that Ed posted was:
[code]Debug:Lattice Sieving: Combined stats: {'stats_avg_J': '16023.856554636326 5345903', 'stats_total_time': '21977179.10000008', 'stats_total_cpu_time': '93963886.25000069', 'stats_max_bucket_fill': '1.0,1s:1.416720'}[/code]
One of these lines gets printed to the log each time a workunit arrives. From looking at my logs, 'stats_total_time' is client-seconds and 'stats_total_cpu_time' is thread-seconds. The "Total time: 2.19772e+07s" line, from which you're getting 22 million seconds, is using the final 'stats_total_time', which isn't very helpful.

When a CADO job finishes completely (including postprocessing), the time given for the whole factorization appears to use 'stats_total_cpu_time' instead, as it should.

Edit: for comparison, the final such line from my c177 before I ran filtering:
[code]Debug:Lattice Sieving: Combined stats: {'stats_total_cpu_time': '67364457.04999968', 'stats_total_time': '34176880.70000007', 'stats_max_bucket_fill': '1.0,1s:1.081950', 'stats_avg_J': '15365.574654104747 10509757'}[/code]
Ed's c176 seems to have taken significantly longer in CPU-time than the c177, though I'm not sure how much of this is down to the CPUs being used. Is there any adverse effect from running a larger number of threads per client, as Ed seems to be doing?
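The two totals can be read straight off those Debug lines; a minimal Python sketch, assuming the stats dict is printed as a Python literal exactly as in the excerpts above:

```python
import ast

# One "Combined stats" line from Ed's log, verbatim.
line = ("Debug:Lattice Sieving: Combined stats: "
        "{'stats_avg_J': '16023.856554636326 5345903', "
        "'stats_total_time': '21977179.10000008', "
        "'stats_total_cpu_time': '93963886.25000069', "
        "'stats_max_bucket_fill': '1.0,1s:1.416720'}")

# The text after "Combined stats: " is a Python dict literal, so
# ast.literal_eval can parse it safely.
stats = ast.literal_eval(line.split("Combined stats: ", 1)[1])

client_sec = float(stats["stats_total_time"])      # wall time summed over clients
thread_sec = float(stats["stats_total_cpu_time"])  # CPU time summed over threads
print(f"threads per client (avg): {thread_sec / client_sec:.1f}")  # ~4.3
```

The ratio of the two fields gives a rough average thread count per client, which is one way to compare setups like Ed's and mine.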

 VBCurtis 2020-04-16 18:13

Hrmmmmm..... I've been using the on-screen "total sieving time" display that pops up at the end of the sieve phase and matches the end-of-factorization summary printed to screen. I hardly ever delve into the log file.

Ed's timing falls into my curve-fitting of time vs input size from the other jobs I listed, so I am quite confused. I appreciate you pointing out the inconsistency in the log report!

All times are UTC.