mersenneforum.org 3,748+ c204 Smaller-but-Needed

2021-11-16, 16:37   #254
RichD

Sep 2008
Kansas

5²·139 Posts

Nice work on the stats. How can I tell how much I contributed? (I'm sure it is less than 1%.)

2021-11-16, 18:59   #255
charybdis

Apr 2020

11·53 Posts

Quote:
 Originally Posted by kruoli Also, Seth added a graph which shows the WU yield against Q. You can quickly spot where we changed the sieving strategy (at least I thought the first big jump is that, but IIRC we switched the strategy later, didn't we?), where I experimented with A=31 and maybe even the few outliers because of A=32.
Nice work! The jump is from the initial change from strategy 2 to strategy 0. The later change back to strategy 2 had a much smaller effect and is basically impossible to spot.

2021-11-16, 20:15   #256
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

1010000100001₂ Posts

Yay for Seth-stats! Also, nice chart for yield per WU. Boo for learning how little I contributed to the job, but seeing how slow (in rel/sec) my machine sieves is a nice reminder to maybe go Ryzen-shopping in 2022.

What should our target be for matrix size to say "good enough, let's stop sieving and run the matrix"? 55M? 50? The C207 team sieve had a 72M matrix using msieve. Our uniques ratio had gone to hell because we shifted from I=16 to I=15 too early, so sieving more for a smaller matrix wasn't productive.

2021-11-17, 08:20   #257
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

2×5×83 Posts

Quote:
 Originally Posted by RichD How can I tell how much I contributed?
PM'ed you. For the final stats, I will add names of those who were okay with this manually and strip out the clients manually or something like this.

Quote:
 Originally Posted by charybdis Nice work!
Thanks.
Quote:
 Originally Posted by charybdis The jump is from the initial change from strategy 2 to strategy 0. The later change back to strategy 2 had a much smaller effect and is basically impossible to spot.
Ah, this is interesting. I wonder how the graph would have turned out if we never changed to strategy 0.

Quote:
 Originally Posted by VBCurtis What should our target be for matrix size to say "good enough, let's stop sieving and run the matrix"? 55M? 50?
Regarding test-(filtering and building a matrix), I tried to learn how to do this yesterday. Please comment on my approach. I used the README.msieve file, but at least some things there changed.
1. I copy all *.gz files in the upload directory of the work directory to a safe temporary work space (second drive in my case).
2. Create a file list of these (one file per line).
3. Compile convert_poly (this should help me avoid manual errors) and use it to convert the poly into msieve format.
4. Call ./check_rels -poly {CADO poly} -out fixed.gz -lpb0 33 -lpb1 34 -filelist {file list file} and let it run. It is quite slow, at around a quarter of a million relations per second, and only uses one core. Is this optimisable? Then I have msieve-able relations in fixed.gz; gunzip fixed.gz; mv fixed msieve.dat.
5. Remove free relations (How? Does the grep -aEv '^[0-9]+,0$' relations > relations_fixed approach work here, too?) and precede the relations file with N {number to factor}.
6. Call msieve -nc -v {number to be factored} and look out for the matrix size. If too large, Ctrl+C (I just now realised I wrote Strg+C in this thread multiple times…).

2021-11-17, 11:25   #258
charybdis

Apr 2020

11·53 Posts

Quote:
 Originally Posted by kruoli Regarding test-(filtering and building a matrix), I tried to learn how to do this yesterday. Please comment on my approach. I used the README.msieve file, but at least some things there changed. […]
I just run cat *.gz >> [msieve directory path]/msieve.dat.gz in the upload directory and then msieve -nc -v -t [threads] in the msieve directory. If you want to avoid creating a completely new msieve.dat.gz each time you run filtering, you can control the Q ranges of the files you're cat-ing, e.g. cat 3_748plus.4????????-?????????.????????.gz >> ... for the 400M-500M range (or in this case even cat 3_748plus.4*.gz, as there aren't any other ranges starting with a 4). Msieve can read gzipped relations as long as it was compiled without NO_ZLIB=1.
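
For anyone following along, here is a minimal sketch of why plain cat works on gzipped relation files: concatenated gzip members still decompress as one stream. The two tiny files and their hash-like suffixes are invented for the demo, and all.dat.gz / q400.dat.gz are hypothetical output names. Only the 3_748plus prefix and the glob pattern come from the post above.

```shell
#!/bin/sh
# Demo only: toy stand-ins for uploaded relation files, in a scratch directory.
set -e
work=$(mktemp -d)
cd "$work"

# Invented file names that follow the Q-range naming pattern from the post.
printf '1,2:a:b\n' | gzip > 3_748plus.400000000-400010000.aaaaaaaa.gz
printf '3,4:c:d\n' | gzip > 3_748plus.500000000-500010000.bbbbbbbb.gz

# Append every file: the result is a multi-member gzip stream.
cat 3_748plus.*.gz >> all.dat.gz

# Append only the files whose Q range starts with a 4 (the 400M-500M range).
cat 3_748plus.4????????-?????????.????????.gz >> q400.dat.gz

# Decompress and count lines to confirm nothing was lost.
all_lines=$(gzip -dc all.dat.gz | wc -l | tr -d ' ')
q400_lines=$(gzip -dc q400.dat.gz | wc -l | tr -d ' ')
echo "all: $all_lines relations, 400M range: $q400_lines relations"
```

Because >> appends, running the same cat twice adds the same relations twice, so it pays to keep track of which Q ranges have already been appended.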

There is no need for check_rels. Msieve will check the relations itself. You may end up with some bogus bad relations where the joins between the original *.gz files were; don't worry, you aren't losing any real relations.

CADO does not put free relations in the upload directory, so there won't be any for you to remove. Msieve will add them the first time you run filtering.

Finally, make sure to reverse the signs of all the algebraic coefficients (A6 to A0) in the msieve.fb file, because otherwise msieve will complain about the negative leading coefficient when it finally gets to sqrt.
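
The sign flip can be scripted rather than done by hand. What follows is only a sketch under assumptions: it assumes msieve.fb has one "name value" pair per line (N, R0, R1, A0, ...), the coefficient values are made up, and msieve.fb.flipped is a hypothetical output name. The signs are flipped as text so that very large integers are not rounded.

```shell
#!/bin/sh
# Demo only: toy polynomial file in a scratch directory.
set -e
work=$(mktemp -d)
cd "$work"

# Made-up values; a real msieve.fb holds the job's N and polynomial.
cat > msieve.fb <<'EOF'
N 1234567
R0 -100
R1 7
A0 15
A1 -3
A2 0
A3 8
A4 -1
A5 -42
EOF

# Negate only the algebraic coefficients (lines starting with A<digit>).
# A leading "-" is stripped, otherwise one is added; zero is left alone.
awk '/^A[0-9]+ / {
    if ($2 ~ /^-/) {
        $2 = substr($2, 2)
    } else if ($2 != "0") {
        $2 = "-" $2
    }
}
{ print }' msieve.fb > msieve.fb.flipped

cat msieve.fb.flipped
```

The R0 and R1 lines pass through untouched, and the original file is kept so the result can be compared before replacing it.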

Last fiddled with by charybdis on 2021-11-17 at 11:28

2021-11-17, 12:16   #259
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

2×5×83 Posts

Quote:
 Originally Posted by charybdis I just run cat *.gz >> [msieve directory path]/msieve.dat.gz […]
Unfortunately, the command line overflows in this case. I tried find . -name '*.gz' | while read line; do cat $line >> {target path}/msieve.dat.gz; done, but this was really slow (60 MB/s), then I used find . -name '*.gz' | xargs -d'\n' -L 128 cat >> {target path}/msieve.dat.gz and this worked great (> 1 GB/s)!

Edit: Thanks for the suggestions!

Last fiddled with by kruoli on 2021-11-17 at 12:21 Reason: Additions.

2021-11-17, 12:35   #260
charybdis

Apr 2020

11×53 Posts

Quote:
 Originally Posted by kruoli Unfortunately, the command line overflows in this case. […]
Well, I can't say I've ever tried to cat >200k files at once before. Glad you found a good solution.
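
A variant of the same idea, sketched with find's -exec ... {} + instead of xargs: find passes the files to cat in large batches, so the argument list never overflows and cat is not started once per file. This is a swapped-in technique, not the command used above. The directory names and the 50 toy files are invented for the demo.

```shell
#!/bin/sh
# Demo only: 50 tiny gzip files standing in for >200k uploads.
set -e
work=$(mktemp -d)
mkdir "$work/upload" "$work/msieve"

i=1
while [ "$i" -le 50 ]; do
    printf '%d,1:a:b\n' "$i" | gzip > "$work/upload/rel.$i.gz"
    i=$((i + 1))
done

cd "$work/upload"
# The output lives outside the searched directory, so find cannot pick up
# the growing target file and feed it back into cat.
find . -name '*.gz' -exec cat {} + >> "$work/msieve/msieve.dat.gz"

total=$(gzip -dc "$work/msieve/msieve.dat.gz" | wc -l | tr -d ' ')
echo "$total relations concatenated"
```

The files arrive in whatever order find returns them, which should not matter here since filtering does not depend on relation order.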

Of course I should have said "A5 to A0" rather than "A6 to A0" in my previous post. Not that it matters much.

2021-11-17, 13:13   #261
swellman

Jun 2012

2×3×13×43 Posts

Quote:
 Originally Posted by charybdis Finally, make sure to reverse the signs of all the algebraic coefficients (A6 to A0) in the msieve.fb file, because otherwise msieve will complain about the negative leading coefficient when it finally gets to sqrt.
Quote:
 Of course I should have said "A5 to A0" rather than "A6 to A0" in my previous post. Not that it matters much.
I believe this issue has been fixed.

2021-11-17, 13:29   #262
charybdis

Apr 2020

583₁₀ Posts

Quote:
 Originally Posted by swellman I believe this issue has been fixed.
Only in Greg's version of msieve, not the "official" one.

2021-11-17, 15:16   #263
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

47·89 Posts

An alternate I use for the relations:

Code:
for i in *.gz
do
  cat "$i" >> total.gz
done
zcat total.gz | remdups4 1000 >msieve.dat

The remdups4 can be left out. I use it to save Msieve some work and to get an early idea about the uniques.

I also wrote a routine to swap the poly to Msieve format, but I believe that CADO-NFS provides one as well. From the CADO-NFS README.msieve:

Code:
1) Create a file msieve.fb, which contains:

   N
   R0
   R1
   A0
   A1
   A2
   A3
   A4
   A5

   This can be done with:
   $ ./convert_poly -of msieve < cxxx.poly > msieve.fb

Seth's page looks great! I was working on a way to add some of the data to the original page, but, of course this is much better. Thanks, Seth, for a great resource and helping it work for this project!

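
To get a rough feel for the uniques ratio without remdups4, a small stand-in can be sketched with awk. This is not remdups4 and makes assumptions: it treats the text before the first colon (the a,b pair) as the relation's identity, and it keeps every key in memory, so it is only reasonable for small test samples, not the full data set. The sample relations are invented.

```shell
#!/bin/sh
# Demo only: two toy relation files sharing one duplicate relation.
set -e
work=$(mktemp -d)
cd "$work"

printf '5,3:2,7:b\n-8,1:3:d\n' | gzip > part1.gz
printf '5,3:2,7:b\n9,4:5:11\n' | gzip > part2.gz
cat part1.gz part2.gz > total.gz

# Keep the first occurrence of each a,b key; later repeats are dropped.
gzip -dc total.gz | awk -F: '!seen[$1]++' > msieve.dat

raw=$(gzip -dc total.gz | wc -l | tr -d ' ')
uniq_count=$(wc -l < msieve.dat | tr -d ' ')
echo "raw=$raw unique=$uniq_count"
```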
2021-11-17, 15:33   #264
EdH

"Ed Hall"
Dec 2009

47×89 Posts

Quote:
 Originally Posted by RichD Nice work on the stats. How can I tell how much I contributed? (I'm sure it is less than 1%.)
This may not be entirely correct, but you should be able to figure out your specific client(s) with a little bit of observation, by checking the last workunit submitted date/time against the date/time shown on your client's las command line. You still need to account for the delay in refresh of the page, and they may be off a little, but I "seem" to be able to identify my clients in that manner.

Edit: Actually the date/time is in the "Subprocess has PID..." line.

Last fiddled with by EdH on 2021-11-17 at 15:38

All times are UTC.