 Originally Posted by WraithX This is my first time contributing to a distributed sieving project like this, so hopefully everything will go ok. One question though: the file to zip up and then upload is the same file from "-o ", correct? -David C.
Not to take the wind out of fivemack's sails, but... Welcome! I like to see projects gain momentum, so it's nice to see new people contributing.

If you already have ggnfs up and running, everything should go smoothly. And yes, that's the file to upload.

 Originally Posted by WraithX I'd like to reserve 60-61. This is my first time contributing to a distributed sieving project like this, so hopefully everything will go ok. One question though: the file to zip up and then upload is the same file from "-o ", correct? -David C.
Welcome to the project! If you have any trouble, feel free to PM me. Yes, the file to zip up and upload is the one named by -o; it'll be easy to recognize because it's about 100MB long.

 2008-03-19, 01:54 #14 joral     Mar 2008 5·11 Posts If it's still necessary, I'll take the range from 61-63

 2008-03-19, 10:45 #16 Andi47     Oct 2004 Austria 2·17·73 Posts @Fivemack: Just for statistics: How many relations are in the files (39M-40516687) which I uploaded recently? fivemack: 1443960 Last fiddled with by fivemack on 2008-03-19 at 13:33
 2008-03-19, 23:35 #17 FactorEyes     Oct 2006 vomit_frame_pointer 2^3·3^2·5 Posts Sign me up 63-67 please. fivemack: done Last fiddled with by fivemack on 2008-03-20 at 00:34
 2008-03-20, 00:20 #18 WraithX     Mar 2006 486₁₀ Posts While I probably won't finish my 1M range till sometime Friday, I was wondering a couple of things: 1) Is gnfs-lasieve4I14e (or similar) multi-threaded, so that we could run a larger range in many threads to finish sieving faster? If it is not multi-threaded, I guess running multiple copies of the sieve will be sufficient? 2) Is it recommended to upload results in 1M chunks? Or, if a larger range was reserved, could all relations be uploaded in one file?
 Originally Posted by WraithX While I probably won't finish my 1M range till sometime Friday, I was wondering a couple of things: 1) Is gnfs-lasieve4I14e (or similar) multi-threaded, so that we could run a larger range in many threads to finish sieving faster? If it is not multi-threaded, I guess running multiple copies of the sieve will be sufficient? 2) Is it recommended to upload results in 1M chunks? Or, if a larger range was reserved, could all relations be uploaded in one file?
1) gnfs-lasieve4I14e is not multi-threaded, but it seems to scale pretty well if you run as many copies as you have CPUs. Feel free to break up the range as you like (e.g. -f 61375000 -c 125000) and to concatenate the output files; the order of relations within a file is pretty immaterial.

2) Both chiark, on which I'm staging the results, and the workstation on which I process them have enough disc and enough network connectivity to be able to handle results in chunks of whatever size you prefer (I've handled chunks of 2-3GB). I'll admit to a vague preference for the largest chunks possible, if only because

Code:
00:37:02 fivemack@kolmogorov:/home/nfsworld/P1188/relations$ ls
25.0-26.0.bz2  28.0-28.5.bz2   33.0-34.0b.bz2  38.0-39.0a.bz2  44.5-45.0.bz2
26.0-26.1.bz2  28.5-29.0b.bz2  33.0-34.0c.bz2  38.0-39.0b.bz2  45.0-45.5.bz2
26.1-26.2.bz2  28.5-29.0.bz2   34.0-35.0.bz2   39.0-41.0a.bz2  45.5-46.0.bz2
26.2-26.4.bz2  29.0-30.0.bz2   35.0-36.0a.bz2  39.0-41.0b.bz2  46.0-47.0.bz2
26.4-26.6.bz2  30.0-31.0.bz2   35.0-36.0b.bz2  41.0-41.5.bz2   47.0-48.0.bz2
26.6-26.8.bz2  31.0-31.5.bz2   36.0-37.0a.bz2  41.5-42.0.bz2   48.0-49.0.bz2
26.8-27.0.bz2  31.5-32.0.bz2   36.0-37.0b.bz2  42.0-43.0.bz2   49.0-50.0.bz2
27.0-27.5.bz2  32.0-33.0.bz2   37.0-38.0a.bz2  43.0-44.0.bz2   53.0-54.0.bz2
27.5-28.0.bz2  33.0-34.0a.bz2  37.0-38.0b.bz2  44.0-44.5.bz2   54.0-55.0.bz2
is starting to get a little unmanageable already.
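
As a rough illustration of point 1), here is a minimal Python sketch that splits a reserved range into equal pieces, one per sieve copy, and prints the matching -f/-c/-o arguments. Only the -f, -c and -o flags come from this thread; the helper names and the output file naming are made up for the example, and the rest of the siever command line (job file and so on) is deliberately left out.

```python
def split_range(start, end, copies):
    """Split the special-Q range [start, end) into `copies` contiguous
    (first_q, count) pieces whose sizes differ by at most one."""
    base, extra = divmod(end - start, copies)
    pieces = []
    first = start
    for i in range(copies):
        # The first `extra` pieces absorb the remainder, one Q each.
        count = base + (1 if i < extra else 0)
        pieces.append((first, count))
        first += count
    return pieces


def sieve_args(first, count):
    """Format the range arguments for one sieve copy.
    The output file name is a hypothetical convention, not a project rule."""
    return f"-f {first} -c {count} -o rels.{first}-{first + count}.out"


if __name__ == "__main__":
    # Example: the 61M-62M range split across 8 CPUs.
    for first, count in split_range(61_000_000, 62_000_000, 8):
        print(sieve_args(first, count))
```

With eight copies, the fourth piece starts at 61375000 with a count of 125000, which is the example given in point 1). Since the order of relations does not matter, the per-copy output files can simply be concatenated before compressing and uploading.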

I'll be out of touch from Friday morning to Monday night - it's a long long weekend in England - but feel free to make reservations on the thread. I expect to have enough relations by the time we get to 85 million, so there's plenty of space left to reserve in.

Statistics so far

 Originally Posted by fivemack I expect to have enough relations by the time we get to 85 million, so there's plenty of space left to reserve in.
I agree - according to my calculations, we need a range size of approx. 68M (if we want 80M relations), which would be from 15M to 82M.

Statistics, calculations of Q per range, and estimates of the required range and CPU time, based on the relations found so far, are attached in a zipped Excel file. Columns in blue are calculated values.
Attached Files
 GNFS_Statistic_M2376.zip (6.0 KB)

 2008-03-20, 08:07 #21 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 6438₁₀ Posts Thanks for collating that data. This is a large enough range that I think estimation does need some form of correction for the way that the rels/Q value drops off with increasing Q, so I'm not sure your linear model is quite appropriate. I got the 85M figure by noticing that J15 was getting about 2 rels/Q in the 15M-25M range, so there would be at least 20M from that when it's finished, and assuming 1 rel/Q from gnfs-lasieve4I14e in the higher range; from your data, 0.75 rel/Q would be a safer estimate there, and that means the top limit ought to be nearer 95M. Perhaps 100M to account for the increasing cost of duplicates as we collect more relations. In any case oversieving doesn't hurt, and unless the Bordeaux cluster is available we're in no danger of running out of Q range before Tuesday. Last fiddled with by fivemack on 2008-03-20 at 08:07
 Originally Posted by fivemack Thanks for collating that data. This is a large enough range that I think estimation does need some form of correction for the way that the rels/Q value drops off with increasing Q, so I'm not sure your linear model is quite appropriate. I got the 85M figure by noticing that J15 was getting about 2 rels/Q in the 15M-25M range, so there would be at least 20M from that when it's finished, and assuming 1 rel/Q from gnfs-lasieve4I14e in the higher range; from your data, 0.75 rel/Q would be a safer estimate there, and that means the top limit ought to be nearer 95M. Perhaps 100M to account for the increasing cost of duplicates as we collect more relations. In any case oversieving doesn't hurt, and unless the Bordeaux cluster is available we're in no danger of running out of Q range before Tuesday.
You are right that a linear model is inaccurate. But quite a lot of the sieving in the lower range was done with I14e, the mid range was done with the more efficient J15, and yet more will be done with J15 in the higher ranges, so I kept the linear model to get a rough estimate of the MINIMUM Q range required. (I would have taken the decrease in rels/Q into account if only I14e had been used.)
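
For anyone who wants to play with the numbers, the back-of-envelope estimate behind the 85M figure can be sketched in a few lines of Python. This is only a simplified reading of the discussion above: it assumes a target of 80M relations (the figure from the statistics post), about 20M relations from the 15M-25M range, and a constant rels/Q yield above 25M. The function name is made up, and duplicates and the drop-off in yield with increasing Q are ignored.

```python
def top_q_needed(target_rels, rels_so_far, q_from, rels_per_q):
    """Upper Q limit needed to reach `target_rels`, assuming the range
    above `q_from` yields a constant `rels_per_q` relations per Q."""
    remaining = max(0, target_rels - rels_so_far)
    return q_from + remaining / rels_per_q


if __name__ == "__main__":
    # Assumed inputs: 80M target, 20M relations from 15M-25M, 1 rel/Q above 25M.
    limit = top_q_needed(80_000_000, 20_000_000, 25_000_000, 1.0)
    print(f"sieve up to about Q = {limit / 1e6:.0f}M")
```

With a yield of 1 rel/Q this reproduces the 85M limit. A lower assumed yield pushes the limit up, which is why a constant-yield figure like this is best treated as a minimum.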

