mersenneforum.org (https://www.mersenneforum.org/index.php)
-   NFSNET Discussion (https://www.mersenneforum.org/forumdisplay.php?f=17)
-   -   Status of previous two projects (https://www.mersenneforum.org/showthread.php?t=2364)

 dsouza123 2004-04-18 23:17

Status of previous two projects

What is the status of 2_811M and 10_223P ?

Filtering ? Linear algebra ?

 xilman 2004-04-19 08:59

[QUOTE=dsouza123]What is the status of 2_811M and 10_223P ?

Filtering ? Linear algebra ?[/QUOTE]
Both are currently in the linear algebra stage.

2_811M has a [i]very[/i] hard matrix. It has 14.56 million rows and columns!

Paul

 akruppa 2004-04-19 09:36

[QUOTE]2_811M has a very hard matrix. It has 14.56 million rows and columns![/QUOTE]

:shock: OMG. How much memory does it occupy?

Alex

 R.D. Silverman 2004-04-19 12:58

[QUOTE=akruppa]:shock: OMG. How much memory does it occupy?

Alex[/QUOTE]

I just finished the matrix for 2,653+. It had 4.78 million rows and took
866 Mbytes and 352 hours to solve on my PC. But it was very sparse --
total weight was 162 million.
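For a sense of how that weight translates into memory: assuming (my assumption, not Silverman's description) that the matrix is stored as one 32-bit column index per nonzero entry plus a small per-row header, a back-of-the-envelope sketch:

```python
# Rough memory estimate for a sparse GF(2) matrix stored as one
# 32-bit column index per nonzero entry, plus a hypothetical
# 8-byte per-row header. Figures are taken from the post above.
rows = 4_780_000          # 4.78 million rows
nonzeros = 162_000_000    # total weight: 162 million entries

bytes_per_index = 4       # one 32-bit column index per nonzero
bytes_per_row = 8         # assumed pointer/length overhead per row

total_bytes = nonzeros * bytes_per_index + rows * bytes_per_row
print(round(total_bytes / 10**6))  # ~686 MB, same ballpark as 866
```

The gap up to the reported 866 Mbytes would plausibly be covered by the working vectors and index structures the solver keeps alongside the matrix itself.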

 smh 2004-04-19 13:08

[QUOTE=xilman]Both are currently in the linear algebra stage.

2_811M has a [i]very[/i] hard matrix. It has 14.56 million rows and columns!

Paul[/QUOTE]

I haven't seen anything about [URL=http://www.rkmath.rikkyo.ac.jp/~kida/snfs248e.htm]this[/URL] factorization in this forum, but am I right that this number was easier to do (at least the post-processing)?

 R.D. Silverman 2004-04-19 14:52

[QUOTE=smh]I haven't seen anything about [URL=http://www.rkmath.rikkyo.ac.jp/~kida/snfs248e.htm]this[/URL] factorization in this forum, but am I right that this number was easier to do (at least the post-processing)?[/QUOTE]

The result for 2,1642M should have been HARDER to achieve. It is 10 bits
larger than M811. I do not understand why their matrix was only 7+ million
rows, while that for M811 was 14+ million. Further, they did the sieving in only 50
days with fewer machines [although I suspect that they had faster machines].

Note that they used a lattice siever, not a line siever.

One of these days I will have to write my own lattice siever... [if I can find
the time!] :unsure:

 ET_ 2004-04-19 19:25

[QUOTE=Bob Silverman]
Note that they used a lattice siever, not a line siever.[/QUOTE]

Are there links, hints or explanations for this difference?

Luigi

 xilman 2004-04-20 09:41

[QUOTE=Bob Silverman]The result for 2,1642M should have been HARDER to achieve. It is 10 bits
larger than M811. I do not understand why their matrix was only 7+ million
rows, while that for M811 was 14+ million. [/QUOTE]
The explanation for the huge matrix is both simple and depressing. :sad:

Converting raw relations into a matrix involves a process whereby useless relations are discarded and useful ones are merged into relation-sets. Precisely what "merging" means is not particularly relevant to this story so I won't go into details. Relation sets are, as the name suggests, collections of relations whose members share a common property. The property is that at least two members of a set contain the same prime value but that, again, isn't relevant here. What is relevant is that a relation set behaves as if it were a single relation and that, in general, it is "heavier" than a raw relation. "Heavy" relations are bad news for the matrix step, so we make a point of collecting plenty of extra relations; the excess allows us to throw away the very heaviest relation sets and still have enough data to solve the matrix.
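As an illustration only (a toy model, not the NFSNET filtering code): working mod 2, two relations that are the only ones containing some prime can be replaced by their symmetric difference, in which that prime cancels. The result is one relation-set, usually heavier than either member:

```python
from collections import defaultdict

# Toy model: a relation is a set of primes. Merging two relations
# that share a prime cancels that prime (we work mod 2, as in NFS
# filtering) and yields a relation set that is typically heavier
# than either member.
def merge_pairs(relations):
    by_prime = defaultdict(list)
    for i, rel in enumerate(relations):
        for p in rel:
            by_prime[p].append(i)

    merged = []
    used = set()
    for p, idxs in by_prime.items():
        if len(idxs) == 2 and not (set(idxs) & used):
            a, b = idxs
            # symmetric difference: the shared prime cancels mod 2
            merged.append(relations[a] ^ relations[b])
            used.update(idxs)
    # keep unmerged relations as singleton relation-sets
    merged.extend(r for i, r in enumerate(relations) if i not in used)
    return merged

rels = [{2, 101}, {3, 101}, {5, 7}]
print(merge_pairs(rels))  # 101 occurs in exactly two relations, so they merge
```

Here the first two relations combine into the set {2, 3}, which stands in for both of them in the matrix.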

The above paragraph is an extremely simplified overview of the filtering stage. Much more detail can be obtained relatively easily but I don't think it's necessary for present purposes.

Now to specific numbers. For the M811 project we collected something over 87 million different raw relations. Without any filtering at all, these would generate a matrix with over 87 million rows and columns. Throwing out the useless relations reduced the data to around 63 million relations. This is still a rather large data set.
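The "useless" relations referred to here are chiefly duplicates and singletons: a relation containing a prime that appears in no other relation can never take part in a dependency. A toy sketch of singleton removal (my illustration; removing one singleton can create new ones, so the process iterates to a fixed point):

```python
def remove_singletons(relations):
    """Repeatedly drop relations containing a prime that occurs in
    only one surviving relation (a toy model of NFS filtering)."""
    rels = list(relations)
    while True:
        counts = {}
        for rel in rels:
            for p in rel:
                counts[p] = counts.get(p, 0) + 1
        kept = [r for r in rels if all(counts[p] > 1 for p in r)]
        if len(kept) == len(rels):   # fixed point reached
            return kept
        rels = kept

# {7, 11} is dropped: both 7 and 11 appear in no other relation
print(remove_singletons([{2, 3}, {3, 5}, {5, 2}, {7, 11}]))
```

On real data this pass alone accounts for a large share of the reduction from the raw relation count to the filtered one.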

A simple merging operation reduced the data to 30.8 million relation sets and indicated that we had an excess of 6.6 million, allowing us to throw away up to that number of the heaviest relation sets. Note that we've already reduced the matrix from 87M to 30.8M by this stage. This point had been reached by 4th March, the day after sieving finished. Progress was so quick because relatively little data was being manipulated and relatively little computation was being performed on it. Nonetheless, the data files were between 4 GB and 7 GB in size and the processing took several hours on a 2.5 GHz machine fitted with a gigabyte of RAM.

The next stage was much more memory- and CPU-intensive, and required much more care in the selection of the parameters fed into the filtering program. I ran several copies, each with slightly different parameters, on machines which have 2 GB of RAM each. One run crashed with an out-of-memory error after about a day or so; another did the same after running for almost three days. I restarted each of these with different parameters chosen to reduce memory usage. One of the original runs finished successfully on March 10th after five days of running. I took this data and built a matrix out of it which had 10.19M rows and 11.61M columns. Unfortunately, it was useless because there were 6798 duplicate rows. This should not be possible, so there is clearly a bug, either in the filter program, the buildmatrix program, or both. Note the date: 10th March. In the wee hours of 12th March I was due to head off to the airport to start a week's vacation.

The next day and a half was unbelievably hectic. Another attempt at filtering produced another useless matrix. I went back a stage and took the data which would generate the 30.8M matrix and ran a very gentle filter on it. This produced a usable but large matrix which has 14,572,496 rows and 15,607,468 columns; it was ready by around noon on the 11th. Other attempts to get a better matrix continued well into the evening without success. At around 9pm on the 11th, I gave up and set the 14.57M matrix going on the cluster. At 3am the next morning I got up to go on vacation.

First day back in the office was 22nd March and, from the progress made so far, I worked out that the matrix would finish in early May. I returned to the attempt to produce a much smaller matrix which could be processed in much less time and so finish before the end of April. No luck. Every attempt produced a useless matrix. I couldn't work out why, and describing the symptoms of the failure to various colleagues didn't produce any good leads. On March 31, I put something over 7 gigabytes of data onto the MSR anon-ftp site, from where Don Leclair downloaded it and started his investigation. After more than a week, he found a problem in the MS C runtime I/O library. It turns out that if a program reads more than four gigabytes of data in text mode, something gets confused over how many CRLF pairs it has read and two or more lines can become concatenated. This behaviour apparently does not occur when files are read in binary mode. Needless to say, the relations and matrix files are all humongous text files. The matrix currently running is under 2 gigabytes and the relations file from which it is built is 3.78 GB, so it just squeaks under the threshold.
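The natural workaround for a bug like this is to read big relation files in binary mode and handle line endings oneself, avoiding the runtime's text-mode CRLF translation entirely. A minimal sketch in Python (illustrative only; the NFSNET tools themselves are not Python):

```python
def read_relations(path):
    """Yield relation lines from a potentially huge text file,
    reading in binary mode so that no text-mode CRLF translation
    (the source of the >4 GB bug described above) is involved."""
    with open(path, "rb") as f:
        for raw in f:                      # splits on b"\n" only
            yield raw.rstrip(b"\r\n").decode("ascii")
```

Because the file is opened with `"rb"`, the library hands bytes through unchanged and the line splitting is done explicitly, so the count of CRLF pairs never matters.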

Don's result came far too late to build a smaller matrix, which would still have been around 9M in size, so we decided to continue with the one already running.

I hope this gives some insight into the efforts that we have to go to behind the scenes. I should also make clear that although I've concentrated on my own experiences, because those are the ones I know in greatest detail, the other members of the NFSNET team also put in heroic efforts. Richard Wackerbarth in particular spent a great deal of time and resources, and his effort has been largely unsung thus far.

Paul

 Death 2004-04-27 13:21

So one week has passed.

When will we see the 811 results? How long will it take?

 xilman 2004-04-27 13:36

[QUOTE=Death]So one week has passed.

When will we see the 811 results? How long will it take?[/QUOTE]
The only honest answers are "eventually" and "as long as necessary" respectively.

I added a brief update to the front page of [url]www.nfsnet.org[/url] yesterday. My best estimate is that the linear algebra will finish in the first week of May and the factors should be available a day or few afterwards. Both these estimates assume that nothing goes badly wrong.

Patience is a desirable characteristic when undertaking bleeding edge factorizations of integers.

Paul

 junky 2004-04-27 17:11

Don't worry Paul, we understand that you're working on the 811 project.
Anyway, our CPUs are working and we're busy with the 1137, so it's fine; projects are getting done (or are almost done) :banana:
