mersenneforum.org  

Old 2010-08-20, 08:34   #1
tgrdy
 
May 2010

10_16 Posts
msieve C157 sqrt error: relation xxxx corrupt

I have successfully factored C153, C154, and C155 numbers with GNFS.
But the last stage, the square root, fails with an error on a C157, on both Linux and Windows.

I checked the C157 relations data file. It contains some garbage lines (like "顫 ?vSM#驧E"), probably introduced by an error during the FTP transfer or the merge.
The data file has not been changed since the merge.

With this data file, filtering was OK and the matrix LA was OK,
though the log file shows many read-error lines from the filtering stage
(more than 300 error lines).

The C157 fails only in the sqrt stage, for all 31 dependencies.

The C153, C154, and C155 data files may also contain garbage lines, but their sqrt stages were OK.

Please help. Thanks.


Wed Aug 18 17:58:35 2010
Wed Aug 18 17:58:35 2010
Wed Aug 18 17:58:35 2010 Msieve v. 1.45
Wed Aug 18 17:58:35 2010 random seeds: e20f8890 a57d6d14
Wed Aug 18 17:58:35 2010 factoring ... (157 digits)
Wed Aug 18 17:58:37 2010 searching for 15-digit factors
Wed Aug 18 17:58:39 2010 commencing number field sieve (157-digit input)
Wed Aug 18 17:58:39 2010 R0: -718286775230264074412218462160
Wed Aug 18 17:58:39 2010 R1: 330882384079102889
Wed Aug 18 17:58:39 2010 A0: -574785673446953991103337093662087263
Wed Aug 18 17:58:39 2010 A1: -2849845913901309456445653108969
Wed Aug 18 17:58:39 2010 A2: 19403845861276275944787296
Wed Aug 18 17:58:39 2010 A3: 40609767955771924744
Wed Aug 18 17:58:39 2010 A4: 185982982727102
Wed Aug 18 17:58:39 2010 A5: 33078600
Wed Aug 18 17:58:39 2010 skew 468676.75, size 2.585210e-015, alpha -7.129592, combined = 1.913363e-012
Wed Aug 18 17:58:39 2010
Wed Aug 18 17:58:39 2010 commencing square root phase
Wed Aug 18 17:58:39 2010 reading relations for dependency 1
Wed Aug 18 17:58:42 2010 read 2447826 cycles
Wed Aug 18 17:58:51 2010 cycles contain 6631430 unique relations
Wed Aug 18 18:00:22 2010 error: relation 30973623 corrupt

Wed Aug 18 18:00:22 2010
Wed Aug 18 18:00:22 2010
Wed Aug 18 18:00:22 2010 Msieve v. 1.45
Wed Aug 18 18:00:22 2010 random seeds: ace20dc0 520462f9
Wed Aug 18 18:00:22 2010 factoring ... (157 digits)
Wed Aug 18 18:00:24 2010 searching for 15-digit factors
Wed Aug 18 18:00:26 2010 commencing number field sieve (157-digit input)
Wed Aug 18 18:00:26 2010 R0: -718286775230264074412218462160
Wed Aug 18 18:00:26 2010 R1: 330882384079102889
Wed Aug 18 18:00:26 2010 A0: -574785673446953991103337093662087263
Wed Aug 18 18:00:26 2010 A1: -2849845913901309456445653108969
Wed Aug 18 18:00:26 2010 A2: 19403845861276275944787296
Wed Aug 18 18:00:26 2010 A3: 40609767955771924744
Wed Aug 18 18:00:26 2010 A4: 185982982727102
Wed Aug 18 18:00:26 2010 A5: 33078600
Wed Aug 18 18:00:26 2010 skew 468676.75, size 2.585210e-015, alpha -7.129592, combined = 1.913363e-012
Wed Aug 18 18:00:26 2010
Wed Aug 18 18:00:26 2010 commencing square root phase
Wed Aug 18 18:00:26 2010 reading relations for dependency 2
Wed Aug 18 18:00:29 2010 read 2446791 cycles
Wed Aug 18 18:00:37 2010 cycles contain 6625818 unique relations
Wed Aug 18 18:05:13 2010 error: relation 30973627 corrupt

Last fiddled with by tgrdy on 2010-08-20 at 09:27
Old 2010-08-20, 11:44   #2
jasonp
Tribal Bullet
 
Oct 2004

6716_8 Posts

Did you ftp in ascii mode? It's possible your ftp client inserted newlines somewhere, and that will mess up everything. Msieve is complaining because a relation that it needed for the square root could not be parsed from the relation file.
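
One quick way to test the ASCII-mode theory is to look for carriage-return bytes, since an ASCII-mode transfer may rewrite line endings. The following is only a minimal sketch: `count_cr` is a hypothetical helper, not part of msieve, and a count of zero does not prove the file is intact.

```c
#include <stddef.h>

/* Hypothetical helper (not part of msieve): counts carriage-return bytes
   (0x0D) in a buffer. A relation file produced on Linux would normally
   contain none, so a nonzero count hints at a line-ending conversion. */
size_t count_cr(const unsigned char *buf, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++)
        if (buf[i] == '\r')
            n++;
    return n;
}
```

Reading the relation file in chunks with `fread` and summing the counts would cover the whole file.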
Old 2010-08-20, 15:28   #3
tgrdy
 
May 2010

10_16 Posts

I always set FTP to binary mode.

Here are the steps I used to factor the C153, C154, C155, and C157 numbers:

On the PCs:
1. Run gnfs-lasieve on many Linux PCs.
2. After sieving finishes, upload all relation data to a Linux x64 server
via FTP, in binary mode.

This completes the sieving stage.

On the server:
3. Merge all the relation data into C15x.dat, using a binary-mode copy.

4. Run msieve to do the filtering, matrix, and sqrt steps in sequence, without interruption.


The result:
The C153, C154, and C155 were factored OK, but the C157 failed in the sqrt stage.

For the C157, I tested three times this week, with msieve v1.45, v1.46, and v1.47:

Both the filtering and matrix (LA) steps always finished successfully,
but the sqrt step always failed.

I never moved or modified C157.dat. In all three tests I checked the md5 hash of
c157.dat, and it was the same each time.

Thanks.

Quote:
Originally Posted by jasonp
Did you ftp in ascii mode? It's possible your ftp client inserted newlines somewhere, and that will mess up everything. Msieve is complaining because a relation that it needed for the square root could not be parsed from the relation file.

Last fiddled with by tgrdy on 2010-08-20 at 15:35
Old 2010-08-20, 16:55   #4
jasonp
Tribal Bullet
 
Oct 2004

2×3×19×31 Posts

Well, another possibility is that the matrix somehow does not correspond to the underlying relations. Still another possibility is that the relation reading code does not parse relations for the square root in exactly the same way as it does for the filtering and LA. That's unlikely but not impossible; I can look around a bit tonight.

Does anyone have a utility that prints out relations that survive the duplicate removal? If the bad text in the file is causing problems, then we can edit that out and then rerun the filtering on the cleaned-up file.

To patch the code to do that, on line 114 of gnfs/filter/singleton.c, under the 'get the large ideals' comment, add
Code:
printf("%s", buf);
then on line 142, after the loop, add 'exit(-1)'. This will print to stdout the text of relations that parse correctly and are not duplicates. If you're willing to risk two more days on this, you can try running the filtering on the result.

If you want, we can take this to email as well.

Last fiddled with by jasonp on 2010-08-20 at 16:56
Old 2010-08-20, 18:30   #5
tgrdy
 
May 2010

2^4 Posts

I have written some C code to convert the old .dat file (which contains non-ASCII bytes) into a new .dat file.

The main idea is to keep the processing simple:
copy old.dat byte by byte into new.dat.
If a byte is not an ASCII char (0x80 - 0xFF should be a garbage byte), convert it to a NUL byte (\0).

dataclean 200m 200m_new
total = 66 MB, errbytes = 0 bytes, 0 KB, speed = 33 MB/s
total = 132 MB, errbytes = 0 bytes, 0 KB, speed = 33 MB/s
total = 191 MB, errbytes = 0 bytes, 0 KB, speed = 31 MB/s
...

dataclean.c

Code:

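/*
*  data clean
*  replace every non-ascii byte (0x80 - 0xFF) with a NUL byte
*/


#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BUF_SIZE (1024*1024)
static unsigned char buf[BUF_SIZE]; // 1024 KB buf

int main(int argc, char ** argv)
{
    FILE *fp1, *fp2;
    size_t rlen, i;
    unsigned long long errbytes = 0;   // bytes replaced so far
    unsigned long long total = 0;      // KB processed so far
    unsigned long long sec, speed;
    time_t time1, time2;

    if(argc < 3) {
        printf("usage:\r\n%s oldfile newfile\r\n",argv[0]);
        return 1;
    }

    // open both files in binary mode and stop if either open fails
    fp1 = fopen(argv[1], "rb");
    if(fp1 == NULL) {
        perror(argv[1]);
        return 1;
    }
    fp2 = fopen(argv[2], "wb");
    if(fp2 == NULL) {
        perror(argv[2]);
        fclose(fp1);
        return 1;
    }

    time1 = time2 = time(NULL);
    while( (rlen=fread(buf, 1, BUF_SIZE, fp1)) != 0 ) {
        for(i = 0; i < rlen; i++) {
            if(buf[i] & 0x80) {        // high bit set: not an ascii byte
                ++errbytes;
                buf[i] = 0;
            }
        }
        // write the whole cleaned chunk at once instead of byte by byte
        if(fwrite(buf, 1, rlen, fp2) != rlen) {
            perror(argv[2]);
            return 1;
        }

        total += (rlen/1024);
        if(time(NULL) - time2 > 1) {   // report progress about every 2 seconds
            time2 = time(NULL);
            sec = (unsigned long long)(time2 - time1);
            speed = total/sec;
            printf("total = %llu MB, errbytes = %llu bytes, %llu KB, speed = %llu MB/s\r\n", total/1024, errbytes, errbytes/1024, speed/1024);
        }
    }

    fclose(fp1);
    fclose(fp2);
    printf("Done all !\n");
    return 0;
}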
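
An alternative to patching individual bytes is to drop every line that contains a non-ASCII byte, so that no partially damaged relation reaches msieve. This is only a sketch under stated assumptions: `line_is_ascii` and `copy_clean_lines` are hypothetical names, every line is assumed to fit in the 4096-byte buffer, and the input is assumed to contain no embedded NUL bytes.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if no byte in line[0..len) has its high bit set, else 0. */
int line_is_ascii(const char *line, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if ((unsigned char)line[i] & 0x80)
            return 0;
    return 1;
}

/* Copies only the all-ASCII lines from in to out and returns the number
   of lines dropped. Assumption: each line fits in the buffer and has no
   embedded NUL byte (fgets/strlen cannot see past one). */
unsigned long copy_clean_lines(FILE *in, FILE *out)
{
    char line[4096];
    unsigned long dropped = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strlen(line);

        if (line_is_ascii(line, len))
            fwrite(line, 1, len, out);
        else
            dropped++;
    }
    return dropped;
}
```

Both files would be opened with `fopen` in binary mode, as in dataclean.c, and the returned count compared with the number of read errors in the filtering log.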

Last fiddled with by tgrdy on 2010-08-20 at 19:30
Old 2010-08-20, 19:37   #6
tgrdy
 
May 2010

2^4 Posts

test:
dataclean old.dat new.dat
total = 19 MB, errbytes = 12 bytes, 0 KB, speed = 19 MB/s
total = 54 MB, errbytes = 12 bytes, 0 KB, speed = 27 MB/s
total = 93 MB, errbytes = 12 bytes, 0 KB, speed = 31 MB/s
total = 132 MB, errbytes = 12 bytes, 0 KB, speed = 33 MB/s
total = 171 MB, errbytes = 12 bytes, 0 KB, speed = 34 MB/s
total = 201 MB, errbytes = 12 bytes, 0 KB, speed = 33 MB/s
...
Old 2010-08-20, 21:51   #7
Batalov
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

2^4×11×53 Posts

FTPing gzipped files (even the .mat, .cyc, and .dep) is a way of detecting transmission errors, because gzip stores an internal crc32 (which is better than nothing). Better yet is to run md5sum or sha1sum on both ends.
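
For reference, the CRC-32 that gzip stores can be computed with a few lines of C. This is a minimal bitwise sketch (slow but short), and `crc32_update` is a hypothetical name chosen here; in practice running gzip, md5sum, or sha1sum on both ends is simpler.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320, the variant
   used by gzip. Start with crc = 0 and feed the previous return value
   back in to process a file chunk by chunk. */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    size_t i;
    int k;

    crc = ~crc;
    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}
```

Computing the value on the sending PC and again on the server, then comparing the two, would show whether the transfer changed the file.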

Of course, the file could already have been corrupt at the sender. A QC script in Perl, Python, or C can help: check that every line contains only printable characters, then allow the special cases "N ...", "#...", and "a,0:"; every other line should have the form "a,b:f1,f2[,f3]*:f4,f5[,f6]*", i.e. three colon-separated parts, the first in decimal and the second and third in comma-separated hex. More sophistication is possible too; e.g. I wonder whether it would be faster to keep only "a,b" on the sender side, gzip, FTP, and restore the factorizations on the other end. No need to reinvent the wheel: these programs are discussed elsewhere on the forum, and msieve can be hacked to do this on the fly. Possibilities are endless.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.