mersenneforum.org  

Old 2015-05-16, 06:23   #12
alexvong1995
 
Dec 2014

37 Posts

Yeah! So I have stepped past that fatal error and retried safely. I really need to learn gdb so that I can debug my code the way you did. Pointers and segfaults are the most confusing things for newbie C programmers. Three Star Programmer is a good old joke. Regarding the duplicated lines, I think I should not be redirecting stdout to the stat file. Actually, I run Mlucas with this command in tty1:
Code:
$ nice ./Mlucas &> /dev/null
My .bashrc is set up as follows: if the terminal is tty1, run Mlucas with the above command. The ttys are set up to log in automatically. Will redirecting the output to /dev/null affect anything?
Old 2015-05-16, 07:48   #13
alexvong1995
 
Dec 2014

37 Posts
Building Mlucas for i386 on amd64

Quote:
Originally Posted by ewmayer
Again, I would be interested in seeing the specific error message(s) you get - this sounds like it might be the ran-out-of-registers issue I mentioned, since unoptimized builds need more registers for the C code surrounding the inline-asm.
(The message became too long, so I have to post it here.) Yes, I think you are right: optimization makes better use of registers, thus eliminating the ran-out-of-registers error.
About the i386 build: I have managed to build it after applying a patch to the source code (found by trial and error) and compiling with
Code:
$ gcc -m32  -Di386_build -o mlucas *.c -lm
I tried to build without -DUSE_SSE2 and -DUSE_THREADS because that is the case where Mlucas almost gets built (90/95 source files compile, vs. 5x/95 in the -DUSE_SSE2, -DUSE_THREADS, or both attempts).
I found problems in radix44_ditN_cy_dif1.c and radix176_ditN_cy_dif1.c. The compiler complains about undeclared a0, a1 ... a9, b0, b1 ... b9 in the expansion of radix44_main_carry_loop.h and radix176_main_carry_loop.h respectively. So what I did was copy the declarations from inside the #ifdef USE_SSE2 ... #endif to just before the #include of radix44_main_carry_loop.h (and likewise for radix176), and it seems to work magically. The problems and solutions for the two files are identical.
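For anyone following along, here is a cut-down sketch of this kind of failure. It is not the actual Mlucas source: CARRY_LOOP_BODY and carry_demo are made-up names, the macro stands in for the included loop header, and widening the guard is just one possible way to make the declarations visible (the patch itself copies them).

```c
#include <assert.h>

/* Normally passed on the command line as -Di386_build. */
#ifndef i386_build
#define i386_build 1
#endif

/* Hypothetical stand-in for an included loop body such as
 * radix44_main_carry_loop.h: it uses a0 and b0 without declaring them. */
#define CARRY_LOOP_BODY() do { a0 += 1.0; b0 += a0; } while (0)

/* Hypothetical caller; returns b0 after `passes` runs of the loop body. */
double carry_demo(int passes)
{
    assert(passes >= 0);
#if defined(USE_SSE2) || defined(i386_build)
    /* If these declarations sat under USE_SSE2 alone, a scalar build would
     * stop with "a0 undeclared" at the point where the body is expanded. */
    double a0 = 0.0, b0 = 0.0;
#endif
    for (int i = 0; i < passes; i++)
        CARRY_LOOP_BODY();
    return b0;
}
```

Removing both macros from the guard reproduces the "undeclared" error, because the compiler reports it where the loop body is expanded rather than where the declarations are missing.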

Besides, there is also a problem in get_fft_radices.c. When I try to compile it, there are two case 7 labels.
Code:
        case 7 :
            numrad = 6; rvec[0] = 16; rvec[1] =  8; rvec[2] =  8; rvec[3] =  8; rvec[4] =  8; rvec[5] = 16; break;
      #ifndef USE_ONLY_LARGE_LEAD_RADICES
        case 7 :
            numrad = 5; rvec[0] =  8; rvec[1] = 16; rvec[2] = 16; rvec[3] = 32; rvec[4] = 16; break;
So when USE_ONLY_LARGE_LEAD_RADICES is not defined, both case 7 labels get compiled, causing a compiler error. I tried adding something like #ifdef USE_ONLY_LARGE_LEAD_RADICES. It does compile, but Mlucas then segfaults after it finishes self-testing via
Code:
$ ./mlucas -s m
So instead I tried adding -DUSE_SSE2 when compiling this particular source file, and it works just fine. But then I saw that -DUSE_SSE2 does nothing there but #define USE_ONLY_LARGE_LEAD_RADICES. So I added #if ... || defined(i386_build) #define USE_ONLY_LARGE_LEAD_RADICES #endif instead to simplify things.
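A minimal sketch of the duplicate-label situation, under the assumption that the second case 7 is only meant to exist in builds without USE_ONLY_LARGE_LEAD_RADICES. The function pick_radices is a made-up reduction, not the real get_fft_radices code; only the rvec values are copied from the excerpt above.

```c
#include <assert.h>
#include <stddef.h>

/* Normally passed on the command line as -Di386_build. */
#ifndef i386_build
#define i386_build 1
#endif

/* The guard from the patch: either macro switches the second table off. */
#if defined(USE_SSE2) || defined(i386_build)
#define USE_ONLY_LARGE_LEAD_RADICES
#endif

/* Hypothetical reduction: fills rvec and returns the number of radices,
 * or 0 if the index is not in this tiny table. */
int pick_radices(int index, int rvec[6])
{
    int numrad = 0;
    assert(rvec != NULL);
    switch (index) {
    case 7:
        numrad = 6;
        rvec[0] = 16; rvec[1] = 8; rvec[2] = 8;
        rvec[3] = 8;  rvec[4] = 8; rvec[5] = 16;
        break;
#ifndef USE_ONLY_LARGE_LEAD_RADICES
    case 7:  /* same value again: an error unless this block is compiled out */
        numrad = 5;
        rvec[0] = 8; rvec[1] = 16; rvec[2] = 16; rvec[3] = 32; rvec[4] = 16;
        break;
#endif
    default:
        break;
    }
    return numrad;
}
```

With the macro defined the preprocessor drops the second label before the compiler sees it. Without it, gcc rejects the switch with a duplicate case value error.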

Finally, when I run the self-test mentioned above, Mlucas exits with a fatal error when using 144 as one of its radices. The error message comes from
Code:
            default :
                sprintf(cbuf,"FATAL: radix %d not available for ditN_cy_dif1. Halting...\n",radix_vec0); fprintf(stderr,"%s", cbuf);    ASSERT(HERE, 0,cbuf);
in mers_mod_square.c.
The solution I attempted is to add a case 144 by copying the case 288 below it. This prevents Mlucas from exiting at that fatal error. Instead, a fatal error caused by an insane ROE occurs, and Mlucas halts testing with 144 and tries another radix set.
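As a rough illustration only, the "skip this radix and try the next set" behaviour could also be expressed directly, by reporting an unsupported radix to the caller instead of halting. Everything below is hypothetical: the function names and the list of supported radices are made up for the sketch and do not reflect what mers_mod_square.c actually provides.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical check: 0 if a carry routine exists for this leading radix,
 * -1 otherwise. The supported values here are placeholders. */
int radix_supported(int radix0)
{
    switch (radix0) {
    case 44: case 176: case 288:
        return 0;
    default:
        fprintf(stderr, "radix %d not available for ditN_cy_dif1\n", radix0);
        return -1;
    }
}

/* Walk the candidates in order and return the first supported leading
 * radix, or -1 if none of them can be used. */
int first_usable_radix(const int *candidates, int n)
{
    assert(candidates != NULL || n == 0);
    for (int i = 0; i < n; i++) {
        if (radix_supported(candidates[i]) == 0)
            return candidates[i];
    }
    return -1;
}
```

The point of the design is that the unsupported radix never reaches a routine written for a different radix, so the self-test can move on without relying on a roundoff failure to notice the mismatch.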

This is what I found out last night. The patch is included in the attachment. I defined the new feature test macro i386_build just to keep the new changes from breaking existing code. Maybe I will post the self-test results after I am done with them.
Attached Files
File Type: bz2 i386_build.diff.bz2 (1.6 KB, 141 views)
Old 2015-06-01, 02:57   #14
ewmayer
2ω=0
 
Sep 2002
República de California

11570₁₀ Posts

Alex, I reran the first ~30M iterations of your test on my Haswell quad and the rest on my new Broadwell NUC, final result matches yours. Max roundoff error I encountered using an AVX2-mode (that is an FMA-using) build was 0.375 (3 such during the run). Zipped .stat file attached.
Attached Files
File Type: bz2 p67773569.stat.bz2 (137.4 KB, 147 views)
Old 2015-06-09, 15:03   #15
Madpoo
Serpentine Vermin Jar
 
Jul 2014

29·113 Posts

Quote:
Originally Posted by ewmayer
Alex, I reran the first ~30M iterations of your test on my Haswell quad and the rest on my new Broadwell NUC, final result matches yours. Max roundoff error I encountered using an AVX2-mode (that is an FMA-using) build was 0.375 (3 such during the run). Zipped .stat file attached.
I'm doing a full double-check of M67773569 on Prime95 so we should see how that does in ~58 hours.
Old 2015-06-12, 01:25   #16
Madpoo
Serpentine Vermin Jar
 
Jul 2014

29·113 Posts

Quote:
Originally Posted by Madpoo
I'm doing a full double-check of M67773569 on Prime95 so we should see how that does in ~58 hours.
Done, it matched. M67773569
Old 2015-06-12, 06:13   #17
ewmayer
2ω=0
 
Sep 2002
República de California

2×5×13×89 Posts

Thanks for the verify - I had no doubts after my DC, but because my code isn't officially allowed to do both the 1st and 2nd runs (no power-of-2 residue shift is used), you just made it official.

Does your logfile indicate what the max ROE for your run was?
Old 2015-06-13, 03:12   #18
Madpoo
Serpentine Vermin Jar
 
Jul 2014

CCD₁₆ Posts

Quote:
Originally Posted by ewmayer
Thanks for the verify - I had no doubts after my DC, but because my code isn't officially allowed to do both the 1st and 2nd runs (no power-of-2 residue shift is used), you just made it official.

Does your logfile indicate what the max ROE for your run was?
I'm just running Prime95 so only if it went over 0.4 at any point. I don't have that saved but it doesn't happen often and I tend to remember when it does. Well, not a specific exponent, but if I've seen one at all in the past few days. And I don't think I saw any the day I checked that one in.

I *think* it used the 3584K FFT size. Again, I kind of remember paying attention to that since it came up, and I made a mental note of that.
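
For context on the ROE figures in this thread: the outputs of the floating-point FFT squaring should be whole numbers, and the roundoff error is how far they land from the nearest integer, so it can never exceed 0.5. A minimal sketch of tracking the maximum and comparing it with a warning threshold such as the 0.4 mentioned above follows. The function names are made up, the rounding assumes values small enough to fit in a long long, and this is not how Prime95 or Mlucas actually implement the check.

```c
#include <assert.h>
#include <stddef.h>

/* Largest distance of any element from its nearest integer.
 * Assumption: every value fits in a long long after rounding. */
double max_roundoff_error(const double *x, size_t n)
{
    double max_err = 0.0;
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Round half away from zero using a truncating cast. */
        long long nearest = (long long)(x[i] + (x[i] >= 0.0 ? 0.5 : -0.5));
        double err = x[i] - (double)nearest;
        if (err < 0.0)
            err = -err;
        if (err > max_err)
            max_err = err;
    }
    return max_err;
}

/* Nonzero if the error is above the caller's warning threshold. */
int roe_needs_warning(double roe, double threshold)
{
    return roe > threshold;
}
```

Under this sketch a maximum of 0.375 stays below a 0.4 threshold and would not be reported, which is consistent with a log that only records errors above that level.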

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.