mersenneforum.org  

Old 2009-11-12, 17:41   #12
ewmayer
2ω=0
 
Sep 2002
República de California

2·13·443 Posts

Quote:
Originally Posted by smoky View Post
Congratulations on this milestone!

May I ask about the roadmap for the RISC versions of Mlucas? It is fully understandable why they wouldn't be a priority, but one can still hope, right? A feature like PrimeNet integration would be an awesome advance!

-smoky
The code should build fine without modification on most RISC platforms - no SSE2 support for those, obviously - users may simply have to find the best set of compiler options for their individual platforms.

Regarding Primenet support, my plan is to first get it working for x86-style platforms, then if the resulting code can be ported to support a wider variety of platforms without terrible difficulty, to proceed with that. I will likely ask for the open-source community's help with the latter, to encompass as broad a variety of platforms as possible, without requiring me to work on that aspect full-time.

Quote:
Originally Posted by lfm View Post
While trying Mlucas 3.0x (binary download for Linux 64)

./Mlucas_AMD64 -s a

...

seems like a problem with the radix 28?
More likely it's a shared-library issue. Could you try building the source locally (just copy and paste the one-line compile sequence on the README page) and retry the self-test? I may have to post a static binary instead.

Thanks,
-Ernst

Last fiddled with by ewmayer on 2009-11-12 at 17:42
Old 2009-11-13, 10:24   #13
lfm
 
Jul 2006
Calgary

5²×17 Posts

Quote:
Originally Posted by ewmayer View Post
More likely it's a shared-library issue. Could you try building the source locally (just copy and paste the one-line compile sequence on the README page) and retry the self-test? I may have to post a static binary instead.
Seems like that was it. After a local build it runs OK (so far).
Old 2009-11-13, 17:04   #14
ewmayer
2ω=0
 
Sep 2002
República de California

2·13·443 Posts

Quote:
Originally Posted by lfm View Post
Seems like that was it. After a local build it runs OK (so far).
I just replaced the Mlucas_AMD64.gz zipped binary with a new statically-linked one ... if you get the chance, please try it out and let me know if that solves the self-test issues you saw with the shared-lib build.

Thanks,
-Ernst
Old 2009-11-15, 12:27   #15
lfm
 
lfm's Avatar
 
Jul 2006
Calgary

1A9₁₆ Posts

Quote:
Originally Posted by ewmayer View Post
I just replaced the Mlucas_AMD64.gz zipped binary with a new statically-linked one ... if you get the chance, please try it out and let me know if that solves the self-test issues you saw with the shared-lib build.
Very strange. Today when I tried a few more tests of the old(er) dynamically linked version it won't fail for me any more. Not sure exactly but I think Ubuntu sent out a libc/libm patch and now it doesn't fail (just a theory). For the sake of smaller downloads, so far as I am concerned, you can go back to dynamically linked.
Old 2009-11-15, 18:29   #16
pegaso56
 
Oct 2006
Rosario, Argentina

37 Posts

Hi, below are the results for AMD 6000
AMD Athlon(tm) 64 X2 Dual Core Processor 6000+
CPU speed: 1800.45 MHz, 2 cores
CPU features: RDTSC, CMOV, Prefetch, 3DNow!, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 1 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512

Running openSUSE 11.2 Linux athlon 2.6.32-rc5-git3-1-desktop #1 SMP PREEMPT 2009-11-03 15:41:35 +0100 x86_64 x86_64 x86_64 GNU/Linux


3.0x
1024 sec/iter = 0.057 ROE[min,max] = [0.250000000, 0.312500000] radices = 32 16 32 32 0 0 0 0 0 0
1152 sec/iter = 0.067 ROE[min,max] = [0.250000000, 0.250000000] radices = 36 32 32 16 0 0 0 0 0 0
1280 sec/iter = 0.072 ROE[min,max] = [0.250000000, 0.343750000] radices = 40 16 32 32 0 0 0 0 0 0
1408 sec/iter = 0.081 ROE[min,max] = [0.312500000, 0.312500000] radices = 44 16 32 32 0 0 0 0 0 0
1536 sec/iter = 0.091 ROE[min,max] = [0.265625000, 0.269042969] radices = 24 8 16 16 16 0 0 0 0 0
1792 sec/iter = 0.111 ROE[min,max] = [0.312500000, 0.312500000] radices = 28 8 16 16 16 0 0 0 0 0
2048 sec/iter = 0.128 ROE[min,max] = [0.281250000, 0.343750000] radices = 32 32 32 32 0 0 0 0 0 0
2304 sec/iter = 0.142 ROE[min,max] = [0.242187500, 0.281250000] radices = 36 8 16 16 16 0 0 0 0 0
2560 sec/iter = 0.160 ROE[min,max] = [0.281250000, 0.312500000] radices = 40 8 16 16 16 0 0 0 0 0
2816 sec/iter = 0.181 ROE[min,max] = [0.328125000, 0.343750000] radices = 44 32 32 32 0 0 0 0 0 0
3072 sec/iter = 0.208 ROE[min,max] = [0.250000000, 0.250000000] radices = 24 16 16 16 16 0 0 0 0 0
3584 sec/iter = 0.248 ROE[min,max] = [0.281250000, 0.281250000] radices = 28 16 16 16 16 0 0 0 0 0
1024 sec/iter = 0.057 ROE[min,max] = [0.250000000, 0.312500000] radices = 32 16 32 32 0 0 0 0 0 0
1152 sec/iter = 0.068 ROE[min,max] = [0.250000000, 0.250000000] radices = 36 32 32 16 0 0 0 0 0 0
1280 sec/iter = 0.072 ROE[min,max] = [0.250000000, 0.343750000] radices = 40 16 32 32 0 0 0 0 0 0
1408 sec/iter = 0.082 ROE[min,max] = [0.312500000, 0.312500000] radices = 44 16 32 32 0 0 0 0 0 0
1536 sec/iter = 0.092 ROE[min,max] = [0.265625000, 0.269042969] radices = 24 8 16 16 16 0 0 0 0 0
1792 sec/iter = 0.110 ROE[min,max] = [0.312500000, 0.312500000] radices = 28 8 16 16 16 0 0 0 0 0
2048 sec/iter = 0.128 ROE[min,max] = [0.281250000, 0.343750000] radices = 32 32 32 32 0 0 0 0 0 0
2304 sec/iter = 0.142 ROE[min,max] = [0.242187500, 0.281250000] radices = 36 8 16 16 16 0 0 0 0 0
2560 sec/iter = 0.160 ROE[min,max] = [0.281250000, 0.312500000] radices = 40 8 16 16 16 0 0 0 0 0
2816 sec/iter = 0.182 ROE[min,max] = [0.328125000, 0.343750000] radices = 44 8 16 16 16 0 0 0 0 0
3072 sec/iter = 0.209 ROE[min,max] = [0.250000000, 0.250000000] radices = 24 16 16 16 16 0 0 0 0 0
3584 sec/iter = 0.249 ROE[min,max] = [0.281250000, 0.281250000] radices = 28 16 16 16 16 0 0 0 0 0
128 sec/iter = 0.006 ROE[min,max] = [0.312500000, 0.312500000] radices = 16 16 16 16 0 0 0 0 0 0
144 sec/iter = 0.007 ROE[min,max] = [0.273437500, 0.273437500] radices = 36 8 16 16 0 0 0 0 0 0
160 sec/iter = 0.008 ROE[min,max] = [0.265625000, 0.265625000] radices = 20 16 16 16 0 0 0 0 0 0
192 sec/iter = 0.009 ROE[min,max] = [0.250000000, 0.250000000] radices = 24 16 16 16 0 0 0 0 0 0
224 sec/iter = 0.011 ROE[min,max] = [0.312500000, 0.312500000] radices = 28 16 16 16 0 0 0 0 0 0
256 sec/iter = 0.012 ROE[min,max] = [0.257812500, 0.296875000] radices = 16 16 32 16 0 0 0 0 0 0
288 sec/iter = 0.015 ROE[min,max] = [0.312500000, 0.312500000] radices = 36 16 16 16 0 0 0 0 0 0
320 sec/iter = 0.016 ROE[min,max] = [0.250000000, 0.312500000] radices = 20 16 32 16 0 0 0 0 0 0
384 sec/iter = 0.020 ROE[min,max] = [0.234375000, 0.250000000] radices = 24 16 16 32 0 0 0 0 0 0
448 sec/iter = 0.024 ROE[min,max] = [0.281250000, 0.312500000] radices = 28 16 32 16 0 0 0 0 0 0
512 sec/iter = 0.026 ROE[min,max] = [0.281250000, 0.312500000] radices = 16 16 32 32 0 0 0 0 0 0
576 sec/iter = 0.030 ROE[min,max] = [0.250000000, 0.281250000] radices = 36 16 32 16 0 0 0 0 0 0
640 sec/iter = 0.035 ROE[min,max] = [0.281250000, 0.343750000] radices = 40 16 16 32 0 0 0 0 0 0
704 sec/iter = 0.040 ROE[min,max] = [0.312500000, 0.312500000] radices = 44 16 16 32 0 0 0 0 0 0
768 sec/iter = 0.043 ROE[min,max] = [0.250000000, 0.250000000] radices = 24 32 32 16 0 0 0 0 0 0
896 sec/iter = 0.053 ROE[min,max] = [0.312500000, 0.312500000] radices = 28 32 32 16 0 0 0 0 0 0
1024 sec/iter = 0.057 ROE[min,max] = [0.250000000, 0.312500000] radices = 32 16 32 32 0 0 0 0 0 0
1152 sec/iter = 0.068 ROE[min,max] = [0.250000000, 0.250000000] radices = 36 32 32 16 0 0 0 0 0 0
1280 sec/iter = 0.072 ROE[min,max] = [0.250000000, 0.343750000] radices = 40 16 32 32 0 0 0 0 0 0
1408 sec/iter = 0.082 ROE[min,max] = [0.312500000, 0.312500000] radices = 44 16 32 32 0 0 0 0 0 0
1536 sec/iter = 0.091 ROE[min,max] = [0.265625000, 0.269042969] radices = 24 32 32 32 0 0 0 0 0 0
1792 sec/iter = 0.109 ROE[min,max] = [0.312500000, 0.312500000] radices = 28 8 16 16 16 0 0 0 0 0
2048 sec/iter = 0.126 ROE[min,max] = [0.281250000, 0.343750000] radices = 32 32 32 32 0 0 0 0 0 0
2304 sec/iter = 0.140 ROE[min,max] = [0.242187500, 0.281250000] radices = 36 8 16 16 16 0 0 0 0 0
2560 sec/iter = 0.158 ROE[min,max] = [0.281250000, 0.312500000] radices = 40 8 16 16 16 0 0 0 0 0
2816 sec/iter = 0.179 ROE[min,max] = [0.328125000, 0.343750000] radices = 44 8 16 16 16 0 0 0 0 0
3072 sec/iter = 0.207 ROE[min,max] = [0.250000000, 0.250000000] radices = 24 16 16 16 16 0 0 0 0 0
3584 sec/iter = 0.246 ROE[min,max] = [0.281250000, 0.281250000] radices = 28 16 16 16 16 0 0 0 0 0
4096 sec/iter = 0.281 ROE[min,max] = [0.250000000, 0.312500000] radices = 16 16 16 16 32 0 0 0 0 0
4608 sec/iter = 0.314 ROE[min,max] = [0.257812500, 0.257812500] radices = 36 16 16 16 16 0 0 0 0 0

Best regards, Carlos
Old 2009-11-20, 09:07   #17
moebius
 
Jul 2009
Germany

347 Posts
Exclamation

Quote:
Originally Posted by ewmayer View Post
I just replaced the Mlucas_AMD64.gz zipped binary with a new statically-linked one ... if you get the chance, please try it out and let me know if that solves the self-test issues you saw with the shared-lib build.

Thanks,
-Ernst



I wanted to try your software on a Windows XP 32-bit system, but the FTP server does not seem to be up.
Old 2009-11-20, 21:05   #18
smh
 
"Sander"
Oct 2002
52.345322,5.52471

29×41 Posts
Thumbs down

No need to shout!
Old 2009-11-20, 23:55   #19
ewmayer
2ω=0
 
Sep 2002
República de California

10110011111110₂ Posts

Quote:
Originally Posted by moebius View Post
I wanted to try your software on a Windows XP 32-bit system, but the FTP server does not seem to be up.
It seems ftp service is down – I can view http pages, but not upload/download anything via ftp. I just sent e-mail to John Pierce (owner of the Hogranch) about the problem.

This also made me realize that there is an inconsistency in my README - some files are linked via http, others (including the source tarball you are trying to get) via ftp. I made the needed changes so all files use http, but I can't upload the new file, since that needs ftp! :(

As a workaround (while we wait for ftp to be revived), you can manually change over from ftp to http for any file you need by copying the URL and changing the leading

ftp://hogranch.com/pub/mayer...

to

http://hogranch.com/mayer...

For example to get the source tarball via http, use

http://hogranch.com/mayer/src/C/Mlucas_11.06.2009.zip

To get the .vcproj file needed for Win32/Visual Studio builds, use

http://hogranch.com/mayer/bin/Mlucas.vcproj
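
The ftp-to-http substitution described above can be sketched as a small helper. This is only an illustration: the function name ftp_to_http is made up here, and the prefix mapping is taken from the two prefixes quoted in this post, so it assumes every file under ftp://hogranch.com/pub/mayer is also served under http://hogranch.com/mayer.

```python
# Prefixes as quoted in the post; that they map one-to-one is an assumption.
FTP_PREFIX = "ftp://hogranch.com/pub/mayer"
HTTP_PREFIX = "http://hogranch.com/mayer"


def ftp_to_http(url: str) -> str:
    """Return the http form of a hogranch ftp URL (hypothetical helper).

    URLs that do not start with the ftp prefix are returned unchanged.
    """
    if url.startswith(FTP_PREFIX):
        return HTTP_PREFIX + url[len(FTP_PREFIX):]
    return url


if __name__ == "__main__":
    # The ftp form of the source tarball link is assumed from the mapping above.
    print(ftp_to_http("ftp://hogranch.com/pub/mayer/src/C/Mlucas_11.06.2009.zip"))
```

Only the leading prefix is rewritten, so the rest of the path (src/C/..., bin/...) carries over untouched.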

Last fiddled with by ewmayer on 2009-11-21 at 00:01
Old 2012-03-01, 05:12   #20
emily
 
Feb 2012
Athens, Greece

47 Posts
compile error (linux64)

I get these compilation errors... how do I compile it?

$ gcc -m64 -o Mlucas *.o -lm
fermat_mod_square.o: In function `fermat_mod_square':
fermat_mod_square.c:(.text+0x1c8a): undefined reference to `radix32_ditN_cy_dif1'
fermat_mod_square.c:(.text+0x2072): undefined reference to `radix16_ditN_cy_dif1'
fermat_mod_square.c:(.text+0x4ab5): undefined reference to `radix16_dif_pass1'
fermat_mod_square.c:(.text+0x4b96): undefined reference to `radix32_dif_pass1'
fermat_mod_square.c:(.text+0x4e0a): undefined reference to `radix32_dit_pass1'
fermat_mod_square.c:(.text+0x4ed2): undefined reference to `radix16_dit_pass1'
mers_mod_square.o: In function `mers_mod_square':
mers_mod_square.c:(.text+0x173f): undefined reference to `radix32_dit_pass1'
mers_mod_square.c:(.text+0x1807): undefined reference to `radix16_dit_pass1'
mers_mod_square.c:(.text+0x19a2): undefined reference to `radix32_dif_pass1'
mers_mod_square.c:(.text+0x1a6a): undefined reference to `radix16_dif_pass1'
mers_mod_square.c:(.text+0x1dab): undefined reference to `radix32_ditN_cy_dif1'
mers_mod_square.c:(.text+0x2199): undefined reference to `radix16_ditN_cy_dif1'
secure5.o: In function `make_v5_client_key':
secure5.c:(.text+0xe): undefined reference to `md5_raw_output'
secure5.c:(.text+0x18e): undefined reference to `md5_raw_input'
secure5.c:(.text+0x198): undefined reference to `strupper'
secure5.o: In function `secure_v5_url':
secure5.c:(.text+0x210): undefined reference to `md5'
secure5.c:(.text+0x21a): undefined reference to `strupper'
collect2: ld returned 1 exit status
Old 2013-08-03, 11:18   #21
sanaris
 
"Yury Vorobyov"
Jul 2013
Chelyabinsk

19 Posts

Hello!

I get an error when executing line carry_gcc64.h:687,
which causes a SIGILL at radix16_ditN_cy_dif1.c:2156.

Code:
Program received signal SIGILL, Illegal instruction.
0x000000000047c953 in radix16_ditN_cy_dif1 (a=a@entry=0x7ffff61de080, n=n@entry=1048576, nwt=1024, nwt_bits=10, wt0=0x1, wt1=<optimized out>, si=0x9e1340, rn0=rn0@entry=0x0, rn1=rn1@entry=0x0,
    base=base@entry=0x9c11e0 <base.6704>, baseinv=baseinv@entry=0x9c11f0 <baseinv.6705>, iter=iter@entry=1, fracmax=fracmax@entry=0x7fffffffbc48, p=p@entry=20000047) at radix16_ditN_cy_dif1.c:2156
Code:
   0x47c8f4 <radix16_ditN_cy_dif1+12540>   add    %rax,%rbx
   0x47c8f7 <radix16_ditN_cy_dif1+12543>   add    %rax,%rdx
   0x47c8fa <radix16_ditN_cy_dif1+12546>   add    %rax,%rcx
   0x47c8fd <radix16_ditN_cy_dif1+12549>   mulpd  0x100(%rax),%xmm2
   0x47c905 <radix16_ditN_cy_dif1+12557>   mulpd  0x100(%rax),%xmm6
   0x47c90d <radix16_ditN_cy_dif1+12565>   mulpd  0x110(%rax),%xmm3
   0x47c915 <radix16_ditN_cy_dif1+12573>   mulpd  0x110(%rax),%xmm7
   0x47c91d <radix16_ditN_cy_dif1+12581>   mulpd  (%rdi),%xmm2
   0x47c921 <radix16_ditN_cy_dif1+12585>   mulpd  (%rbx),%xmm6
   0x47c925 <radix16_ditN_cy_dif1+12589>   mulpd  0x40(%rdx),%xmm3
   0x47c92a <radix16_ditN_cy_dif1+12594>   mulpd  0x40(%rcx),%xmm7
   0x47c92f <radix16_ditN_cy_dif1+12599>   mov    0x545332(%rip),%rcx        # 0x9c1c68 <cy_r01.6782>
   0x47c936 <radix16_ditN_cy_dif1+12606>   mov    0x54533b(%rip),%rdx        # 0x9c1c78 <cy_r23.6783>
   0x47c93d <radix16_ditN_cy_dif1+12613>   mulpd  %xmm3,%xmm1
   0x47c941 <radix16_ditN_cy_dif1+12617>   mulpd  %xmm7,%xmm5
   0x47c945 <radix16_ditN_cy_dif1+12621>   addpd  (%rcx),%xmm1
   0x47c949 <radix16_ditN_cy_dif1+12625>   addpd  (%rdx),%xmm5
   0x47c94d <radix16_ditN_cy_dif1+12629>   movaps %xmm1,%xmm3
   0x47c950 <radix16_ditN_cy_dif1+12632>   movaps %xmm5,%xmm7
  >0x47c953 <radix16_ditN_cy_dif1+12635>   roundpd $0x0,%xmm3,%xmm3
   0x47c959 <radix16_ditN_cy_dif1+12641>   roundpd $0x0,%xmm7,%xmm7
   0x47c95f <radix16_ditN_cy_dif1+12647>   mov    0x54549a(%rip),%rbx        # 0x9c1e00 <sign_mask.6724>
   0x47c966 <radix16_ditN_cy_dif1+12654>   subpd  %xmm3,%xmm1
   0x47c96a <radix16_ditN_cy_dif1+12658>   subpd  %xmm7,%xmm5
   0x47c96e <radix16_ditN_cy_dif1+12662>   andpd  (%rbx),%xmm1
   0x47c972 <radix16_ditN_cy_dif1+12666>   andpd  (%rbx),%xmm5
   0x47c976 <radix16_ditN_cy_dif1+12670>   maxpd  %xmm5,%xmm1
   0x47c97a <radix16_ditN_cy_dif1+12674>   maxpd  -0x20(%rax),%xmm1
   0x47c97f <radix16_ditN_cy_dif1+12679>   movaps %xmm1,-0x20(%rax)
   0x47c983 <radix16_ditN_cy_dif1+12683>   mov    %rsi,%rdi
   0x47c986 <radix16_ditN_cy_dif1+12686>   mov    %rsi,%rbx
   0x47c989 <radix16_ditN_cy_dif1+12689>   shr    $0x14,%rdi
   0x47c98d <radix16_ditN_cy_dif1+12693>   shr    $0x16,%rbx
   0x47c991 <radix16_ditN_cy_dif1+12697>   and    $0x30,%rdi
   0x47c995 <radix16_ditN_cy_dif1+12701>   and    $0x30,%rbx
   0x47c999 <radix16_ditN_cy_dif1+12705>   add    %rax,%rdi
   0x47c99c <radix16_ditN_cy_dif1+12708>   add    %rax,%rbx
   0x47c99f <radix16_ditN_cy_dif1+12711>   movaps %xmm3,%xmm1
   0x47c9a2 <radix16_ditN_cy_dif1+12714>   movaps %xmm7,%xmm5
   0x47c9a5 <radix16_ditN_cy_dif1+12717>   mulpd  0xc0(%rdi),%xmm3
   0x47c9ad <radix16_ditN_cy_dif1+12725>   mulpd  0xc0(%rbx),%xmm7
child process 24789 In: radix16_ditN_cy_dif1   Line: 2156 PC: 0x47c953
Output: attachment.
Machine: sse sse2 sse4a
Attached Files
File Type: txt sigill.txt (3.0 KB, 149 views)

Last fiddled with by sanaris on 2013-08-03 at 11:20
Old 2013-08-03, 16:12   #22
ldesnogu
 
Jan 2008
France

2⁴×3×11 Posts

It looks like roundpd is an SSE4.1 instruction, which your Opteron 6124 doesn't seem to support (it's not part of SSE4a; see Wikipedia). I guess Ernst will have to explain why he claims that Mlucas is an SSE2 program.
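
One way to check for this kind of mismatch on Linux is to compare the CPU feature flags against what a binary needs before running it. The sketch below is an illustration only: the helper names has_flag and cpu_flags are invented here, and it assumes the kernel's /proc/cpuinfo naming, where SSE4.1 is listed as sse4_1 and AMD's SSE4a as sse4a.

```python
def has_flag(flags_line: str, flag: str) -> bool:
    """True if `flag` appears as a whole word in a space-separated flags string."""
    return flag in flags_line.split()


def cpu_flags(path: str = "/proc/cpuinfo") -> str:
    """Return the first 'flags' entry from /proc/cpuinfo, or '' if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line.split(":", 1)[1].strip()
    except OSError:
        pass
    return ""


if __name__ == "__main__":
    # Flags as reported in the post above, not read from a real machine.
    reported = "sse sse2 sse4a"
    print("sse2:  ", has_flag(reported, "sse2"))
    print("sse4_1:", has_flag(reported, "sse4_1"))
```

Matching whole words matters here, because a plain substring test for "sse4" would wrongly treat an SSE4a-only CPU as having SSE4.1.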

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.