mersenneforum.org  

Go Back   mersenneforum.org > Extra Stuff > Programming

Reply
 
Thread Tools
Old 2022-08-21, 14:34   #1
User140242
 
Jul 2022

22·13 Posts
Default Lucas-Lehmer Primality Testing

I made this version of Lucas-Lehmer Primality Testing if (2^P - 1) is prime then the elements of the vector esp_L_i are all 1.

The process is a little slow but it works on the sum of exponents and there are no multiplications do you think it can be useful?

I want to clarify that I am not a programmer, I came up with the idea and I realized it quickly without trying to optimize it because it is too slow and I do not know if it is worth spending time to improve it.


Code:
#include <iostream>
#include <cmath>
#include <algorithm>
#include <vector>
#include <cstdlib>
#include <stdint.h>

bool ll_test(uint64_t P)
{ 
  
    std::vector<uint64_t> esp_L_i(P);
    std::vector<uint64_t> esp_L_i_p2(P);
    std::vector<uint64_t> esp_L_i_m2(P);
    
    esp_L_i[2]=1;
    esp_L_i_m2[1]=1;
    for (uint64_t step=2;step<P;step++)
    {
        if (esp_L_i[1]==0)
        {
            for (uint64_t i=0;i<P;i++)
                esp_L_i_p2[i]=esp_L_i[i];
            esp_L_i_p2[1]=1;
        }
        else
        {
            esp_L_i_p2[0]=esp_L_i[0];
            esp_L_i_p2[1]=1+esp_L_i[1];
            for (uint64_t i=2;i<P;i++)
            {
                esp_L_i_p2[i]=esp_L_i[i]+esp_L_i_p2[i-1]/2;
                esp_L_i_p2[i-1]%=2;
            }
            while (esp_L_i_p2[P-1]>1)
            {
                esp_L_i_p2[0]+=esp_L_i_p2[P-1]/2;
                esp_L_i_p2[P-1]%=2;
                for (uint64_t i=1;i<P;i++)
                {
                    esp_L_i_p2[i]+=esp_L_i_p2[i-1]/2;
                    esp_L_i_p2[i-1]%=2;
                }
            }
        
        }
        for (uint64_t i=0;i<P;i++)
            esp_L_i[i]=0;

        for (uint64_t i=0;i<P;i++)
            if (esp_L_i_m2[i]==1)
                for (uint64_t j=0;j<P;j++)        
                    if (esp_L_i_p2[j]==1)
                    {
                        esp_L_i[(i+j)%P]+=1;
                        if (esp_L_i[(i+j)%P]==P)
                        {
                            for (uint64_t i=1;i<P;i++)
                            {
                                esp_L_i[i]+=esp_L_i[i-1]/2;
                                esp_L_i[i-1]%=2;
                            }
                            while (esp_L_i[P-1]>1)
                            {
                                esp_L_i[0]+=esp_L_i[P-1]/2;
                                esp_L_i[P-1]%=2;
                                for (uint64_t i=1;i<P;i++)
                                {
                                    esp_L_i[i]+=esp_L_i[i-1]/2;
                                    esp_L_i[i-1]%=2;
                                }
                            }
                        }
                    }

        for (uint64_t i=1;i<P;i++)
        {
            esp_L_i[i]+=esp_L_i[i-1]/2;
            esp_L_i[i-1]%=2;
        }
        while (esp_L_i[P-1]>1)
        {
            esp_L_i[0]+=esp_L_i[P-1]/2;
            esp_L_i[P-1]%=2;
            for (uint64_t i=1;i<P;i++)
            {
                esp_L_i[i]+=esp_L_i[i-1]/2;
                esp_L_i[i-1]%=2;
            }
        }  
        for (uint64_t i=0;i<P;i++)
            esp_L_i_m2[i]=esp_L_i[i];

        if (esp_L_i[1]==0)
            esp_L_i[1]=1;
        else
        {
            esp_L_i[1]+=1;
            for (uint64_t i=2;i<P;i++)
            {
                esp_L_i[i]+=esp_L_i[i-1]/2;
                esp_L_i[i-1]%=2;
            }
            while (esp_L_i[P-1]>1)
            {
                esp_L_i[0]+=esp_L_i[P-1]/2;
                esp_L_i[P-1]%=2;
                for (uint64_t i=1;i<P;i++)
                {
                    esp_L_i[i]+=esp_L_i[i-1]/2;
                    esp_L_i[i-1]%=2;
                }
            }
        }
    }
    uint64_t i;
    for (i=0;i<P;i++)
        if(esp_L_i[i]==0)
            break;
    if (i<P)
        return false;
    return true;
}

int main()
{ 

    std::cout <<"ll test: "<< ll_test(607)<< std::endl;

    return 0;
}
User140242 is offline   Reply With Quote
Old 2022-08-21, 15:13   #2
User140242
 
Jul 2022

648 Posts
Default

I want to post a further clarification the idea was born why

I observed this detail in L_2 and L_4 note that if L_1=4 then L_2=L_1^2-2=(L_1-2)(L_1+2)+2

L_2 = 14 = 2^3 + 2^2 + 2^1 and the exponents 3%3=0 , 1 , 2%3=2 they are all present

L_4 = 37634 = 2^15 + 2^12 + 2^9 + 2^8 + 2^1 and the exponents 15%5=0 , 1 , 12%5=2 , 8%5=3 , 9%5=4 they are all present

I thought about developing the program and testing it for larger numbers.

In fact I think that adding the exponents of 2 is the same as doing a multiplication.

Last fiddled with by User140242 on 2022-08-21 at 15:54
User140242 is offline   Reply With Quote
Old 2022-08-22, 07:15   #3
User140242
 
Jul 2022

22·13 Posts
Default

In rewriting the code I made a mistake I forgot the 0 to initialize the vectors in the declaration.

What it does is perform the multiplication of (L_i-2)*(L_i+2) and at the same time the modulo (2^P-1) is done.

In fact for example 2^(b*P) = 1 (mod 2^P -1) so 1 is added in esp_L_i[0] furthermore
we have 2^(a + b*P) = 2^a *(2^b*P-1) + 2^a and a* 2^(P-1) = a/2 * (2^P -1) + a/2
then the value of (L_i-2)*(L_i+2) is stored for use in the next step and the value of L_(i+1) is calculated by adding 2.

Last fiddled with by User140242 on 2022-08-22 at 07:19
User140242 is offline   Reply With Quote
Old 2022-08-22, 08:04   #4
retina
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

664110 Posts
Default

I'm not sure what you are trying to achieve by writing the code.

I'm guessing you are already aware that there are plenty of arbitrary precision calculators already available for free. So assuming you just want to play with some code for fun and experience then if you want it to be faster for large numbers, the you will need a different multiply algorithm. Look into FFT or NTT.
retina is online now   Reply With Quote
Old 2022-08-22, 08:10   #5
User140242
 
Jul 2022

22·13 Posts
Default

Quote:
Originally Posted by retina View Post
I'm not sure what you are trying to achieve by writing the code.

I'm guessing you are already aware that there are plenty of arbitrary precision calculators already available for free. So assuming you just want to play with some code for fun and experience then if you want it to be faster for large numbers, the you will need a different multiply algorithm. Look into FFT or NTT.

Thanks for the advice, in fact I wrote the code just for fun.
User140242 is offline   Reply With Quote
Old 2022-08-22, 15:20   #6
User140242
 
Jul 2022

3416 Posts
Default

Even if it is useless, this is the best I have been able to do to optimize it.



Code:
#include <iostream>
#include <cmath>
#include <algorithm>
#include <vector>
#include <cstdlib>
#include <stdint.h>

bool ll_test(uint64_t P)
{ 
  
    std::vector<uint64_t> bin_L_i(P,0);
    std::vector<uint64_t> esp_L_i_m2(P,0);
    
    uint64_t l_m2=1;
    esp_L_i_m2[0]=1;
    for (uint64_t step=2;step<P;step++)
    {
        for (uint64_t i=0;i<l_m2;i++)
        {
            bin_L_i[(2+esp_L_i_m2[i])%P]+=1;
            for (uint64_t j=0;j<l_m2;j++)            
                bin_L_i[(esp_L_i_m2[i]+esp_L_i_m2[j])%P]+=1;
        }

        for (uint64_t i=1;i<P;i++)
        {
            bin_L_i[i]+=bin_L_i[i-1]>>1;
            bin_L_i[i-1]&=1;
        }
        while (bin_L_i[P-1]>1)
        {
            bin_L_i[0]+=bin_L_i[P-1]>>1;
            bin_L_i[P-1]&=1;
            for (uint64_t i=1;i<P;i++)
            {
                bin_L_i[i]+=bin_L_i[i-1]>>1;
                bin_L_i[i-1]&=1;
            }
        }
        
        l_m2=0;    
        for (uint64_t i=0;i<P;i++)
            if (bin_L_i[i]==1)
            {
                bin_L_i[i]=0;
                esp_L_i_m2[l_m2]=i;
                l_m2++;
            }
    }

    if (l_m2==P-1 && esp_L_i_m2[1]==2)
        return true;
    return false;
}

int main()
{ 

    std::cout <<"ll test: "<< ll_test(607)<< std::endl;

    return 0;
}
User140242 is offline   Reply With Quote
Old 2022-08-22, 17:11   #7
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5·1,423 Posts
Default

Quote:
Originally Posted by User140242 View Post
I made this version of Lucas-Lehmer Primality Testing if (2^P - 1) is prime then the elements of the vector esp_L_i are all 1.

The process is a little slow but it works on the sum of exponents and there are no multiplications do you think it can be useful?
Useful? Other than as a learning exercise, I doubt it. It's not clear to me what's going on there. I see deeply nested for loops that look to me like its run time may scale as high as O(p5) although if the conditional branches are taken rarely it may be lower order. Competitive code scales as p2 log p log log p per primality test, or roughly p2.1. 62M(5-2.1) would be a very large speed disadvantage. Even an O(p^0.5) disadvantage can be deadly to usefulness. See also https://www.mersenneforum.org/showpo...21&postcount=7

Also I don't see anything that looks like multithreading. Or use of floating point for speed. Or output of the least significant 64 bits of the final residue for error detection by double-checking. Or Jacobi symbol check. It looks to me very wasteful of cycles and space. I suspect an optimizing compiler is saving it somewhat. Perhaps I misunderstand what you're doing there. Some explanatory comments in the post, code, or both might help.

Compiling it via msys32/mingw and using its environment for timing runs, I get O(p~3) with a large proportionality constant, such that it is thousands or millions of times slower than prime95 on the same CPU core, so unusable at ~100K or higher.
Attached Files
File Type: pdf primality testing run time scalings compared.pdf (32.3 KB, 30 views)

Last fiddled with by kriesel on 2022-08-22 at 17:20
kriesel is online now   Reply With Quote
Old 2022-08-22, 18:14   #8
User140242
 
Jul 2022

648 Posts
Default

Quote:
Originally Posted by kriesel View Post
Useful? Other than as a learning exercise, I doubt it. It's not clear to me what's going on there.
...

I had already realized that it was very slow I posted it only because I had proposed a different way.

The code in my last post does this:

example if P=5 and MP=2^5 -1 =31 in the single step=3 we know that L2=14

L2_m2=(L2-2)%31= 12 = 2^2+2^3

(L3-2)%31= L2_m2*L2_m2 +4*L2_m2 = 192%31= 6

multiplication is done by adding the exponents of 2

(2^2 +2^3) * (2^2 +2^3) +2^2 *(2^2 +2^3)=
=2^2 * 2^2 + 2^2 * 2^3 + 2^3 * 2^2 + 2^3 * 2^3 + 2^2 * 2^2 +2^2 * 2^3=
= 2^4 +2^5 +2^5+ 2^6 +2^4 +2^5

if module 5 of the exponents is made you get

2^4 +2^0 +2^0 + 2^1 + 2^4 +2^0

you get
bin_L_i=[3 1 0 0 2]

after for loop
bin_L_i=[1 0 1 0 2]

and after the while loop
bin_L_i=[0 1 1 0 0]


so is found
esp_L_i_m2=[1 , 2]

which corresponds to 2^1+2^2 = 6 =192%31
User140242 is offline   Reply With Quote
Old 2022-08-26, 07:49   #9
User140242
 
Jul 2022

22·13 Posts
Default

It is always very slow but looking at the report in


Quote:
Originally Posted by kriesel View Post
Compiling it via msys32/mingw and using its environment for timing runs, I get O(p~3) with a large proportionality constant, such that it is thousands or millions of times slower than prime95 on the same CPU core, so unusable at ~100K or higher.


the execution time for P = 4423 is 981s making some small changes it is possible to go down to 90s I don't know if it is possible to optimize it again.


Code:
#include <iostream>
#include <cmath>
#include <algorithm>
#include <vector>
#include <cstdlib>
#include <stdint.h>

bool ll_test(uint64_t P)
{ 
  
    std::vector<uint64_t> bin_L_i(P,0);
    std::vector<uint64_t> esp_L_i_m2(P,0);
    
    uint64_t b = 31;
    if (P<32)
        b=P-1;
    std::vector<uint64_t> C_max(b,0);
    for (uint64_t i=1;i<b;i++)
        C_max[i]=1<<i;
    uint64_t cd = 0;
    uint64_t esp_1=P/2+1;
    uint64_t l_m2=1;
    uint64_t i1,j1;
    esp_L_i_m2[0]=1;
    for (uint64_t step=2;step<P;step++)
    {
        cd=0;
        for (i1=0;i1<l_m2 && esp_L_i_m2[i1]<esp_1;i1++)
        {
            bin_L_i[2+esp_L_i_m2[i1]]+=1;
            bin_L_i[2*esp_L_i_m2[i1]]+=1;
            for (j1=i1+1;j1<l_m2 && (1+esp_L_i_m2[i1]+esp_L_i_m2[j1])<P;j1++)
                bin_L_i[1+esp_L_i_m2[i1]+esp_L_i_m2[j1]]+=1;
            for (;j1<l_m2;j1++)
                bin_L_i[1+esp_L_i_m2[i1]+esp_L_i_m2[j1]-P]+=1;
        }
        for (;i1<l_m2 && (2+esp_L_i_m2[i1])<P;i1++)
        {
            bin_L_i[2+esp_L_i_m2[i1]]+=1;
            bin_L_i[2*esp_L_i_m2[i1]-P]+=1;
            for (j1=i1+1;j1<l_m2;j1++)
                bin_L_i[1+esp_L_i_m2[i1]+esp_L_i_m2[j1]-P]+=1;
        }
        for (;i1<l_m2;i1++)
        {
            bin_L_i[2+esp_L_i_m2[i1]-P]+=1;
            bin_L_i[2*esp_L_i_m2[i1]-P]+=1;
            for (j1=i1+1;j1<l_m2;j1++)
                bin_L_i[1+esp_L_i_m2[i1]+esp_L_i_m2[j1]-P]+=1;
        }
        bin_L_i[2]+=bin_L_i[1]>>1;
        bin_L_i[1]&=1;
        bin_L_i[0]+=bin_L_i[P-1]>>1;
        bin_L_i[P-1]&=1;
        bin_L_i[1]+=bin_L_i[0]>>1;
        bin_L_i[0]&=1;
        for (uint64_t esp=P-b+1;esp<P-1;esp++)
         {
            if (bin_L_i[esp]>=C_max[P-esp])
            {

                 bin_L_i[0]+=bin_L_i[esp]/C_max[P-esp];
                 bin_L_i[esp]%=C_max[P-esp];
            }
         }
        l_m2=0;
        for (uint64_t i=1;i<P;i++)
        {
            bin_L_i[i]+=bin_L_i[i-1]>>1;
            bin_L_i[i-1]&=1;
            if (bin_L_i[i-1]==1)
            {
                esp_L_i_m2[l_m2]=i-1;
                l_m2++;
                cd++;
            }
        }
        if (bin_L_i[P-1]==1)
        {
            esp_L_i_m2[l_m2]=P-1;
            l_m2++;
        }
        else
        {
            if (bin_L_i[P-1]<4)
            {
                if (cd==P-1 && bin_L_i[P-1]==2)
                {
                    l_m2=1;
                    esp_L_i_m2[0]=P-1;
                    bin_L_i[P-1]=0;                    
                }
                else if (cd==P-1 && bin_L_i[P-1]==3)
                {
                    l_m2=1;
                    esp_L_i_m2[0]=0;    
                    bin_L_i[P-1]=0;
                }                 
                else if (cd==P-2 && bin_L_i[0]==0)
                {
                    l_m2=P-1;
                    if (bin_L_i[P-1]==3)
                        l_m2++;
                    for (uint64_t i=0;i<l_m2;i++)
                        esp_L_i_m2[i]=i;
                    bin_L_i[P-1]=0;
                }
            }
        }
        while (bin_L_i[P-1]>1)
        {
            l_m2=0;
            bin_L_i[0]+=bin_L_i[P-1]>>1;
            bin_L_i[P-1]&=1;
            for (uint64_t i=1;i<P;i++)
            {
                bin_L_i[i]+=bin_L_i[i-1]>>1;
                bin_L_i[i-1]&=1;
                if (bin_L_i[i-1]==1)
                {
                    esp_L_i_m2[l_m2]=i-1;
                    l_m2++;
                }
            }
            if (bin_L_i[P-1]==1)
            {
               esp_L_i_m2[l_m2]=P-1;
               l_m2++;
            }
        }
        std::fill(bin_L_i.begin(), bin_L_i.end(), 0);
    }
    if (l_m2==P-1 && esp_L_i_m2[1]==2)
       return true;
    return false;
}

int main()
{ 

    std::cout <<"ll test: "<< ll_test(607)<< std::endl;

    return 0;
}

Last fiddled with by User140242 on 2022-08-26 at 08:47
User140242 is offline   Reply With Quote
Old 2022-08-26, 18:41   #10
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5×1,423 Posts
Default

It takes multithreading and efficient implementation of the fastest known algorithms to be competitive (IBDWT or possibly NTT). Grammar school multiplication is known to be competitive for only the shortest operands, <~256 bits, because it is O(p2) per iteration, so O(p3) per primality test, a very slow algorithm for large operands as in GIMPS.
It would be helpful if your code accepted exponent as a command line parameter, instead of hard coding and requiring recompile for each.
Attached Files
File Type: pdf primality testing run time scalings compared.pdf (56.7 KB, 14 views)

Last fiddled with by kriesel on 2022-08-26 at 18:50
kriesel is online now   Reply With Quote
Old 2022-08-27, 15:32   #11
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5×1,423 Posts
Default

Oops, g++ default optimization is -O0 (none). -O3 provides considerable improvement. (Mprime is still estimated ~105 times faster at p~1M on the same CPU core, and has an increasingly large advantage at higher exponents.)

Modified main() to accept exponent input:
Code:
int main(int argc,char* argv[])
{ 
 int expon;
// cout << "You have entered " << argc         << " arguments:" << "\n"; 
// for (int i = 0; i < argc; ++i)       cout << argv[i] << "\n";
    if ( argc > 1 ) {
        expon=std::stoi(argv[1]);
        printf ( "Exponent %d\n",expon);
        std::cout <<"ll test (1 prime, 0 composite): "<< ll_test(expon)<< std::endl;
    } else {
        printf ( "Enter an integer exponent greater than one, as a command line parameter.\n");
    }
    return 0;
}
Attached Files
File Type: pdf primality testing run time scalings compared.pdf (58.2 KB, 15 views)
kriesel is online now   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Alternative Math for Lucas Lehmer Testing and Perfect Numbers LarsNet Miscellaneous Math 0 2021-03-09 22:44
Lucas-Lehmer Primes henryzz And now for something completely different 42 2019-06-03 14:09
Modifying the Lucas Lehmer Primality Test into a fast test of nothing Trilo Miscellaneous Math 25 2018-03-11 23:20
Lucas-Lehmer Dougal Information & Answers 9 2009-02-06 10:25
Lucas-Lehmer primality test CMOSMIKE Math 5 2002-09-06 18:46

All times are UTC. The time now is 15:20.


Tue Dec 6 15:20:50 UTC 2022 up 110 days, 12:49, 1 user, load averages: 0.78, 0.97, 1.04

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔