 mersenneforum.org GWNUM program
 Register FAQ Search Today's Posts Mark Forums Read  2018-12-27, 16:52   #45
paulunderwood

Sep 2002
Database er0rr

3×1,291 Posts Here is 2-selfridge Baillie-PSW code:
Attached Files bpsw-2.c (4.3 KB, 354 views)

Last fiddled with by paulunderwood on 2018-12-27 at 17:17   2018-12-28, 15:36   #46
paulunderwood

Sep 2002
Database er0rr

3×1,291 Posts I have tidied all programs. I also fixed a problem with factorial primes/PRPs where a suitable jacobi symbols could not be found.

s2: (x+2)^(n+1) == 5+2*a (mod n, x^2-a*x+1) where a minimal such that jacobi(a^2-4,n) == -1.

lucasPRP: x^(n+1) == 1 (mod n, x^2-a*x+1) where minimal a>2 such that jacobi(a^2-4,n) == -1.

hybrid: as "s2" for a=0 or a=1 and otherwise (x+1)^(n+1) = 2+a (mod n, x^2-a*x+1) where minimal a>2 such that jacobi(a^2-4,n) == -1,

fusion: as "hybrid" except tests (2*x)^((n+1)/2) == jacobi(2*(a+2),n)*2 (mod m, x^2-a*x+1) for a>2 such that jacobi(a^2-4,n) == -1

bpsw-2: (2*x)^((n+1)/2) == jacobi(2*(a+2),n)*2 (mod n, x^2-a*x+1) for minimal a>2 such that jacobi(a^2-4,n) == -1
Attached Files s2.c (5.8 KB, 393 views) lucasPRP.c (3.8 KB, 420 views) hybrid.c (5.9 KB, 377 views) fusion.c (6.3 KB, 380 views) bpsw-2.c (4.5 KB, 378 views)

Last fiddled with by paulunderwood on 2018-12-28 at 15:56   2018-12-30, 06:58 #47 paulunderwood   Sep 2002 Database er0rr 1111001000012 Posts Below is the result of not using threads for this size of number. Obviously, I should test the input size to set the number of threads. 4 threads: Code: time ./pfgw64 -k -f0 -od -q'39924*10^30000+3*2^99654+455' | ../../coding/gwnum/prp2 - Is 2-PRP! real 0m17.745s user 0m33.124s sys 0m13.808s 1 thread: Code: time ./pfgw64 -k -f0 -od -q'39924*10^30000+3*2^99654+455' | ../../coding/gwnum/prp2 - Is 2-PRP! real 0m11.615s user 0m11.776s sys 0m0.004s So I have done this to all of my programs, not knowing the sweet spot: Code: if (LEN > 1000000) gwset_num_threads ( gwdata, 4 ); Last fiddled with by paulunderwood on 2018-12-30 at 07:39   2019-02-04, 20:10 #48 paulunderwood   Sep 2002 Database er0rr 387310 Posts Array Arithmetic Although what I present here may not be applicable to GWNUM, it is useful where one has to construct one's own arbitrary arithmetic using arrays. Let w be 2^64 in 64bit machines, be 2^54 in JavaScript etc. Find minimal r>0 such that jacobiSymbol(w^(2*r)-4,n)==-1 where n is the prime candidate. Then test: (w*x)^((n+1)/2)==w*kronecker(w^r+2,n) (mod n, x^2-w^r*x+1) Intermediate calculations in the left to right binary exponentiation are then: Squaring: (s*x + t)^2 --> s*(w^r*s + 2*t)*x + (t - s)*(t + s) (mod n) Multiplying by the base: (s*x + t)*w*x --> (w^(r + 1)*s + w*t)*x - w*s (mod n) With some simple manipulation of array indices, this should be quite quick. I would be impressed by someone if they implemented this. Last fiddled with by paulunderwood on 2019-02-04 at 20:44   2019-02-05, 02:18 #49 paulunderwood   Sep 2002 Database er0rr 74418 Posts On second thoughts, a large "r" could really balloon the modular reductions in the above. However if w=2 and the test is: (2*x)^((n+1)/2) == 2*jacobi(2*(2^r+2)) (mod n, x^2-2^r*x+1) then the squaring part would be: (s*x+t)^2 --> s*(s< (s<<(r+1)+t<<1)*x - s<<1 mod n. Last fiddled with by paulunderwood on 2019-02-05 at 02:27   2019-05-08, 15:43 #50 paulunderwood   Sep 2002 Database er0rr 3·1,291 Posts During left to right binary exponentiation of S*x+T to a power over x^2+1, the square of the intermediate value s*x+t is 2*s*t*x+t^2-s^2. My question is about FFT. Is it quicker to do 2 forward FFTs, 3 convolutions, and 3 reverse FFTs than it is to do 4 forward FFTs, 2 convolutions and 2 reverse FFTs? Or, to put it another way, is it better to compute the difference of squares in this case? Last fiddled with by paulunderwood on 2019-05-08 at 16:30   2019-05-08, 19:38   #51
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

2×32×83 Posts Quote:
 Originally Posted by paulunderwood During left to right binary exponentiation of S*x+T to a power over x^2+1, the square of the intermediate value s*x+t is 2*s*t*x+t^2-s^2. My question is about FFT. Is it quicker to do 2 forward FFTs, 3 convolutions, and 3 reverse FFTs than it is to do 4 forward FFTs, 2 convolutions and 2 reverse FFTs? Or, to put it another way, is it better to compute the difference of squares in this case?
Why not:
t^2-s^2=(t-s)*(t+s)
and also note that (t-s)+(t+s)=2*t, but you'd loss in precision if you calculate one transform with this.   2019-05-08, 20:01   #52
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

27268 Posts Quote:
 Originally Posted by R. Gerbicz Why not: t^2-s^2=(t-s)*(t+s) and also note that (t-s)+(t+s)=2*t, but you'd loss in precision if you calculate one transform with this.
And ofcourse (t+s)-(t-s)=2*s !!!
But you could do this also using the transforms of s,t to get t+s,t-s.   2019-05-08, 20:21   #53
paulunderwood

Sep 2002
Database er0rr

3·1,291 Posts Quote:
 Originally Posted by R. Gerbicz And ofcourse (t+s)-(t-s)=2*s !!! But you could do this also using the transforms of s,t to get t+s,t-s.
Answering my own question: It is quicker to go down the (t-s)*(t+s) route I conclude after running my own tests. It seems that forward and inverse FFTs need less computation than point-wise multiplication.

Last fiddled with by paulunderwood on 2019-05-08 at 20:50   2020-07-31, 11:03   #54
paulunderwood

Sep 2002
Database er0rr

3·1,291 Posts Here is a gwnum program that tests Euler+Frobenius as per https://mersenneforum.org/showpost.p...6&postcount=74. Attached Files quadratic.c (4.7 KB, 328 views) power_quadratic.c (5.4 KB, 306 views)

Last fiddled with by paulunderwood on 2020-08-04 at 20:10   2021-02-10, 12:46   #55
paulunderwood

Sep 2002
Database er0rr

3·1,291 Posts xp1_euler_frobenius

Here is my GWNUM code for the test given in this thread:
• a minimal
• Jacobi(a,n)==-1
• a^((n-1)/2)==-1 (mod n)
• (x+1)^n==-x+1 (mod n, x^2-a)

I will review the Frobenius section and would welcome any improvements to it.

Rather than calculating 2*s*t and (a*s+t)*(s+t) - (a+1)*s*t, by pre-calculating forward FFTs the latter has fewer additions and mults by small numbers if it is calculated as a*s^2+t^2.. Done! 15% faster.

One thing that will be done is to use gwswap instead of gwcopy in the Frobenius "if" clause. Done!

A test shows this program to be a good 1+2.5 selfridges Branched Euler PRP -- if 2 then we can use gwstartnextfft (illegally)

----------------

I have also uploaded "quickest.c" that does the following (with code using forward FFTs)
Code:
condition       base      quadratic        ~selfridges
n == 3 mod 4    (x+2)     [x^2+1]          2.2
n == 5 mod 8    (x+1)     [x^2+2]          2.8
n == 5 mod 6    (x+2)     [x^2-x+1]        2.2
default         (x+1)     [x^2-a*x+1]      2.2
It is easy to comment out the second test.
(Fixed compile instructions corrupted by global remove of "quick" )

EDIT: I removed illegal instructions from both programs . Sorry for the inconvenience.
Attached Files xp1_euler_frobenius.c (4.5 KB, 93 views) quickest.c (6.9 KB, 94 views)

Last fiddled with by paulunderwood on 2021-03-06 at 14:35   Thread Tools Show Printable Version Email this Page Similar Threads Thread Thread Starter Forum Replies Last Post Jean Penné Software 25 2010-11-01 15:18 Unregistered Information & Answers 3 2010-09-12 19:52 leizhoucn Programming 2 2007-11-05 09:34 tmorrow GMP-ECM 5 2007-04-04 00:39 Cyclamen Persicum Software 1 2007-01-02 20:53

All times are UTC. The time now is 05:37.

Mon Oct 25 05:37:26 UTC 2021 up 94 days, 6 mins, 0 users, load averages: 0.90, 0.93, 0.99