mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   New Mersenne Software For Test Mersenne Prime Numbers On Android (https://www.mersenneforum.org/showthread.php?t=23963)

axn 2019-01-04 02:41

How difficult will it be to compile MLucas for Android?

I would love to see the performance of an actual state-of-the-art code running on one of these.

LaurV 2019-01-04 06:53

@OP: can you give us some info about the multiplication algorithm used? Is it school-grade, Karatsuba, or FFT-based?

As per firejuggler's suggestion, we tried to run this toy in an Android emulator. Our kingdom is Cortex-M, but we have some Android emulators lying around, since we have colleagues who develop applications with them for our customers. We wanted to see how the tests scale in time: if the application is bogus, say reading exponents from a list and reporting prime/composite while doing something else in the meantime, then the time scaling would also be odd. From the scaling we have some idea of, and could guess, the multiplication algorithm used, and whether any tuning was done or it is just blind school-grade high-level stuff. We consider this a good test because it would not be easy for the guy to fake the process (which involves exponentiations and multiplications related not only to the size of the exponents, but also to the number of 1 bits involved). Our toys can also debug/disassemble code. But unluckily, we cannot run it: our emulator does not like it, and the colleagues who could help haven't come back from their New Year holidays yet...

Our advice for now: avoid it.

P.S. Carlos, you seem to be the only one liking it? Can you detail why? Or was that a voting mistake? :razz:

pinhodecarlos 2019-01-04 07:11

Being Portuguese, I'm being supportive of our Spanish brother, but I've also been down with the flu since New Year, while on leave. Call me crazy: am I tired of fights with the Spanish, or is the flu damaging my brain?!??? Or both?!

LaurV 2019-01-04 07:22

Ok, that makes sense! (being supportive). :tu:

GP2 2019-01-04 09:29

Here is a Mersenne prime test. I didn't write it. It uses the GMP library.

It prints a 64-bit hexadecimal residue: "0" for a Mersenne prime, non-zero otherwise.

I compiled on Linux and ran on a single Skylake core on AWS.

For 86243 it takes 11 seconds; for 216091 it takes 1 minute 38.5 seconds.

And yet even this is much slower than mprime/Prime95.

[CODE]
/* Adapted from http://rosettacode.org/wiki/Lucas-Lehmer_test#GMP */

#include <stdio.h>
#include <stdlib.h>
#include <gmp.h>

void lucas_lehmer(unsigned long p, mpz_t *V)
{
    mpz_t mp, t;
    unsigned long k;

    mpz_init_set_ui(t, p);

    mpz_init(mp);
    mpz_setbit(mp, p);
    mpz_sub_ui(mp, mp, 1);

    mpz_init_set_ui(*V, 4);
    for (k = 3; k <= p; k++) {
        mpz_mul(*V, *V, *V);
        mpz_sub_ui(*V, *V, 2);
        /* mpz_mod(*V, *V, mp) but more efficiently done given mod 2^p-1 */
        if (mpz_sgn(*V) < 0) mpz_add(*V, *V, mp);
        /* while (n > mp) { n = (n >> p) + (n & mp) }  if (n==mp) n=0 */
        /* but in this case we can have at most one loop plus a carry */
        mpz_tdiv_r_2exp(t, *V, p);
        mpz_tdiv_q_2exp(*V, *V, p);
        mpz_add(*V, *V, t);
        while (mpz_cmp(*V, mp) >= 0) mpz_sub(*V, *V, mp);
    }

    mpz_clear(t); mpz_clear(mp);

    /* Residue is returned in V */
}

int main(int argc, char* argv[])
{
    mpz_t m64, V;

    /* We want to print a 64-bit residue */
    mpz_init(m64);
    mpz_setbit(m64, 64);
    mpz_sub_ui(m64, m64, 1);

    unsigned long p;
    if (argc >= 2) {
        p = strtoul(argv[1], 0, 10);
    }
    else {
        fprintf(stderr, "Provide one argument, a Mersenne exponent, for instance: 11213\n");
        return 1;
    }

    lucas_lehmer(p, &V);

    mpz_and(V, V, m64);
    mpz_out_str(stdout, 16, V);
    fprintf(stdout, "\n");

    mpz_clear(m64); mpz_clear(V);

    return 0;
}
[/CODE]
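
In case anyone wants to reproduce the timings: assuming the source above is saved as ll.c (the file and binary names are just placeholders) and the GMP development package is installed, it builds with "gcc -O2 -o ll ll.c -lgmp" and runs as "./ll 86243"; a printed residue of 0 means 2[SUP]p[/SUP]-1 is prime.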

LaurV 2019-01-04 09:48

[QUOTE=LaurV;504878]Ok, that makes sense! (being supportive). :tu:[/QUOTE]
Please don't do this to my posts! Now there are two of you, and I don't know who to blame! :rant:
(now I know why Xyzzy gave the knife to Batalov too! :davar55:)

Batalov 2019-01-04 15:53

[QUOTE=thorken;504778]M86243 In 3 minutes 7 seconds on s4 mini gt i9195
M216091 in 23min 7 sec in s4 mini gt-i9195
[/QUOTE]
[QUOTE=GP2;504895]Here is a Mersenne prime test. I didn't write it. It uses the GMP library.

It prints a 64-bit hexadecimal residue: "0" for a Mersenne prime, non-zero otherwise.

I compiled on Linux and ran on a single Skylake core on AWS.

For 86243 it takes [COLOR=green][B]11 seconds[/B][/COLOR]; for 216091 it takes [COLOR=Green][B]1 minute 38.5 seconds.[/B][/COLOR]
[/QUOTE]

Great job, Gord! :tu:

axn 2019-01-04 16:28

3 min 7 sec / 11 sec = 17x
23 min 7 sec / 1 min 38.5 sec = 14x

The OP's code has better scaling than the GMP-based one!
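
Just to spell out the arithmetic, here is a throwaway C snippet that reproduces those two ratios from the timings quoted above (converted to seconds):

[CODE]
#include <stdio.h>

int main(void)
{
    /* Timings quoted in this thread, converted to seconds. */
    double op_86243   = 3 * 60 + 7;     /* 3 min 7 s on the S4 mini */
    double gmp_86243  = 11;             /* 11 s on a Skylake core   */
    double op_216091  = 23 * 60 + 7;    /* 23 min 7 s               */
    double gmp_216091 = 60 + 38.5;      /* 1 min 38.5 s             */

    printf("p =  86243: %.1fx\n", op_86243 / gmp_86243);    /* ~17.0x */
    printf("p = 216091: %.1fx\n", op_216091 / gmp_216091);  /* ~14.1x */
    return 0;
}
[/CODE]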

Mysticial 2019-01-04 17:25

[QUOTE=axn;504938]3 min 7 sec / 11 sec = 17x
23 min 7 sec / 1 min 38.5 sec = 14x

The OP's code has better scaling than the GMP-based one![/QUOTE]

Granted, GMP isn’t a very high bar to beat. Their refusal to use floating point means they don’t use any fast algorithms until much larger sizes.

science_man_88 2019-01-04 17:34

[QUOTE=axn;504938]3 min 7 sec / 11 sec = 17x
23 min 7 sec / 1 min 38.5 sec = 14x

The OP's code has better scaling than the GMP-based one![/QUOTE]

By a factor of roughly half the ratio of the exponents, if my math is correct; but in theory it takes until exponents of about 2.9 to 3 million to surpass it.
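
That is: the GMP advantage shrank from about 17x to 14x while the exponent grew by 2.5x, i.e. by roughly half the exponent ratio, so extrapolating that rule, the advantage reaches 1x at an exponent ratio of about 2 × 17 ≈ 34 over 86243, around 2.9 million. A rough sketch of that back-of-envelope estimate (a reconstruction of the reasoning, not an exact model):

[CODE]
#include <stdio.h>

int main(void)
{
    double p1 = 86243, p2 = 216091;
    double adv1 = 17.0, adv2 = 14.0;    /* GMP's speedup at p1 and p2 */

    /* The advantage shrank by about half the exponent ratio. */
    printf("shrink factor: %.2f vs (p2/p1)/2 = %.2f\n",
           adv1 / adv2, (p2 / p1) / 2);

    /* Extrapolate: advantage reaches 1x when the exponent ratio is ~2*adv1. */
    printf("crossover near p ~ %.2g\n", 2 * adv1 * p1);   /* ~2.9 million */
    return 0;
}
[/CODE]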

kriesel 2019-01-04 19:21

Fixed overhead can give lower-order empirical timing scaling from low-p measurements and make an implementation look like it's using a better algorithm than it is. In other words, the order bends upward to a higher power of p at higher exponents, as setup overhead gets diluted by the longer runtimes of higher exponents. This can be seen in runtime scaling measurements and plots I made for cllucas, CUDALucas and gpuowl: the more setup overhead, the lower the apparent order at low p. OpenCL's compile-on-the-fly guaranteed around 2 seconds of setup overhead at each fft length.

CUDALucas scaling: [URL]https://www.mersenneforum.org/showpost.php?p=488523&postcount=2[/URL]
gpuowl V5 scaling: [URL]https://www.mersenneforum.org/showpost.php?p=502776&postcount=10[/URL]
prime95 scaling: [URL]https://www.mersenneforum.org/showpost.php?p=502778&postcount=2[/URL]
The per-iteration time scaling of gpulucas, cllucas, CUDALucas, and gpuowl v1.9 is compared at [URL]https://www.mersenneforum.org/showpost.php?p=488476&postcount=8[/URL]
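
To illustrate the effect with purely hypothetical numbers (the exponents, overhead, and constant below are made up, not measurements from any of the programs above): a true p[SUP]3[/SUP] kernel plus a fixed setup cost fits to a much lower apparent order from two low-p timings, and subtracting the overhead recovers the true order. A minimal sketch:

[CODE]
#include <stdio.h>
#include <math.h>   /* link with -lm */

/* Apparent scaling order fitted from two (p, time) points. */
static double fitted_order(double p1, double t1, double p2, double t2)
{
    return log(t2 / t1) / log(p2 / p1);
}

int main(void)
{
    double p1 = 1000, p2 = 4000;    /* hypothetical small exponents   */
    double overhead = 5.0;          /* fixed setup cost, seconds      */
    double c = 2e-9;                /* arbitrary constant, p^3 kernel */

    double t1 = overhead + c * pow(p1, 3);
    double t2 = overhead + c * pow(p2, 3);

    printf("apparent order with overhead:     %.2f\n",   /* ~2.1 */
           fitted_order(p1, t1, p2, t2));
    printf("true order with overhead removed: %.2f\n",   /* 3.00 */
           fitted_order(p1, t1 - overhead, p2, t2 - overhead));
    return 0;
}
[/CODE]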

I'd expect primality testing based on grammar-school multiplication to scale as ~p[SUP]3[/SUP] (about p iterations, each squaring a p-bit number at a cost of ~p[SUP]2[/SUP] with grammar-school multiplication). With sufficient overhead, it may appear to scale as p[SUP]2.43[/SUP] at the low end, as my ancient, inefficient, primitive-algorithm code from the late '80s and early '90s does at low p. (There, inefficient TF is part of the run time, as is the printout of every iteration's res64 and very frequent state saves for resumption, as insurance against the vagaries of MS-DOS and no UPS.) Thorken's timings indicate scaling of p[SUP]2.42[/SUP], very close to that. It could be grammar school, or something better.

