mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Math

Reply
 
Thread Tools
Old 2020-07-28, 17:32   #12
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

2·3·233 Posts
Default

Quote:
Originally Posted by preda View Post
What if, additionally, I bound the magnitude of X, let's say by log2(X) <= 10 * log2(N#) -- does this restriction change anything?
You have a really good chance to halve the number of ones from the expected L/2 to L/4, assuming log2(n)=L:
binomial(10*L,L/4)>39^(L/4)=n^(log(39)/log(2)/4)=n^1.321 >> n
R. Gerbicz is offline   Reply With Quote
Old 2020-07-29, 03:12   #13
Citrix
 
Citrix's Avatar
 
Jun 2003

62516 Posts
Default

Quote:
Originally Posted by preda View Post
Hi, I have this little question:

given a bound N, taking the primorial N# (the product of all the primes up to N),

we count the number of 1-bits in the binary representation of N#: pops(N#)

I would expect the pops to be around 1/2 * log2(N#), coming from a roughly equal chances of bits being 0/1 in the binary representation of N#.

I would like to find a low-pops multiple of that primorial.

i.e.: find X such that pops(N# * X) is [a lot] smaller then pops(N#). Let's define the gain in bits: gain(X)=pops(N#) - pops(N# * X)

How much "gain" can I expect, and how to find a X multiplier that produces a good gain?

What if, additionally, I bound the magnitude of X, let's say by log2(X) <= 10 * log2(N#) -- does this restriction change anything?

My intuition says that this is a "hopeless" problem, meaning that it's extremely hard to find an X that produces a gain of some significant ratio of log2(N#), e.g. a gain of 1/4*log2(N#).

Is there some mathematical argument in this area?
thanks!

PS: maybe the problem can be generalized to any number M (not only a primorial), becoming: given a number M, find a low-pops multiple of it.
I am not sure if I understand your question correctly.
From what I understand you are looking for X*N# where the number of '1' bits are very low.

Another way of phrasing this problem would be:
Find 2^x such that 2^x==y (mod N#) and y is small and negative
This becomes a discrete log problem
With N# consisting of all the small primes this can be further be divided for each prime under N. This can be computed extremely fast.
For a random number M similar method can be used assuming p-1 of each factor is smooth to some extent.

If you want to keep x small then we can try solving 2^x+2^z==y (mod N#) and so on...

You can adjust your search space based on your needs.

Edit- This is a way of finding the solution but the gains would not be significant for data compression as you are converting the complexity of the input N# into output complexity of x,z etc.

Last fiddled with by Citrix on 2020-07-29 at 03:23
Citrix is offline   Reply With Quote
Old 2020-07-29, 08:11   #14
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

2·607 Posts
Default

Quote:
Originally Posted by Citrix View Post
I am not sure if I understand your question correctly.
From what I understand you are looking for X*N# where the number of '1' bits are very low.

Another way of phrasing this problem would be:
Find 2^x such that 2^x==y (mod N#) and y is small and negative
This becomes a discrete log problem
With N# consisting of all the small primes this can be further be divided for each prime under N. This can be computed extremely fast.
For a random number M similar method can be used assuming p-1 of each factor is smooth to some extent.

If you want to keep x small then we can try solving 2^x+2^z==y (mod N#) and so on...

You can adjust your search space based on your needs.

Edit- This is a way of finding the solution but the gains would not be significant for data compression as you are converting the complexity of the input N# into output complexity of x,z etc.
Thanks!
How would solving the discrete log look like for:
3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 == 111546435
or for:
3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 27 == 3011753745
?

Let's say we want a multiple of the above of at most 300bits.

Last fiddled with by preda on 2020-07-29 at 08:13
preda is online now   Reply With Quote
Old 2020-07-29, 18:42   #15
JeppeSN
 
JeppeSN's Avatar
 
"Jeppe"
Jan 2016
Denmark

5×31 Posts
Question

Quote:
Originally Posted by preda View Post
Thanks!
How would solving the discrete log look like for:
3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 == 111546435
or for:
3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 27 == 3011753745
?

Let's say we want a multiple of the above of at most 300bits.
I try the first one. With PARI/GP, I try znlog(j, Mod(2, 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23)) for all j = -1, -2, -3, ... The first time it gives anything, is for j = -69613. Then my multiple is 2^2929+69613. It has hammingweight (the PARI name for "pops") 12.

Explicitly, that gives:

Code:
3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 4670911135951830670908061342184620067675149710598085670529534071346666252078598007938153523409455974649558521668598946844822022384564386566865625495001516198504091914585194088996678565204189348577008651064806108190563645021005204603353866364622193433252633722340556755457655750039477777274990095817430693664225941478819333671787921122513794582665601788086785126662337904429323871544633703900505738061594109307716230092050162038126416108850876405244067015457154781594023741537224801608620265570120822777232090945257144958639048695692047122601957815224067108345517730425197829379478203006053099508383633795947639082623658496688308850945331899791490765912750669828993402821730858997864528028008874066832321235869648230683555567133525501757545124333133107788648153431123654740287301004622981403404043970380640605440916007872963437737182088045547772461110047503451238999919539215
has pops 12.

But it is not an improvement, for hammingweight(3 * 5 * 7 * 11 * 13 * 17 * 19 * 23) is only 10. And surely my example exceed 300 bits by far!

I could be doing everything wrong.

/JeppeSN

Last fiddled with by JeppeSN on 2020-07-29 at 18:45 Reason: had to put long factor in CODE tags
JeppeSN is offline   Reply With Quote
Old 2020-07-29, 19:06   #16
JeppeSN
 
JeppeSN's Avatar
 
"Jeppe"
Jan 2016
Denmark

5·31 Posts
Default

An improvement (i.e. lower hammingweight) I can get with that method, is the multiple 2^2513+1098793. It has pops 8. It is a multiple of 23# / 2. /JeppeSN
JeppeSN is offline   Reply With Quote
Old 2020-07-30, 02:02   #17
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

2·607 Posts
Default

A simple information theoretic argument, based on entropy, suggests that (almost all) numbers of N bits have a multiple of 2xN bits with a N/10 binary hamming weight (HW).

(below, log() is in base 2).

It goes like this: if the probability of a bit of being 1 is "p" (and of being 0 is q=1-p), then the information of such a bit is:
p*log(1/p) +q*log(1/q). (Thus the information content of a N-bits number with p=1/2 is N bits). The information content of a number with "loaded" bits (i.e. p != 1/2) is lower per-bit, thus the total number of bits must be increased to compensate. It turns out that when p=1/10, the number of bits needs to be roughly doubled (because 1/10*log(10)+9/10*log(10/9)==0.469 ~= 0.5). So overall, a 5x reduction of the HW is achieved through a doubling of the bit-length.

This argument holds for arbitrary numbers N-bits in length. OTOH primorials are not arbitrary numbers at all. In fact, the information content of a primorial of N-bits in length is not N bits but log(N) bits, thus much much lower. (this can be seen like this: if I want to transmit you a primorial Q#, it's enough to transmit just the value Q, which is O(log(Q#)) in size.

Because of this "low information content" of primorials, it is not excluded that they might have much-lower HW multiples then expected from the above information-theoretic argument. (but not implied either).

Last fiddled with by preda on 2020-07-30 at 02:21
preda is online now   Reply With Quote
Old 2020-07-30, 03:22   #18
Citrix
 
Citrix's Avatar
 
Jun 2003

112×13 Posts
Default

Quote:
Originally Posted by preda View Post
Thanks!
How would solving the discrete log look like for:
3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 == 111546435
or for:
3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 27 == 3011753745
?

Let's say we want a multiple of the above of at most 300bits.
Discrete log:-
You need to take LCM of p-1 of all the prime factors p1,p2, in p#
LCM=3960
Then you take 2^1...2^3960 (mod N#) find the number with least number of '1' and that is your solution.

For limiting it under 300 bits:-

You calculate 2^1 (mod N#), 2^2 (mod N#)...., 2^300 (mod N#)
Then for each value of y

You have a set of 301 values

Then you want to find a subset from the 301 values such that sum of these numbers and y=0 (mod N#)

This is called the modular subset sum problem. Search on google ... there are some fast algorithms for this. You would need to modify these algorithms so the best solution is found first.

If your number is extremely smooth you could possibly make your algorithm faster by solving for each factor separately and putting it together.
Citrix is offline   Reply With Quote
Old 2020-07-30, 08:35   #19
JeppeSN
 
JeppeSN's Avatar
 
"Jeppe"
Jan 2016
Denmark

2338 Posts
Default

Quote:
Originally Posted by Citrix View Post
Discrete log:-
You need to take LCM of p-1 of all the prime factors p1,p2, in p#
LCM=3960
Then you take 2^1...2^3960 (mod N#) find the number with least number of '1' and that is your solution.
For the LCM, with PARI, using lcm(apply(x->x-1,primes([3,23]))), you get 7920.

When you try to find the member from 2^1, 2^2, 2^3, ..., 2^7920 (mod 23#) with the least hammingweight, then nothing beats the first few terms in that list; they clearly have only one '1' bit?

/JeppeSN
JeppeSN is offline   Reply With Quote
Old 2020-07-30, 12:01   #20
Citrix
 
Citrix's Avatar
 
Jun 2003

30458 Posts
Default

Quote:
Originally Posted by JeppeSN View Post
For the LCM, with PARI, using lcm(apply(x->x-1,primes([3,23]))), you get 7920.

When you try to find the member from 2^1, 2^2, 2^3, ..., 2^7920 (mod 23#) with the least hammingweight, then nothing beats the first few terms in that list; they clearly have only one '1' bit?

/JeppeSN
1. correction:Replace p-1 with smallest n such that 2^n==1 (mod p). Then calculate the LCM

2. 2^1, 2^2, 2^3, ..., 2^7920 (mod 23#) -- only consider terms greater than 23# otherwise the number is 23# itself.
Let m==2^n (mod 23#)
You need to find the number of '1' bits in 2^n+23#-m
Citrix is offline   Reply With Quote
Old 2020-08-03, 08:42   #21
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

2·607 Posts
Default The Why

I'm going to give away the backstory for my question about low-hamming-weight multiples.

"It's all for P-1"

You may know, GpuOwl can do both PRP and P-1. PRP has the advantage of being protected with the famous "G" error check (AKA "GEC"). P-1 (as implemented) OTOH is "naked", with no error check.

But there's a better way!
P-1 (first stage) can be run together with PRP, for better efficiency and a partial error check.

P-1 first stage works like this:
1. compute the exponent E=PowerSmooth(B1),
2. compute X=3^PowerSmooth(B1) (mod M)
3. do GCD(X-1, M)

(where PowerSmooth(B) computes the product of all primes raised to a maximal power such that they're under B, i.e. product(over p prime with p < B, of p^floor(log(B)/log(p)))).
in pari-gp:
Code:
powersmooth(b)= pr=1;forprime(p=2,b,pr*=p^floor(log(b)/log(p)));pr
Concerning the step 2,
Code:
X=3^PowerSmooth(B1)
This modular exponentiation can be done in two ways, funnily named "left-to-right" and "right-to-left" binary exponentiation.. (BTW I always mix them up)
L-to-R: MSB to LSB
R-to-L: LSB to MSB

The fastest is L-to-R binary exponentiation, as it requires only squaring and multiplication-by-3, which (being mul by a small constant) is free. So the cost for 3^PS(B1) is N squarings, where N is the number of bits of PS(B1).

The R-to-L binary exponentiation requires N squarings plus HammingWeight(PS(B1)) multiplications, so seems like a worse choice.

BUT the squarings needed in the R-to-L exponentiation are exactly the same squarings that take place in the initial part of the PRP test! If we assume one does P-1 followed by PRP, the squarings are "free" in the P-1 (i.e. counted towards the PRP), and in addition the squarings are protected by the GEC.

So the cost of the R-to-L exponentiation is HW(PS(B1)), i.e. the number of pops of the PowerSmooth(B1), which is about 1/2 of the bitlen of PS(B1), and this is significantly cheaper than the cost of L-to-R exponentiation. The drawback is that the P-1 must be followed by PRP in order to reap the cost benefit.


At this point it should be clear why the quest for a low-hamming-weight multiple. Instead of PS(B1) we can use just as well any multiple of it, without even needing to know which multiple it is. A reduction in the HW results in a proportional reduction in the cost of P-1 (first stage).

Unfortunately I could find no practical way at all for reducing the HW of the PS(B1). Let's enumerate the approaches tried:

1. low-HW multiple of a small number.
(let's say "small number" being up to 32bits or maybe even 64bits, and low-HW being <7 bits)

The key here is to think of the contribution of each bit that is set modulo the "base", and target a combination of bits that sum to 0 (mod "base"), i.e. represent a multiple of the base. And the problem has similarities to subset-sum.

This can be solved using either dynamic programming, or a "brute search" that takes advantage of the extremely-low HW that is targeted (HW < 7).

(not all the small numbers have a very-low HW multiple)

The problem with this approach is that it works for small numbers only. The solution can't be extended to the huge numbers needed for PS(B1): While PS(B1) can easily be split into a product of (many) 32-bit factors, the lowHW solutions for those factors can't be assembled back into a lowHW solution to PS(B1).


(below, PS is without the powers of "2"):

2. For B1=1M, bitlen(PS(1M))==1442080. So we have a number of about 1.4M bits. The starting HW is: HW(PS(1M))==720737.

I tried various approaches to search among the multiples of PS(1M), and the lowest HW I found was around 718K, let's say a HW reduction of about 0.4%! Impressivelly unremarkable. Feel free to try yourself, doing such experiments is not too difficult in pari-gp.

So for some reason, the search for low-HW multiples of PS(1M) is much harder than it might seem.

At this point I think an approch that might work a bit better would be to look at the base number as a pattern of bits, that can be shifted and added back (this is what multiplication does). This shift-and-add can both create and destroy 1-bits. Maybe an algorithm could be constructed there.

Anyway in the meantime I consider the problem "impossible" myself. But if somebody comes up with a multiple of PS(1M) that is.. let's say 1% improvement (HW<714K), I'd be impressed!

Code:
ps(b)=pr=1;forprime(p=3,b,pr*=p^floor(log(b)/log(p)));pr
hw(x)=hammingweight(x)
base=ps(1000000)
hw(base)

Last fiddled with by preda on 2020-08-03 at 08:46
preda is online now   Reply With Quote
Old 2020-08-05, 03:16   #22
Citrix
 
Citrix's Avatar
 
Jun 2003

112×13 Posts
Default

Your idea seems interesting!

If we could find a N with low hammer weight then we can use it for testing all p-1 candidates again and again so this would be useful. We could use dynamic programming to solve this once as it would be used again and again.

Some questions/comments/suggestions
1) Why do we have to use all the primes in B1. A smaller subset set of primes in B1 might have an lower hammer weight N. We then take this smaller subset and the remaining B1 primes (or move them to the B2 stage). Have you tried this?

2) We do not care about the size of the lower hammer weight N just that it must be less than 50% the size of the exponent of the candidate being tested (to make this useful). Correct?

3) To assess the feasibility of this : For a small prime p taking all the multiple of p and that are odd between 1 and 2^(p-1)-1.
We then count the frequency of various hammer weight. Is the distribution the same for various values of p. What about composites. Is this random - I suspect not? binomial distribution.

We then can predict/calculate how deep we would need to search to find a low hammer weight solution for B1=1M and at what exponent level this method will become practical.

Let me know what you find. I do not have PARI to test this myself.

Last fiddled with by Citrix on 2020-08-05 at 03:18 Reason: typo, grammar
Citrix is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Primorial offsets robert44444uk Prime Gap Searches 7 2018-11-29 08:40
Primorial calculation FreakyPotato Programming 7 2015-02-06 10:33
Primorial puzzle Citrix Puzzles 3 2006-03-07 15:07
Primorial question Dougy Math 2 2005-07-28 13:13
Multiple systems/multiple CPUs. Best configuration? BillW Software 1 2003-01-21 20:11

All times are UTC. The time now is 23:14.

Mon Sep 28 23:14:07 UTC 2020 up 18 days, 20:25, 0 users, load averages: 1.45, 1.59, 1.63

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.