 mersenneforum.org (https://www.mersenneforum.org/index.php)
-   sweety439 (https://www.mersenneforum.org/forumdisplay.php?f=137)
-   -   Minimal set of the strings for primes with at least two digits (https://www.mersenneforum.org/showthread.php?t=24972)

 sweety439 2019-11-25 18:28

Minimal set of the strings for primes with at least two digits

[URL="https://primes.utm.edu/glossary/page.php?sort=MinimalPrime"]https://primes.utm.edu/glossary/page.php?sort=MinimalPrime[/URL]

In 1996, Jeffrey Shallit [Shallit96] suggested that we view prime numbers as strings of digits. He then used concepts from formal language theory to define an interesting set of primes called the minimal primes:

A string a is a subsequence of another string b, if a can be obtained from b by deleting zero or more of the characters in b. For example, 514 is a substring of 251664. The empty string is a subsequence of every string.
Two strings a and b are comparable if either a is a substring of b, or b is a substring of a.
A surprising result from formal language theory is that every set of pairwise incomparable strings is finite [Lothaire83]. This means that from any set of strings we can find its minimal elements.
A string a in a set of strings S is minimal if whenever b (an element of S) is a substring of a, we have b = a.
This set must be finite!

For example, if our set is the set of prime numbers (written in radix 10), then we get the set {2, 3, 5, 7, 11, 19, 41, 61, 89, 409, 449, 499, 881, 991, 6469, 6949, 9001, 9049, 9649, 9949, 60649, 666649, 946669, 60000049, 66000049, 66600049}, and if our set is the set of composite numbers (written in radix 10), then we get the set {4, 6, 8, 9, 10, 12, 15, 20, 21, 22, 25, 27, 30, 32, 33, 35, 50, 51, 52, 55, 57, 70, 72, 75, 77, 111, 117, 171, 371, 711, 713, 731}

Besides, if our set is the set of prime numbers written in radix b, then we get these sets:

[CODE]
b, we get the set
2: {10, 11}
3: {2, 10, 111}
4: {2, 3, 11}
5: {2, 3, 10, 111, 401, 414, 14444, 44441}
6: {2, 3, 5, 11, 4401, 4441, 40041}
[/CODE]

these are already researched in [URL="https://cs.uwaterloo.ca/~cbright/reports/mepn.pdf"]https://cs.uwaterloo.ca/~cbright/reports/mepn.pdf[/URL].

Now, let's consider: if our set is [B]the set of prime numbers >= b[/B] written in radix b (i.e. the prime numbers with at least two digits in radix b), then we get the sets:

[CODE]
b, we get the set
2: {10, 11}
3: {10, 12, 21, 111}
4: {11, 13, 23, 31, 221}
5: {10, 12, 21, 23, 32, 34, 43, 111, 131, 133, 313, 401, 414, 14444, 30301, 33001, 33331, 44441, 300031}
6: {11, 15, 21, 25, 31, 35, 45, 51, 4401, 4441, 40041}
7: {10, 14, 16, 23, 25, 32, 41, 43, 52, 56, 61, 65, 113, 115, 131, 133, 155, 212, 221, 304, 313, 335, 344, 346, 364, 445, 515, 533, 535, 544, 551, 553, 1112, 1211, 1222, 2111, 3031, 3055, 3334, 3503, 3505, 3545, 4504, 4555, 5011, 5455, 5545, 5554, 6034, 6634, 11111, 30011, 31111, 33001, 33311, 35555, 40054, 300053, 33333301}
8: {13, 15, 21, 23, 27, 35, 37, 45, 51, 53, 57, 65, 73, 75, 107, 111, 117, 141, 147, 161, 177, 225, 255, 301, 343, 361, 401, 407, 417, 431, 433, 463, 467, 471, 631, 643, 661, 667, 701, 711, 717, 747, 767, 3331, 3411, 4043, 4443, 4611, 5205, 6007, 6101, 6441, 6477, 6707, 6777, 7461, 7641, 47777, 60171, 60411, 60741, 444641, 500025, 505525, 3344441, 4444477, 5500525, 5550525, 55555025, 444444441, 744444441}
[/CODE]

However, I do not think that my base 7 and 8 sets are complete (I use PARI program to find these primes (all written in base b), but I only searched the primes with <= 8 digits, so there may be missing primes), I proved that my base 2, 3, 4, 5 and 6 sets are complete.

Can someone complete my base 7 and 8 set? Also find the sets of bases 9 to 36.

 LaurV 2019-11-26 05:46

Your sets for 2 and 3 are complete, but the others are not.

For example, all Fermat numbers that may be prime are of the form "1000.....1" in base 4, and additionally, you will have and endless number of combinations of patterns of 0 and 1 that may be prime, and yet not appear in your list, so your base 4 may look like:
[CODE]List(["2", "3", "11", "101", "10001", "100000001", "100001001001001", "100100100001001"])[/CODE]
5 seems to be "endless", too...
[code]
List(["2", "3", "10", "111", "401", "414", "1404", "4041", "14004", "14404", "14411", "14444", "40041", "40441", "41141", "44001
", "44441", "144404", "404001", "404441", "1440044", "1444114", "4000001", "4000144", "4001411", "4114001", "11440001", "1400000
4", "14000114", "14000404", "14000411", "40400001", "40400014", "40400041", "44000441", "44400014", "44440001", "140000404", "14
0000411", "140000444", "140011411", "140014001", "141144004", "144000044", "144000114", "144000444", "400040014", "400114001", "
400140004", "400144004", "404000041", "404004001", "404040001", "404400001", "404400041", "440040001", "1400040004", "1400044114
", "1411400011", "1440000404", "1440004114", "1444000114", "1444004404", "4000000041", "4000400041", "4000400441", "4004400041",
"4040000041", "4040440001", "4044000041", "4044004001", "4400000001", "4400011441", "4404000441", "4444400041", "11400000001",
"11400000041", "11400114441", "11440000144", "11444000014", "14000011444", "14000014441", "14000040011", "14000040044", "1400004
4004", "14000044114", "14000140001", "14001440001", "14114000011", "14114000444", "14400000004", "14400000011", "14400001441", "
14400004114", "14400040044", "14400400141", "14400400444", "14400404044", "14440000004", "14440004044", "14440040004", "14440040
444", "40000000041", "40000001411", "40000004441", "40000040014", "40000141144", "40000400441", "40001400004", "40004001141", "4
0004004441", "40011400001", "40014000044", "40014001144", "40014440044", "40040000001", "40040040001", "40044004001", "400440044
41", "40044400001", "40044411441", "40400004441", "40440400441", "40444000041", "40444004441", "41144400001", "44000400014", "44
004440001", "44040004441", "44044040001", "44114000441", "44114400001", "44400001444", "44400004001", "44400004441", "4444000000
1", "44440000144", "44444000041"])
[/code]etc...

 uau 2019-11-26 06:05

[QUOTE=LaurV;531466]Your sets for 2 and 3 are complete, but the others are not.

For example, all Fermat numbers that may be prime are of the form "1000.....1" in base 4, and additionally, you will have and endless number of combinations of patterns of 0 and 1 that may be prime, and yet not appear in your list, so your base 4 may look like:
[CODE]List(["2", "3", "11", "101", "10001", "100000001", "100001001001001", "100100100001001"])[/CODE][/QUOTE]
"11" is already on the list. That excludes all your additions (they contain two 1s).

[QUOTE]
5 seems to be "endless", too...
[code]
List(["2", "3", "10", "111", "401", "414", "1404", "4041"])
[/code]etc...[/QUOTE]Invalid, as 401 is comparable to 4041 (40(4)1) and so on.

 LaurV 2019-11-26 06:10

Ok, scratch the former post, I am stupid.
It looked odd, and I could not believe those paper were right, then I went and read them (skimmed) and realized the sequence does not need to be consecutive. In fact, you stated as much as that, and your example with 251664 shows very clearly what you mean. Sorry for being stupid :redface:
(in fact, my excuse is that you post so much rubbish here that I don't read your post more than fast skimming mode :razz:)
Give me 5 minutes... (look how I am spending my lunch break!)

(edit @uau: crosspost, I was talking to sweety)

 LaurV 2019-11-26 06:46

Ok, here is the code, and I searched much higher, there are no new numbers.
[CODE]

\\finds text substrings in text strings, or small-sub-vecteors in large small-vectors :P
\\(kind of efficient, but it still could be optimized...)
find(subtxt,txt,startfrom=1)=
{
my(i,j,s=Vecsmall(subtxt), t=Vecsmall(txt));
i=startfrom-1;
while(i<=#t-#s,
j=1;
while(j<=#s,
if(s[j]!=t[i+j],
break
);
j++
);
if(j>#s,
return(i+1)
);
i++
);
return(0);
}

\\similar, but the strings do not need to be consecutive
findweak(subtxt,txt,startfrom=1)=
{
my(i,j,s=Vecsmall(subtxt), t=Vecsmall(txt));
i=startfrom;
j=1;
while(i<=#t && j<=#s,
if(s[j]==t[i],
j++
);
i++
);
if(j>#s,
return(1),
return(0)
);
}

\\base from 2 to 36, no validation!
bprint(n,base=10)=
{
if(base==10,
return(Str(n))
);
my(v=digits(n,base));
for(i=1,#v,
if(v[i]<10,
v[i]+=48,
v[i]+=55
)
);
return(Strchr(v));
}

minprime(startfrom=2, base=10)=
{
my(n,k,x,cnt);
lstminprimes=List([]);
n=nextprime(startfrom);
listput(lstminprimes,bprint(n,base));
print(lstminprimes);
while(1, \\stop it with ctrl+c when fed-up with it, etc
n=nextprime(n+1);
x=bprint(n,base);
k=1;
while(k<=#lstminprimes,
\\ if(find(lstminprimes[k],x),
if(findweak(lstminprimes[k],x),
break
);
k++
);
if(k>#lstminprimes,
listput(lstminprimes,x);
print(lstminprimes)
);
cnt++;
if(cnt%10000==0,
printf("...%d...%c",n,13)
)
);
}
[/CODE]Interesting that for 7 (when the restriction of n>=base is not required) it seems "easy" to prove, that the set is maximal, because combinations of only 0, 4, and 6 can not be prime (in any base, not only base 7). I still don't get it why they say it is hard to prove... Maybe there is a case I am missing...

 LaurV 2019-11-26 11:21

I let that script running in background for 8, during I was doing some real job-related work, and it still produced 77774444441.
Meantime I realized that it is terrible inefficient, and it can be done much cleverer. So. factory stopped for now, I may revisit it in the future (probably weekend).
I also have an idea how 7 and 8 (starting from 2 digits) can be "proved" formally.

I go home now, 18:20 here.

 LaurV 2019-11-28 11:29

I found an easy way to generate those sets, and to prove that they are complete.

For the "starting from two digits" version, neither one of the exposed sets for 7 and 8 are complete. Some larger primes are still lurking in the dark there. I have the complete sets for both 8, and 7 for the both cases when the base itself is included in the set or not*, but I don't want to spoil the puzzle, this is an interesting little problem... hehe...

Hint:
[CODE]
gp > a=(7^17-5)/2
%1 = 116315256993601
gp > isprime(a)
%2 = 1
gp > digits(a,7)
%3 = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1]
gp >
[/CODE]---------
*when the base is prime, like for 5 and 7, the sets are different; including the base results in automatic elimination of all possible extension numbers with "0 after 1" from the set, which is quite restrictive, so I also calculated the lists for the "base is not included" version, i.e. base-5 starting from 6, and base-7 starting from 8; in this case, for example, base-5 will include numbers like 104 and 10103 which are prime, and base-7 list will include 1022, 1051, 1202, .... 1100021 ... etc, they are "enriched" compared with the case when the first "10" is included. So I have the complete list for 8, and the complete two lists for 7, the normal one, and the "enriched" one. Base-5 is easy, in any case.

 sweety439 2019-11-28 22:16

[QUOTE=LaurV;531632]I found an easy way to generate those sets, and to prove that they are complete.

For the "starting from two digits" version, neither one of the exposed sets for 7 and 8 are complete. Some larger primes are still lurking in the dark there. I have the complete sets for both 8, and 7 for the both cases when the base itself is included in the set or not*, but I don't want to spoil the puzzle, this is an interesting little problem... hehe...

Hint:
[CODE]
gp > a=(7^17-5)/2
%1 = 116315256993601
gp > isprime(a)
%2 = 1
gp > digits(a,7)
%3 = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1]
gp >
[/CODE]---------
*when the base is prime, like for 5 and 7, the sets are different; including the base results in automatic elimination of all possible extension numbers with "0 after 1" from the set, which is quite restrictive, so I also calculated the lists for the "base is not included" version, i.e. base-5 starting from 6, and base-7 starting from 8; in this case, for example, base-5 will include numbers like 104 and 10103 which are prime, and base-7 list will include 1022, 1051, 1202, .... 1100021 ... etc, they are "enriched" compared with the case when the first "10" is included. So I have the complete list for 8, and the complete two lists for 7, the normal one, and the "enriched" one. Base-5 is easy, in any case.[/QUOTE]

So, what is the complete list of 7? Is it just my list {10, 14, 16, 23, 25, 32, 41, 43, 52, 56, 61, 65, 113, 115, 131, 133, 155, 212, 221, 304, 313, 335, 344, 346, 364, 445, 515, 533, 535, 544, 551, 553, 1112, 1211, 1222, 2111, 3031, 3055, 3334, 3503, 3505, 3545, 4504, 4555, 5011, 5455, 5545, 5554, 6034, 6634, 11111, 30011, 31111, 33001, 33311, 35555, 40054, 300053, 33333301} plus the prime 33333333333333331? And what is the complete list of 8? You said that you have it.

 sweety439 2019-11-28 22:40

[QUOTE=LaurV;531470]Ok, here is the code, and I searched much higher, there are no new numbers.
[CODE]

\\finds text substrings in text strings, or small-sub-vecteors in large small-vectors :P
\\(kind of efficient, but it still could be optimized...)
find(subtxt,txt,startfrom=1)=
{
my(i,j,s=Vecsmall(subtxt), t=Vecsmall(txt));
i=startfrom-1;
while(i<=#t-#s,
j=1;
while(j<=#s,
if(s[j]!=t[i+j],
break
);
j++
);
if(j>#s,
return(i+1)
);
i++
);
return(0);
}

\\similar, but the strings do not need to be consecutive
findweak(subtxt,txt,startfrom=1)=
{
my(i,j,s=Vecsmall(subtxt), t=Vecsmall(txt));
i=startfrom;
j=1;
while(i<=#t && j<=#s,
if(s[j]==t[i],
j++
);
i++
);
if(j>#s,
return(1),
return(0)
);
}

\\base from 2 to 36, no validation!
bprint(n,base=10)=
{
if(base==10,
return(Str(n))
);
my(v=digits(n,base));
for(i=1,#v,
if(v[i]<10,
v[i]+=48,
v[i]+=55
)
);
return(Strchr(v));
}

minprime(startfrom=2, base=10)=
{
my(n,k,x,cnt);
lstminprimes=List([]);
n=nextprime(startfrom);
listput(lstminprimes,bprint(n,base));
print(lstminprimes);
while(1, \\stop it with ctrl+c when fed-up with it, etc
n=nextprime(n+1);
x=bprint(n,base);
k=1;
while(k<=#lstminprimes,
\\ if(find(lstminprimes[k],x),
if(findweak(lstminprimes[k],x),
break
);
k++
);
if(k>#lstminprimes,
listput(lstminprimes,x);
print(lstminprimes)
);
cnt++;
if(cnt%10000==0,
printf("...%d...%c",n,13)
)
);
}
[/CODE]Interesting that for 7 (when the restriction of n>=base is not required) it seems "easy" to prove, that the set is maximal, because combinations of only 0, 4, and 6 can not be prime (in any base, not only base 7). I still don't get it why they say it is hard to prove... Maybe there is a case I am missing...[/QUOTE]

Well, I already have the code:

[CODE]
a(n,k,b)=v=[];for(r=1,length(digits(n,b)),if(r+length(digits(k,2))-length(digits(n,b))>0 && digits(k,2)[r+length(digits(k,2))-length(digits(n,b))]==1,v=concat(v,digits(n,b)[r])));fromdigits(v,b)

g(n)=if(n<10,n+48,if(n<36,n+55,if(n<62,n+61,if(n<77,n-29,if(n<84,n-19,if(n<90,n+7,if(n<94,n+33,n+67)))))))

f(n,b)=for(k=1,length(digits(n,b)),print1(Strchr(g(digits(n,b)[k]))))

is(n,b)=for(k=1,2^length(digits(n,b))-2,if(ispseudoprime(a(n,k,b)) && a(n,k,b)>=b,return(0)));1

c(b)=forprime(p=b,2^32,if(is(p,b),f(p,b);print1(", ")))
[/CODE]

and for the original problem in [URL="https://primes.utm.edu/glossary/page.php?sort=MinimalPrime"]https://primes.utm.edu/glossary/page.php?sort=MinimalPrime[/URL] (single-digit prime is also included), this is the code:

[CODE]
a(n,k,b)=v=[];for(r=1,length(digits(n,b)),if(r+length(digits(k,2))-length(digits(n,b))>0 && digits(k,2)[r+length(digits(k,2))-length(digits(n,b))]==1,v=concat(v,digits(n,b)[r])));fromdigits(v,b)

g(n)=if(n<10,n+48,if(n<36,n+55,if(n<62,n+61,if(n<77,n-29,if(n<84,n-19,if(n<90,n+7,if(n<94,n+33,n+67)))))))

f(n,b)=for(k=1,length(digits(n,b)),print1(Strchr(g(digits(n,b)[k]))))

is(n,b)=for(k=1,2^length(digits(n,b))-2,if(ispseudoprime(a(n,k,b)),return(0)));1

c(b)=forprime(p=2,2^32,if(is(p,b),f(p,b);print1(", ")))
[/CODE]

Enter c(n) to print all minimal prime in base n (written in base n)

 LaurV 2019-11-29 02:17

I didn't look yet how good is your code, but my former one is lousy, so there are chances that yours is better. I mean, not the code, but my method itself was lousy, to look at all primes one by one. The authors of that paper you linked describe a method which is much better and somehow similar to what I am doing now.

Right now, I split the problem in two steps, first I let the zero apart, and solve the problem with "digits" from 1 to b-1, by starting from the end with all possible cases in a set. Starting from the end or from the beginning makes no difference, but in the case the base is even, I only have n/2 elements in the initial set (because numbers ending in 2, 4, 6, etc, can never be prime), so the search dimension is reduced in half. Then, for all elements in set, I check what digit I can add in front of them and still avoiding conflicts. If any of the resulting numbers is prime, I add it to the set. Here is where the algorithm "strikes", because I can do this in about linear time, by creating a matrix with the possible candidates, and then eliminating them from the matrix, by different criteria (like, it produces conflict, it is a prime and I add it to the list, or it is always composite regardless of how you extend it, etc), and sometimes full rows and columns can be eliminated. This gives me the complete set, excluding the numbers that contain zero, in just minutes.

The second part comes from the realization that the numbers that contain zero and have to be in the set, if we delete zeroes from them, the new created are (1) still not in the set, and (2) can not be covered with numbers in the set, and (3) are the same magnitude as the numbers in the set except maybe the first digit, that can repeat indefinitely till the first prime is found.

The (3) is very important (and it can be proved) so the second part of the algorithm is to create a list with all such numbers (like 5-6 digit numbers in our case) and see which one becomes a prime when it is "stuffed" with zeroes, which is piece of cake. Mind that the zeros have to be "between" the digits, as "leading zeros do not count"[sup](TM)[/sup] and numbers ending in zero in any base are not prime.

 sweety439 2019-11-29 06:50

[QUOTE=LaurV;531660]I didn't look yet how good is your code, but my former one is lousy, so there are chances that yours is better. I mean, not the code, but my method itself was lousy, to look at all primes one by one. The authors of that paper you linked describe a method which is much better and somehow similar to what I am doing now.

Right now, I split the problem in two steps, first I let the zero apart, and solve the problem with "digits" from 1 to b-1, by starting from the end with all possible cases in a set. Starting from the end or from the beginning makes no difference, but in the case the base is even, I only have n/2 elements in the initial set (because numbers ending in 2, 4, 6, etc, can never be prime), so the search dimension is reduced in half. Then, for all elements in set, I check what digit I can add in front of them and still avoiding conflicts. If any of the resulting numbers is prime, I add it to the set. Here is where the algorithm "strikes", because I can do this in about linear time, by creating a matrix with the possible candidates, and then eliminating them from the matrix, by different criteria (like, it produces conflict, it is a prime and I add it to the list, or it is always composite regardless of how you extend it, etc), and sometimes full rows and columns can be eliminated. This gives me the complete set, excluding the numbers that contain zero, in just minutes.

The second part comes from the realization that the numbers that contain zero and have to be in the set, if we delete zeroes from them, the new created are (1) still not in the set, and (2) can not be covered with numbers in the set, and (3) are the same magnitude as the numbers in the set except maybe the first digit, that can repeat indefinitely till the first prime is found.

The (3) is very important (and it can be proved) so the second part of the algorithm is to create a list with all such numbers (like 5-6 digit numbers in our case) and see which one becomes a prime when it is "stuffed" with zeroes, which is piece of cake. Mind that the zeros have to be "between" the digits, as "leading zeros do not count"[sup](TM)[/sup] and numbers ending in zero in any base are not prime.[/QUOTE]

In prime bases, 10 is prime.

There is C code and C++ code (given by others, not me), but I do not know how to run :sad::sad:

[URL="https://github.com/curtisbright/mepn-data/"]https://github.com/curtisbright/mepn-data/[/URL]

[URL="https://github.com/RaymondDevillers/primes"]https://github.com/RaymondDevillers/primes[/URL]

All times are UTC. The time now is 13:16.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.