mersenneforum.org  

Math Stuff > Probability & Probabilistic Number Theory

Old 2014-08-01, 19:09   #1
CRGreathouse
 
Estimating the number of primes in a partially-factored number

I'm interested in using information about a number to estimate the likelihood of it having a certain number of prime factors.

If we have no information about the number except its size, there are two obvious approaches: the Erdős-Kac theorem plus a continuity correction, or Landau's extension of the Prime Number Theorem:
\pi_k(n)\sim\frac{n}{\log n}\frac{(\log\log n)^{k-1}}{(k-1)!}
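
For concreteness, a minimal sketch of evaluating this estimate numerically (Python; the function name is just illustrative):

Code:
from math import log, factorial

def landau_pi_k(n, k):
    # Landau-style estimate: pi_k(n) ~ n/log n * (log log n)^(k-1)/(k-1)!
    L = log(log(n))
    return n / log(n) * L**(k - 1) / factorial(k - 1)

# e.g. landau_pi_k(10**20, 3) estimates how many n <= 10^20 have exactly 3 prime factors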

Hildebrand, Tenenbaum, and various collaborators have versions of this formula which work over a larger range, though all the results are stated asymptotically, so it's hard to tell over which ranges each is best suited.

It's not obvious to me when I should use each of the various estimates. (I also wonder if there is an integral formulation of Landau's theorem, just as Li(x) is a better approximation to pi(x) than x/log x.)

Things get even more interesting if you add in partial factorization. First, it essentially takes care of the difference between counting prime factors with and without multiplicity, since beyond a fairly small point you don't see repeated prime factors of that size 'in the wild'. Second, the factorization information should let us refine the above estimates. Essentially you'd expect about log log x prime factors up to x, so once the factors up to x have been accounted for you should be able to subtract this delta from the Erdős-Kac estimate directly. (It's not so clear how to modify Landau's formula.)
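
One crude way to make that adjustment concrete (a sketch only; the form of the shift is my own guess, and the variance is left untouched) is to shift the Erdős-Kac mean down by log log B when the number is known to have no prime factor below B:

Code:
from math import log, sqrt
from statistics import NormalDist

def erdos_kac_weight(N, k, no_factor_below=None):
    # Continuity-corrected normal weight for omega(N) = k.
    # If N is known to have no prime factor below B, reduce the mean
    # log log N by log log B (heuristic; the variance is left alone).
    mu = log(log(N))
    if no_factor_below is not None and no_factor_below > 3:  # need log log B > 0
        mu -= log(log(no_factor_below))
    sigma = sqrt(log(log(N)))
    nd = NormalDist(mu, sigma)
    return nd.cdf(k + 0.5) - nd.cdf(k - 0.5)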

I ask these questions because in the range of numbers of interest (from those we can factor fairly easily to those we can check for probable primality fairly easily) the numbers are small enough that constant factors and o(1)'s matter.

And (pipe dream?) if the calculations are fairly routine we might ask Syd (or, for that matter, Alpertron) to include them in factordb (resp., his ECM page).
Old 2014-08-02, 20:53   #2
wblipp
 
How about building on the Dickman-de Bruijn function by defining a series

\rho_k(n) is the asymptotic probability that x has exactly k prime divisors greater than x^(1/n).

\rho_0(n) is the standard Dickman-de Bruijn function.

The higher k's can (I think) be defined easily in terms of an integral of (k-1) functions.

I'd hope that this heuristic gives a probability density for the number of large primes, with "large" defined by the choice of n. You could use the knowledge of prior factoring effort to choose interesting definitions of "large," and apply Bayes' theorem to modify the density to account for the prior effort.
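
For reference, the base case \rho_0 (the ordinary Dickman-de Bruijn function) is easy enough to approximate numerically; here's a crude fixed-step sketch based on its integral equation, which the \rho_k for k >= 1 would have to build on with the extra integrals described above:

Code:
import math

def dickman_rho(u, steps_per_unit=1000):
    # Approximate rho(u) by stepping the integral equation
    #   rho(u) = rho(v) - integral_v^u rho(t - 1)/t dt
    # with a fixed-step trapezoid rule; rho = 1 on [0, 1].
    if u <= 1:
        return 1.0
    h = 1.0 / steps_per_unit
    n = int(math.ceil(u / h))
    grid = [1.0] * (steps_per_unit + 1)            # rho on [0, 1]
    for i in range(steps_per_unit + 1, n + 1):
        t0, t1 = (i - 1) * h, i * h
        f0 = grid[i - 1 - steps_per_unit] / t0     # rho(t0 - 1) / t0
        f1 = grid[i - steps_per_unit] / t1         # rho(t1 - 1) / t1
        grid.append(grid[i - 1] - 0.5 * h * (f0 + f1))
    return grid[n]

# sanity check: dickman_rho(2) is about 1 - log 2 = 0.3069
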
Old 2014-08-02, 21:48   #3
R.D. Silverman
 
Quote:
Originally Posted by wblipp
How about building on the Dickman-de Bruijn function by defining a series

\rho_k(n) is the asymptotic probability that x has exactly k prime divisors greater than x^(1/n).

\rho_0(n) is the standard Dickman-de Bruijn function.

The higher k's can (I think) be defined easily in terms of an integral of (k-1) functions.

I'd hope that this heuristic gives a probability density for the number of large primes, with "large" defined by the choice of n. You could use the knowledge of prior factoring effort to choose interesting definitions of "large," and apply Bayes' theorem to modify the density to account for the prior effort.
Wrong approach. What needs to be computed is the conditional probability that a large integer N has k prime factors given that it has no factors less than (say) N^{1/a}, for given a.
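
One cheap way to get a feel for that conditional distribution at modest sizes is direct experiment (a sketch, assuming sympy is available; N, a, and the sample count are arbitrary):

Code:
import random
from sympy import factorint

def conditional_omega_counts(N=10**12, a=6, samples=2000):
    # Tabulate omega(n) for random n near N whose smallest prime factor
    # exceeds N**(1/a), i.e. conditioned on "no small factors".
    bound = N ** (1.0 / a)
    counts, kept = {}, 0
    while kept < samples:
        n = random.randrange(N, 2 * N)
        factors = factorint(n)                 # {prime: exponent}
        if min(factors) >= bound:
            k = len(factors)                   # distinct prime factors
            counts[k] = counts.get(k, 0) + 1
            kept += 1
    return counts
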
Old 2014-08-03, 04:06   #4
wblipp
 
Quote:
Originally Posted by R.D. Silverman
Wrong approach. What needs to be computed is the conditional probability that a large integer N has k prime factors given that it has no factors less than (say) N^1/a, for given a.
That's a good approach for the scenarios where you can calculate that distribution - you can start directly with the residual composite. The only way I know to calculate that distribution is to use the inclusion-exclusion and normalization approach described in your paper with Wagstaff. That works for cases where the total number of factors is a handful. But some cases of interest will have "a" of hundreds or thousands.

For these scenarios, I think you would be better off starting with the entire original number and the distributions I described, then using Bayes to account for the known factoring efforts.
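
To illustrate the mechanics of that Bayes step, here's a toy sketch (the Landau weights as prior, and a deliberately crude likelihood in which each of the k factors independently escapes the searched range with probability 1 - log log B / log log N):

Code:
from math import log, factorial

def landau_weight(N, k):
    # Unnormalized Landau-style prior weight for omega(N) = k.
    L = log(log(N))
    return L**(k - 1) / factorial(k - 1)

def posterior_omega(N, searched_to, kmax=20):
    # Toy Bayes update of the prior on omega(N), given that no prime
    # factor was found below searched_to.  The likelihood model is crude.
    p_small = log(log(searched_to)) / log(log(N))
    post = {k: landau_weight(N, k) * (1.0 - p_small)**k for k in range(1, kmax + 1)}
    total = sum(post.values())
    return {k: w / total for k, w in post.items()}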

But it doesn't really matter what I think or you think - this is a question of comparing heuristics that can be studied empirically. Perhaps we can get the OP to propose some "interesting" cases.
Old 2014-08-03, 05:42   #5
CRGreathouse
 
Quote:
Originally Posted by wblipp
How about building on the Dickman-de Bruijn function by defining a series

\rho_k(n) is the asymptotic probability that x has exactly k prime divisors greater than x^(1/n).
Hmm. I think I could do this, but it would be a real bear to compute. I wrote code to compute, or at least closely estimate, that function a few years back and it wasn't easy. It seems like this version would suffer from all the problems of the original, but worse.

But at the ranges where I've tested it, the Dickman function already does a fairly poor job; I'm not sure whether these would fare better or worse.

Quote:
Originally Posted by wblipp
You could use the knowledge of prior factoring effort to choose interesting definitions of "large," and apply Bayes' theorem to modify the density to account for the prior effort.
Sounds reasonable.
Old 2014-08-03, 06:12   #6
CRGreathouse
 
Quote:
Originally Posted by wblipp
But it doesn't really matter what I think or you think - this is a question of comparing heuristics that can be studied empirically. Perhaps we can get the OP to propose some "interesting" cases.
Sure, let's see. MM127 is an obvious candidate for a big number. For small... well, say 200^199 + 199^200 in Paul's honour.

I think the amount of work on MM127 is well-known. It doesn't look like XYYX has worked on the other one, so suppose not much work has been done on it.

Old 2014-08-04, 14:54   #7
CRGreathouse
 
Using \omega(n) (to avoid the extra variability from the small primes that \Omega brings) and searching an interval around 10^20, I find the following counts, alongside the (naive) Landau predictions and the Erdős-Kac predictions:

k (distinct prime factors)   observed   Landau   Erdős-Kac
 1                               4290      4342       29250
 2                              21379     16632       51228
 3                              44810     31849       70509
 4                              54544     40658       76319
 5                              42306     38928       64973
 6                              22179     29817       43492
 7                               8090     19032       22872
 8                               2022     10412        9438
 9                                331      4984        3051
10                                 49      2121         772
11                                  1       812         152

This supports the intuition that Landau is better for small numbers of prime factors and Erdős-Kac better for large. In this case the crossover is surprisingly large (8 prime factors) but neither estimate is particularly accurate.
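
For anyone who wants to recompute the two prediction columns, this is roughly how I would generate them (a sketch: the window of about 2*10^5 integers is inferred from the Landau column, and it will not exactly reproduce the Erdős-Kac column above, whose normalization isn't spelled out):

Code:
from math import log, factorial
from statistics import NormalDist

def predictions(x=10**20, window=200_000, kmax=11):
    # Landau and continuity-corrected Erdos-Kac predictions for how many
    # integers in a window around x have exactly k distinct prime factors.
    L, logx = log(log(x)), log(x)
    nd = NormalDist(mu=L, sigma=L ** 0.5)
    rows = []
    for k in range(1, kmax + 1):
        landau = window * L ** (k - 1) / (factorial(k - 1) * logx)
        erdos_kac = window * (nd.cdf(k + 0.5) - nd.cdf(k - 0.5))
        rows.append((k, round(landau), round(erdos_kac)))
    return rows
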
Old 2014-08-04, 18:26   #8
R.D. Silverman
 
Quote:
Originally Posted by CRGreathouse
Using \omega(n) (to avoid the extra variability from the small primes that \Omega brings) and searching an interval around 10^20 I find

This supports the intuition that Landau is better for small numbers of prime factors and Erdős-Kac better for large. In this case the crossover is surprisingly large (8 prime factors) but neither estimate is particularly accurate.
I would not expect it to be for composites that are only 20 digits.
Old 2014-08-04, 18:33   #9
CRGreathouse
 
Quote:
Originally Posted by R.D. Silverman
I would not expect it to be for composites that are only 20 digits.
Me either -- but it's hard to factor big enough numbers to get useful information. Even if I used 100-digit numbers, at significantly greater effort, that would only increase the log log by about 1.6.
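
(To spell out the arithmetic: \log\log 10^{100} - \log\log 10^{20} = \log(100\log 10) - \log(20\log 10) = \log 5 \approx 1.61.)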

But it would still be useful to have good estimates for these ranges. Any ideas of what techniques could be used?
Old 2014-08-04, 21:47   #10
R.D. Silverman
 
Quote:
Originally Posted by CRGreathouse
Me either -- but it's hard to factor big enough numbers to get useful information. Even if I used 100-digit numbers, at significantly greater effort, that would only increase the log log by about 1.6.

But it would still be useful to have good estimates for these ranges. Any ideas of what techniques could be used?
I would try to get a 2nd term (i.e. main term plus error term with effective constant) for the Erdős-Kac theorem.
Old 2014-08-05, 00:05   #11
CRGreathouse
 
Quote:
Originally Posted by R.D. Silverman
I would try to get a 2nd term (i.e. main term plus error term with effective constant) for the Erdős-Kac theorem.
Interesting.

I haven't worked through the proof yet. Do you recommend the original or a modern version (Granville-Soundararajan, etc.)?