20180917, 20:23  #1 
"Dana Jacobsen"
Feb 2011
Bangkok, TH
908_{10} Posts 
Semiprime counting
Because of Robert's interest in some OEIS sequences, I started fiddling with semiprime code in my library. Let me preface by saying there is no particular deep mathematical reason for this work (at least for me), it is just an interesting coding exercise.
Functions: Code:
semiprime_count(n) count of semiprimes <= n semiprime_count(start, end) count of semiprimes in range nth_semiprime(n) the nth semiprime forsemiprimes {...} [start,] end loop over semiprimes in range random_unrestricted_semiprime(n) random nbit semiprime random_semiprime(n) as above with equal size factors is_semiprime(n) does n have exactly 2 prime factors I started a thread last year about some of the issues with efficiently selecting uniform nbit random semiprimes. The is_semiprime function is pretty efficient and was mentioned in this post last year. Pari 2.12 is about 2x faster now, which leaves it only 50 times slower. Charles' more efficient Pari code is on github. My issue is more around the semiprime counts and nth semiprime. I've discovered two sequences in OEIS with differering results. I'd like to know if anyone has a way to reproduce some results in a reasonable time period. Pari/GP semiprime count: Code:
a(n)=my(s); forprime(p=2, sqrt(10^n), s+=primepi((10^n1)\p)); sbinomial(primepi(sqrt(10^n)), 2) A066265 has the number of semiprimes < 10^n, and Henri Lifchitz filled in values up to 10^21 (!) in 2015. He must have efficient code for this. The method I'm using is basically identical to the Pari code, based on Meissel's prime counting method. The differences vs. Pari are (1) low overhead for smallish values, and (2) pretty fast LMO prime counting for larger values. In theory we could go one step further and use Lehmer's solution which would speed things up, though quite a bit more complicated. For the nth semiprime, I am doing an interpolated search on the semiprime count to get close, then iterate over semiprimes to get to the exact value. I imagine there could be improvements in the interpolation to speed this up some. Of course any speedups in the semiprime count will pay off multiplefold here. Issues I see: A125527: number of semiprimes <= 2^n. a(48) was corrected after finding a confirming value elsewhere. A114125: The 10^nth semiprime. a(14) doesn't look right to me. We can check the result in a reasonable time by doing a semiprime count on the value, assuming a fast enough count routine. I believe a(14) = 896113850117314. It looks like a(15) = 9413000361625346 which is a new value. There are a few other sequences I'll be using this for, including A146168 which needs a bfile, and checking A292785. That sequence is calculated using the Lifchitz results so I can just check my results. 
20180917, 22:06  #2  
"Forget I exist"
Jul 2009
Dumbassville
2^{6}×131 Posts 
Quote:
parforprime instead of forprime ; making sqrt(10^n) a variable ( outside the loop), as gets calculated every time through the loop even though it can be calculated outside the loop. maybe precalulate the binomial coefficient all these things are independent of the loop variable. 

20180917, 22:28  #3  
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2^{2}×227 Posts 
Will need some locking for the accumulator. Just naively using parforprime yields incorrect results.
Quote:
The issue is the thousands to millions of calls to primepi. 

20180917, 23:02  #4 
"Forget I exist"
Jul 2009
Dumbassville
8384_{10} Posts 
made about 20 ms difference by a(9) on my system (admittedly <1/3%) . Theres also the option of forprimestep using equivalence classes. Yes primepi is a conundrum for me to get around. primepi(sqrt(10^n))^2 only deals with semiprimes with both primes less than sqrt(10^n). So the efficiency comes down to how to get a count of all semiprimes with 1 factor over sqrt(10^n) .
Last fiddled with by science_man_88 on 20180917 at 23:03 
20180918, 13:29  #5 
Aug 2006
3^{2}×5×7×19 Posts 
The real issue is that gp has a trivial implementation of primepi, which makes the algorithm noncompetitive. A proper implementation would make a big difference.
As for parallelizing it, probably the right way would be to do the biggest primepi in one thread and all the others in another thread, or something like that. For what it's worth I agree about hoisting sqrt(10^n) out of the loop  it's not a big deal, but it just rubs me the wrong way. :) Last fiddled with by CRGreathouse on 20180918 at 13:36 
20180918, 16:38  #6 
Aug 2006
3^{2}·5·7·19 Posts 
Oh, and as I forgot to say it before: these are nice, danaj, thanks for writing them! they're definitely handy to have in the toolbox.

20180918, 16:39  #7  
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2^{2}·227 Posts 
Quote:
Then it would be a matter of making some low overhead methods for the smaller ones. Quote:
I may write a little C program using the primecount C API and see how that works. 

20180918, 16:41  #8 
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2^{2}·227 Posts 

20180922, 07:34  #9 
Jun 2003
Oxford, UK
1,933 Posts 
I think that pi(semiprime) at n simply the summation of pi(prime) at n/2, n/3....n/p where p is the prime nearest and less than the square root of n.
Just out of interest, if pi(prime) is relatively easy to determine at a given integer n, then would this summation be relatively quicker than the method you are looking at? 
20180922, 17:32  #10  
Aug 2006
5985_{10} Posts 
Quote:


20180923, 20:40  #11 
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2^{2}·227 Posts 
That is the method, as shown on MathWorld, Stack Exchange, and a few other places. I haven't seen any mention of improvements to that.
For small ranges there are some slightly better ways, but asymptotically they are far worse. The issue is then (1) have a fast prime count method for large inputs, and (2) optimize the many smallish counts. My prime count routine is pretty decent though single threaded. Kim's primecount is a little faster at the medium size (e.g. 10^1010^16) now, and much faster for very large values (not to mention it is multithreaded and works past 64bit). I did some work on #2, using a segmented sieve and binary search for counting for counts from n^{1/2} to n^{2/3}. That helps quite a bit and removes well over 90% of the calls. For nth prime, the method typically used is to calculate an estimate (using adjusted inverse Li or inverse R). Compute a single prime count to get an exact answer at the estimate. This usually ends up very close and then we just sieve the remainder. There are two reasons my nth semiprime is a little different. (1) inverse R or adjusted inverse Li gets us extremely close  much closer than I have for an estimated semiprime count. (2) my counting to get the remainder is slower, so I really want to be closer before starting that. Part of #2 is just that sieving a range for primes has been well studied, there are lots of known optimizations, and I have spent many, many hours on my code there. I just started on the semiprimes. For #1, we can use the simple: n * log(n) / log(log(n)). While perhaps asymptotically correct, it's got a lot of error. Step one was to multiply that by 0.966 which is much closer. What I have now is multiplying by 0.968  0.0019278 * log(log(log(n))), which gets it even closer. Typically it seems I only need one more count to get close enough. For the 2^45th semiprime: 3.6% nlogn/loglogn 0.05% .966 * nlogn/loglogn 0.0096% (0.968  0.0019278 * log(log(log(n)))) * nlogn/logn This is not rigorously minimized. It also may be only a decent fit in the area I've tested. With my current code on my Macbook, computing the 2^45th semiprime takes about 2.5 minutes, 2^48th in 11 minutes, 2^50th in 42 minutes. The last is about 20 times faster than when I first made the post. 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Semiprime and nalmost prime candidate for the k's with algebra for the Sierpinski/Riesel problem  sweety439  sweety439  11  20200923 01:42 
A counting problem  jasonp  Puzzles  1  20171224 19:38 
Generating a uniform random nbit semiprime  danaj  Programming  17  20170915 16:41 
Factored my first SemiPrime Some Questions  tal  Msieve  31  20110522 18:17 
counting bricks  michael  Puzzles  8  20040114 16:27 