mersenneforum.org New GPU accelerated sieve of Eratosthenes
 Register FAQ Search Today's Posts Mark Forums Read

 2016-09-27, 16:35 #1 cseizert     "Curtis" Sep 2016 Fort Collins, CO 2×5 Posts New GPU accelerated sieve of Eratosthenes Hey guys, I just put the code online (https://github.com/curtisseizert/CUDASieve) for a sieve that I've been working on that could perhaps be useful to other people. It contains an implementation of Tomás Oliveira e Silva's bucket algorithm, so it doesn't slow down horrendously at higher ranges. I've verified the output in various intervals up to 2^58, but there are kinks that need to be ironed out above that. I'm somewhat new to this type of thing, but I am interested in making this useful to other people, so if there is some way that I can do that, let me know. Oh, and I'm looking forward to checking this forum out. Maybe soon I'll be able to accomplish my life goal of understanding Hardy and Wright. Curtis
 2016-09-28, 00:12 #2 VBCurtis     "Curtis" Feb 2005 Riverside, CA 7×13×61 Posts Curtis- What number forms does it handle, in what formats? -Curtis
 2016-09-28, 00:46 #3 cseizert     "Curtis" Sep 2016 Fort Collins, CO 2×5 Posts It's actually just a humble sieve of Eratosthenes.
 2016-09-28, 01:25 #4 bsquared     "Ben" Feb 2007 3×17×73 Posts Not so humble though... it is quite fast. Here are some timings on Curtis's github site compared to the top of the line cpu sieves (running single threaded on my haswell i3 laptop) Code: range yafu 1.35-beta primesieve cseizert 2^40 - 2^40 + 2^30 0.581 0.598 0.137 2^50 - 2^50 + 2^30 1.19 0.939 0.143 0 - 10^10 2.26 3.02 0.168 0 - 10^12 375 524 13.1 The fact that it is working at these high ranges is very cool - there is a lot of memory accessing going on here and the hardware still flies. I'm in the early stages of porting my goldbach conjecture code to the gpu so that I can hopefully integrate it with this sieve. Nice work Curtis! Last fiddled with by bsquared on 2016-09-28 at 01:27 Reason: formatting and range column edits
 2016-09-28, 05:11 #5 cseizert     "Curtis" Sep 2016 Fort Collins, CO 128 Posts Thanks Ben! It is indeed memory access, specifically to the list of sieving primes, that is slowing it down the most. Higher ranges are on their way, and I'll have to take a look at yafu.
 2016-09-29, 06:49 #6 ldesnogu     Jan 2008 France 2×33×11 Posts Impressive results, congratulation Talking about sieves, did anyone see this: http://www.scientificamerican.com/ar...prime-numbers/ New Take on an Ancient Method Improves Way to Find Prime Numbers Harald Helfgott has reduced the memory requirements for sieving. I wonder if this has any practical impact. Can't wait to read the article.
2016-09-29, 13:51   #7
CRGreathouse

Aug 2006

5,987 Posts

Quote:
 Originally Posted by ldesnogu http://www.scientificamerican.com/ar...prime-numbers/ New Take on an Ancient Method Improves Way to Find Prime Numbers Harald Helfgott has reduced the memory requirements for sieving. I wonder if this has any practical impact. Can't wait to read the article.
I don't know. There are already sieves that take $\tilde{O}(n^{1/3})$ or $\tilde{O}(n^{1/4})$ space; the impact will depend on the practicality of the method. So far I'm cautiously optimistic.

 2016-09-29, 15:04 #8 cseizert     "Curtis" Sep 2016 Fort Collins, CO A16 Posts As of last night I got the top of the usable range up to 2^64-2^35! I'm pretty happy about that. I had been stymied by the fact that an intermediate value in a calculation was being stored in a 32 bit register, and this intermediate would hit a maximum of about 2^34.5. Since changing a single 32 to a 64 in the code (blah!), I haven't found any examples of ranges where the output is less than a complete list of only prime numbers in that interval. I'm working on automating correctness tests today so that I can make sure of this while I'm asleep. The method in that article would be intersesting interesting possibility. My library access at the university I recently graduated from was revoked a couple weeks ago otherwise I would see if the abstracts of this meeting had been published. I've been thinking about feasibility of trying to tackle the challenge of moving this sieve into ranges above 64 bits, but among other things, the amount of memory required for this implementation would go through the roof pretty quickly. Actually, you'd hit 8 GB of memory necessary to hold the data set at about 2^68, so it really would not be worth it. It seems like, with my knowledge anyway, it would be best to partially sieve with, for example, all the primes up to 2^30 and then use some primality test to cross off the rest of the composites. Obviously, this is not a new idea. That said, there are more important things in this project as well as in my life in general than getting this sieve above 2^64. Thanks for the love on the results guys! Very encouraging. I will say that it's kind of cheating to compare GPU results to CPU ones, especially if the latter are based on a single thread. But hey, I'll take it!
2016-10-27, 05:55   #9
only_human

"Gang aft agley"
Sep 2002

72528 Posts

Quote:
 Originally Posted by ldesnogu Impressive results, congratulation Talking about sieves, did anyone see this: http://www.scientificamerican.com/ar...prime-numbers/ New Take on an Ancient Method Improves Way to Find Prime Numbers Harald Helfgott has reduced the memory requirements for sieving. I wonder if this has any practical impact. Can't wait to read the article.
Quote:
 Originally Posted by CRGreathouse I don't know. There are already sieves that take $\tilde{O}(n^{1/3})$ or $\tilde{O}(n^{1/4})$ space; the impact will depend on the practicality of the method. So far I'm cautiously optimistic.
Here's his comment on Slashdot:
Quote:
 by HHelf ( 4723751 ) on Tuesday September 27, 2016 @07:59AM (#52968603) I was about to post something longer, but my computer deleted it at just the right moment. Summary: (a) The origin of this article was an interview given at an algebra colloquium in Buenos Aires a few weeks ago. Its appearance in Scientific American was a little unexpected (to me). I should be able to post a preprint soon. (b) In my colloquium talk, I made sure to mention that I had just come across a precedent in the literature: W. F. Galway, Dissecting a sieve to cut its need for space. In Algorithmic number theory (Leiden, 2000), vol 1838 of Lecture Notes in Comput. Sci., pages 297-312. Springer, Berlin, 2000. In both cases, we are talking about space essentially O(N^(1/3)) and time essentially O(N). Galway improves on the Atkin-Bernstein sieve, which is specifically about producing lists of primes. The sieve of Eratosthenes can be used for more general purposes, e.g., producing lists of factorizations or computing tables of values of arithmetic functions that depend on factorization. I actually got interested in the problem while using a sieve to produce successive values of mu(n), where mu is the Moebius function. As far as I can tell, there's a basic underlying idea being used in both cases, viz. Diophantine approximation. In brief, if you are finding primes (or what have you) in an interval [N,N+N^(1/3)], you do not have to sieve by every m=sqrt(N); you can tell in advance which integers m will have no multiples in that interval. This is all more or less orthogonal to [http://sweet.ua.pt/tos/software/prime_sieve.html]. Indeed what I do and what Oliveira e Silva does can very likely be combined. Best Harald Helfgott
edit:
I found some pdf slides in Spanish from this Google+ post
slides (57 slides)
Quote:
 Eratóstenes en menos espacio Harald Andrés Helfgott Julio 2016

Last fiddled with by only_human on 2016-10-27 at 06:34 Reason: found some pdf slides. Removed "/view" from slide URL -- I think my browser appended that.

 Similar Threads Thread Thread Starter Forum Replies Last Post bhelmes Number Theory Discussion Group 54 2022-10-22 07:13 TObject Hardware 32 2013-05-09 18:17 XYYXF Computer Science & Computational Number Theory 91 2013-01-19 12:19 Raman Programming 4 2009-01-19 17:12 jchein1 Homework Help 6 2007-08-27 13:51

All times are UTC. The time now is 03:14.

Sat Dec 3 03:14:09 UTC 2022 up 107 days, 42 mins, 0 users, load averages: 1.20, 1.17, 1.14