mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   mtsieve (https://www.mersenneforum.org/showthread.php?t=23042)

rogue 2021-10-25 17:46

[QUOTE=ryanp;591580]I've tried a number of combinations of -G and -g, ranging from "-g 16" up to "-G 8 -g 288". No matter what, it always tops out around 9.7 or 9.8M p/sec.

This is pretty surprising (the A100 has a max 19.5 TFlop/s single precision) considering I can easily get higher than that with 48 workers on a Xeon.[/QUOTE]

I have used -g500 (no -G). It is unlikely that the GPU is the bottleneck.

ryanp 2021-10-25 18:24

[QUOTE=rogue;591588]I have used -g500 (no -G). It is unlikely that the GPU is the bottleneck.[/QUOTE]

I've tested "-g 72", "-g 144", "-g 600" and for fun, "-g 6192". All produce about 7.5M to 8M p/sec.

With "-G 2" or "-G 4" added in as well, I can get up to about 9.5 to 9.8M p/sec, but no higher.

R. Gerbicz 2021-10-25 18:25

[QUOTE=ryanp;591575]
On a Tesla A100, I couldn't get srsieve2cl to go much above 9 to 10M p/sec, after fiddling with values for a while. By comparison, a plain [C]./srsieve2 -W 48[/C] on a 72-core Xeon CPU gives me about 15M p/sec.[/QUOTE]

Just out of curiosity, how many k/n pairs is that, i.e. what is #k * (nmax-nmin)?

Citrix 2021-10-26 04:46

[QUOTE=rogue;591588]I have used -g500 (no -G). It is unlikely that the GPU is the bottleneck.[/QUOTE]

The "Sieve of Eratosthenes" code becomes a bottleneck around 10-15 Mp/sec.

henryzz 2021-10-26 10:27

Have you tried multiple instances of mtsieve?

rogue 2021-10-26 12:41

[QUOTE=Citrix;591639]The "Sieve of Eratosthenes" code becomes a bottleneck around 10-15 Mp/sec.[/QUOTE]

I use primesieve as the prime generator. Maybe it could be used in a more efficient manner.

Citrix 2021-10-27 03:02

[QUOTE=rogue;591664]I use primesieve as the prime generator. Maybe it could be used in a more efficient manner.[/QUOTE]

The license prevents us from changing the code. You could write a faster Sieve of Eratosthenes if you let a few composites slip through.
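One possible reading of "let a few composites slip through" is to cross off only the multiples of a handful of small primes and hand every survivor to the sieve as a candidate p. The Python sketch below illustrates that idea only; the helper name `rough_candidates` and the choice of small primes are assumptions for illustration, and this is not how primesieve or mtsieve generate primes. Every prime above the largest small prime survives, together with some composites whose factors are all larger.

```python
def rough_candidates(lo, hi, small_primes=(2, 3, 5, 7)):
    """Return every n in [lo, hi) that has no factor in small_primes.

    The result contains all primes in the range that are larger than
    max(small_primes), plus some composites built only from larger primes.
    (Hypothetical helper, for illustration only.)
    """
    size = hi - lo
    keep = bytearray([1]) * size
    for q in small_primes:
        first = ((lo + q - 1) // q) * q  # first multiple of q that is >= lo
        for m in range(first, hi, q):
            keep[m - lo] = 0
    return [lo + i for i in range(size) if keep[i]]


cands = rough_candidates(100, 200)
# 121 = 11*11 and 143 = 11*13 are composites that slip through here.
```

A composite candidate that divides a term still proves that term composite, so the extra candidates cost redundant work rather than correctness. Whether that trade is a net win depends on how expensive each candidate is to process.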

[QUOTE=henryzz;591660]Have you tried multiple instances of mtsieve?[/QUOTE]

You can get 60M p/sec with 4 cores.

Happy5214 2021-11-02 00:40

Would it be possible for mtsieve to return with a non-zero exit code if it was terminated manually (i.e. with SIGINT)? I have several shell scripts for the older srsieve family of programs that have that behavior, and it's useful to know whether the range completed (particularly when it comes to determining whether or not to update a sieve progress control file, which I currently have to revert manually to the old value if I manually kill an mtsieve-family sieve program).
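To make the use case concrete, here is a minimal sketch of the kind of wrapper this would enable, written in Python rather than shell. It assumes the sieve exits with a non-zero code when interrupted, which is the behavior being requested and not something confirmed for mtsieve; `run_and_record`, the progress file path, and the value written are hypothetical.

```python
import subprocess


def run_and_record(cmd, progress_path, new_value):
    """Run cmd and write new_value to progress_path only if it exits with 0.

    Returns True if the progress file was updated. (Hypothetical helper.)
    """
    result = subprocess.run(cmd)
    # On POSIX, a child killed by a signal gives a negative returncode,
    # which is also non-zero, so the progress file is left alone.
    if result.returncode == 0:
        with open(progress_path, "w") as f:
            f.write(f"{new_value}\n")
        return True
    return False
```

With that in place the control file never needs to be reverted by hand, because it is only touched after a clean exit.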

---------

[QUOTE=Citrix;591744]The license prevents us from changing the code. You could write a faster Sieve of Eratosthenes if you let a few composites slip through.[/QUOTE]

Assuming [url]https://github.com/kimwalisch/primesieve[/url] is the same library, what stops you from forking it and modifying it? It's under a 2-clause BSD license, one of the most liberal FOSS licenses there is.

rogue 2021-11-02 12:49

[QUOTE=Happy5214;592249]Would it be possible for mtsieve to return with a non-zero exit code if it was terminated manually (i.e. with SIGINT)? I have several shell scripts for the older srsieve family of programs that have that behavior, and it's useful to know whether the range completed (particularly when it comes to determining whether or not to update a sieve progress control file, which I currently have to revert manually to the old value if I manually kill an mtsieve-family sieve program).

---------

Assuming [url]https://github.com/kimwalisch/primesieve[/url] is the same library, what stops you from forking it and modifying it? It's under a 2-clause BSD license, one of the most liberal FOSS licenses there is.[/QUOTE]

The output when abnormally terminated contains the string "interrupted", e.g. "2021-10-01 13:16:49: Sieve interrupted at p=26666330822323". Can you parse the output to determine that it was stopped without completing the range?
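A minimal sketch of that parsing, in Python. The pattern is taken from the single example line above, so it is an assumption that other versions or programs in the family print the same wording; `interrupted_at` is a hypothetical helper name.

```python
import re

# Pattern based on the one example line quoted in this post.
INTERRUPTED_RE = re.compile(r"Sieve interrupted at p=(\d+)")


def interrupted_at(output):
    """Return the p at which the sieve was interrupted, or None if no such line."""
    match = INTERRUPTED_RE.search(output)
    return int(match.group(1)) if match else None
```

A script could capture the program's output, call `interrupted_at` on it, and skip the progress-file update whenever the result is not `None`.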

The challenge is that the framework was designed around slower sieves, ones where processing each chunk of work takes much longer than fetching it. This applies to most sieves. It doesn't work as well for faster sieves or for computers with many cores where one wants a worker per core.

A "chunk of work" means a fixed number of primes, not a fixed range of primes. With the former, there is a mutex around the code getting the primes so that no two workers work on the same primes and so that each worker has the optimal number of primes for its looping logic.
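The "fixed number of primes" scheme can be sketched as follows. This is a simplified Python illustration, not the mtsieve code: `PrimeChunker` and `run_workers` are hypothetical names, and a trial-division generator stands in for primesieve. The point is that the whole prime generation step sits inside the lock, so workers queue up behind each other when chunks are processed quickly.

```python
import threading


def primes_from(start):
    """Yield primes >= start by trial division (placeholder generator)."""
    n = max(start, 2)
    while True:
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1


class PrimeChunker:
    """Hands out chunks of exactly chunk_size consecutive primes."""

    def __init__(self, start, chunk_size):
        self._gen = primes_from(start)
        self._chunk_size = chunk_size
        self._lock = threading.Lock()

    def next_chunk(self):
        # Generating the primes happens while holding the mutex, so only
        # one worker at a time can be fetching work.
        with self._lock:
            return [next(self._gen) for _ in range(self._chunk_size)]


def run_workers(chunker, n_workers, chunks_each):
    """Start n_workers threads that each fetch chunks_each chunks."""
    results = []
    results_lock = threading.Lock()

    def work():
        for _ in range(chunks_each):
            chunk = chunker.next_chunk()
            with results_lock:
                results.append(chunk)

    threads = [threading.Thread(target=work) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each worker always receives the same number of primes and no prime is handed out twice, at the cost of serializing the generator.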

I could change this to a "fixed range of primes". This is slightly less efficient for the worker threads, but significantly reduces the scope of the mutex.
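A sketch of what the "fixed range of primes" alternative might look like, again in simplified Python with hypothetical names (`RangeChunker`, `primes_in`) and trial division standing in for the real prime generator. Only a counter update is protected by the lock; each worker then generates the primes for its own range without holding it.

```python
import threading


class RangeChunker:
    """Hands out consecutive half-open ranges [lo, hi) of fixed width."""

    def __init__(self, start, range_size):
        self._next = start
        self._range_size = range_size
        self._lock = threading.Lock()

    def next_range(self):
        # The mutex covers only this counter update, not prime generation.
        with self._lock:
            lo = self._next
            self._next += self._range_size
        return lo, lo + self._range_size


def primes_in(lo, hi):
    """Return the primes in [lo, hi) by trial division (placeholder generator)."""
    return [
        n
        for n in range(max(lo, 2), hi)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    ]


chunker = RangeChunker(0, 10)
first = primes_in(*chunker.next_range())   # primes in [0, 10)
second = primes_in(*chunker.next_range())  # primes in [10, 20)
```

The number of primes per range is no longer constant (for example, [20, 30) holds only two primes), which is one plausible source of the lost worker efficiency mentioned above.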

Citrix 2021-11-07 03:11

[QUOTE=Happy5214;592249]

Assuming [url]https://github.com/kimwalisch/primesieve[/url] is the same library, what stops you from forking it and modifying it? It's under a 2-clause BSD license, one of the most liberal FOSS licenses there is.[/QUOTE]

It was under a different license before.
Maybe this would be helpful:
[url]https://github.com/curtisseizert/CUDASieve[/url]

matzetoni 2021-11-07 21:34

[C]> gfndsievecl.exe -n22001 -N25000 -k2000000 -K3000000 -o"out1.txt"[/C]

I tried running the above command with the newest mtsieve version, 2.2.2.7, and the program just exits after 5 minutes with no output and no error written to the log file.


All times are UTC.