mersenneforum.org

Proth primes with k = 1281979 (https://www.mersenneforum.org/showthread.php?t=26482)

bur 2021-12-02 17:33

Thanks, I had thought about using the GPU, since at PrimeGrid sieving Proth numbers on a GPU is much faster than on any CPU. The reason I didn't is that I only have a GTX 760 and a GTX 1660. The 760 is too slow to be useful. The 1660 might be OK, but I currently prefer to use it for the Wieferich/Wall-Sun-Sun search, and as Happy said, it would need to be really fast to compete against the 12 cores of the Ryzen 9 3900X.


Btw, hijack all you want, I'm glad to hear about such things.

dannyridel 2021-12-04 06:45

[QUOTE=bur;594352]Thanks, I had thought about using the GPU, since at PrimeGrid sieving Proth numbers on a GPU is much faster than on any CPU. The reason I didn't is that I only have a GTX 760 and a GTX 1660. The 760 is too slow to be useful. The 1660 might be OK, but I currently prefer to use it for the Wieferich/Wall-Sun-Sun search, and as Happy said, it would need to be really fast to compete against the 12 cores of the Ryzen 9 3900X.


Btw, hijack all you want, I'm glad to hear about such things.[/QUOTE]

I agree. Just using 6 cores on my 3800XT gives around 4-5 mp/s, but using -G 2 -g40 (which seems to be the limit of improving efficiency for me) on a GTX 1650 with GDDR6 only gives me around 1 mp/s.

bur 2021-12-06 09:53

While we're on the topic of sieving on GPU, has anyone tried Colab sessions for it? I don't have any experience with it; I just began copy-pasting GPU72 code, which seems to run fine. Is there a similar "fire&forget" option available for srsieve2cl?

bur 2022-01-19 10:36

No new primes, just another status update:

No sieving was done since the last update.

All n < 5,600,000 have now been checked. No prime for a stretch of more than 5M in n, low weight indeed. :)


Since the FFT size grew to 640K for n > 5.6M, the 64 MB L3 cache of the Ryzen 9 3900X no longer sufficed when testing 12 numbers simultaneously. Initially I ran six 2-threaded LLR instances, but noticed that two of them were about 30% slower than the other four. The reason is the special layout of the processor: it has four so-called CCXs (core complexes, two per CCD chiplet) with 16 MB of L3 cache each. Since each CCX houses three cores, two of the LLR instances each had their two threads split across two separate CCXs.

So I switched to four 3-threaded LLR instances occupying a single CCX each. Maybe mixed setups such as four 2-threaded plus four single-threaded LLRs would lead to a higher throughput; I didn't run any tests.
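
The core-complex arithmetic above can be sketched in a few lines. This is only an illustration: it assumes each instance's threads sit on consecutive cores and that cores are numbered one 3-core complex after another, neither of which the post states, and `pack_instances` and `count_split` are made-up helper names.

```python
def pack_instances(n_instances, threads_each, cores_per_complex=3):
    """Place instances on consecutive cores (an assumption, not measured
    behaviour) and return, per instance, the core complexes it touches."""
    layout = []
    first_core = 0
    for _ in range(n_instances):
        cores = range(first_core, first_core + threads_each)
        layout.append(sorted({core // cores_per_complex for core in cores}))
        first_core += threads_each
    return layout


def count_split(layout):
    """Number of instances whose threads span more than one core complex."""
    return sum(1 for complexes in layout if len(complexes) > 1)


six_by_two = pack_instances(6, 2)
four_by_three = pack_instances(4, 3)
print(six_by_two, count_split(six_by_two))
print(four_by_three, count_split(four_by_three))
```

Under those assumptions, six 2-threaded instances leave two of them straddling a boundary between core complexes, which matches the two slow instances, while four 3-threaded instances leave none.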


Smallest LLR-test currently running:
n = 5.62M
FFT = 640K
duration = 4060 s / test
digits = 1.69M
Caldwell entry rank: 241

Largest LLR-test currently running:
n = 5.65M
FFT = 640K
duration = 4090 s / test
digits = 1.70M
Caldwell entry rank: 238
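
The "digits" figures in these status blocks can be reproduced from n alone. A minimal sketch, assuming ordinary double-precision logarithms are accurate enough (they are unless the fractional part lands extremely close to an integer); `proth_digits` is a hypothetical helper name, and the check value of 146010 digits for n = 485014 is the one quoted for the known prime elsewhere in this thread.

```python
import math

K = 1281979  # the k of this thread


def proth_digits(n, k=K):
    """Decimal digits of k * 2^n + 1, computed via logarithms.

    The trailing +1 never adds a digit: k * 2^n is even for n >= 1,
    so it cannot be of the form 10^m - 1.
    """
    return math.floor(math.log10(k) + n * math.log10(2)) + 1


for n in (485_014, 5_620_000, 5_650_000):
    print(n, proth_digits(n))
```

Going through logarithms avoids building the full integer, which would be slow to convert to a decimal string at these sizes.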

bur 2022-07-05 11:18

Long time no update...

After a pause, the tests are now running on a 12-core i9-10920X with 32 GB RAM and 20 MB L3 cache under Ubuntu 22.04 LTS. It supports AVX-512, which not only gives a nice speed-up but also decreased the FFT size from 640K to 588K (I assume that's what caused it). Since I'm now only running 2 simultaneous tests, I can comfortably run each one single-threaded.
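
For a rough feel of how FFT size relates to cache, here is a back-of-the-envelope sketch. It assumes one 8-byte double per FFT element and K = 1024, and it ignores the sin/cos tables and scratch buffers that a real LLR run also needs, so the true working set is larger. `fft_mib` is a hypothetical helper name.

```python
BYTES_PER_DOUBLE = 8  # assumption: one 8-byte double per FFT element


def fft_mib(fft_k, instances=1):
    """Raw FFT data in MiB for `instances` tests of length fft_k * 1024."""
    return fft_k * 1024 * BYTES_PER_DOUBLE * instances / (1024 * 1024)


print(fft_mib(640, 12))  # twelve simultaneous 640K tests
print(fft_mib(640, 4))   # four simultaneous 640K tests
print(fft_mib(588, 2))   # two simultaneous 588K tests
```

By this estimate twelve 640K tests already need about 60 MiB for the raw data alone, which leaves almost no headroom in 64 MB of L3, whereas two 588K tests need roughly 9 MiB.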

All [B]n < 5,800,000[/B] have been checked for primality now. No new primes. Largest known prime: [URL="https://www.rieselprime.de/ziki/Proth_prime_2_1281979"]n = 485014 (146010 digits)[/URL]

Some stats for the [B]4,100,000 < n < 10,000,000[/B] range:
[LIST]
[*]Initial sieving with p < 1E6 removed 5,666,278 of the 5,900,000 candidates, i.e. 96%.
[*]233,722 candidates were left after that first step.
[*]Sieving with 1E6 < p < 825E12 found 172,022 factors
[*]93,945 candidates were left unfactored
[*]27,383 LLR2 tests done
[*]66,816 candidates left
[/LIST]
(the surplus of 254 LLR2 tests is due to tests done on numbers that were factored simultaneously by sieving)
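
The bookkeeping in the list above can be re-derived in a few lines. This sketch only re-checks the figures quoted in the post; the variable names are made up for illustration.

```python
initial = 5_900_000          # candidates with 4,100,000 < n < 10,000,000
removed_small_p = 5_666_278  # removed by sieving with p < 1E6
unfactored = 93_945          # left after sieving up to p < 825E12
tests_done = 27_383          # LLR2 tests completed
left = 66_816                # candidates still untested

after_first_sieve = initial - removed_small_p
removed_fraction = removed_small_p / initial
# Tests done beyond what the drop in remaining candidates accounts for.
surplus_tests = tests_done - (unfactored - left)

print(after_first_sieve, f"{removed_fraction:.1%}", surplus_tests)
```

The results agree with the post: 233,722 candidates after the first step, about 96% removed, and a surplus of 254 tests.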



[B]Sieving[/B]
Recently sieved: 800E12 < p < 825E12
Software: sr1sieve 1.4.7
Factors found: 77
Largest factor found: 824937311469287 (15 digits) | 1281979 * 2^6579962 + 1


[B]LLR[/B]
Currently testing: 5,800,000 <= n < 5,820,000
Software: LLR2 1.1.1
FFT = 588K
duration = 7400 s / test
digits = 1.74M - 1.75M
Caldwell entry rank: 249


All times are UTC.