mersenneforum.org  

Old 2020-11-01, 01:02   #2828
TheJudger
 
"Oliver"
Mar 2005
Germany

2·3·5·37 Posts

CUDALucas 2.06, CUDA 11.1.1, Quick&Dirty run, did a "./CUDALucas -cufftbench 2048 32768 20" before the following runs on each GPU.
  1. A100 PCIe, actual clock rate 1200-1230 MHz and power 250 Watt during the run
    Code:
    Using threads: square 256, splice 128.
    Starting M57885161 fft length = 3136K
    |   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
    |  Nov 01  01:38:49  |  M57885161     10000  0x76c27556683cd84d  |  3136K  0.16016   0.2899    2.89s  |      4:39:40   0.01%  |
    |  Nov 01  01:38:52  |  M57885161     20000  0xfd8e311d20ffe6ab  |  3136K  0.17969   0.2854    2.85s  |      4:37:26   0.03%  |
    |  Nov 01  01:38:55  |  M57885161     30000  0xce0d85ab0065a232  |  3136K  0.16406   0.2861    2.86s  |      4:36:53   0.05%  |
    [...]
    |  Nov 01  01:39:59  |  M57885161    250000  0x6d8473f6fd9a63c1  |  3136K  0.17188   0.2890    2.89s  |      4:38:07   0.43%  |
    |  Nov 01  01:40:01  |  M57885161    260000  0x214c6beab9d34d77  |  3136K  0.17969   0.2891    2.89s  |      4:38:03   0.44%  |
    |  Nov 01  01:40:04  |  M57885161    270000  0x1e4eb1f3c280c344  |  3136K  0.16406   0.2890    2.89s  |      4:37:59   0.46%  |
    |  Nov 01  01:40:07  |  M57885161    280000  0x678b32b35f88d73d  |  3136K  0.17188   0.2891    2.89s  |      4:37:56   0.48%  |
    |  Nov 01  01:40:10  |  M57885161    290000  0x4f51138c1f7f6301  |  3136K  0.17188   0.2889    2.88s  |      4:37:52   0.50%  |
    |  Nov 01  01:40:13  |  M57885161    300000  0x4332497b4eefcf79  |  3136K  0.17188   0.2892    2.89s  |      4:37:48   0.51%  |
    Code:
    Using threads: square 256, splice 128.
    Starting M100709233 fft length = 5488K
    |   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
    |  Nov 01  01:50:52  | M100709233     10000  0x65cca90366e670ef  |  5488K  0.23047   0.6465    6.46s  |     18:05:07   0.00%  |
    |  Nov 01  01:50:59  | M100709233     20000  0xb557cc85bc57b0bc  |  5488K  0.25000   0.6450    6.45s  |     18:03:44   0.01%  |
    |  Nov 01  01:51:05  | M100709233     30000  0xd758016f2a75aebd  |  5488K  0.21875   0.6463    6.46s  |     18:03:55   0.02%  |
    |  Nov 01  01:51:12  | M100709233     40000  0x769ac25053698341  |  5488K  0.22656   0.6475    6.47s  |     18:04:28   0.03%  |
    |  Nov 01  01:51:18  | M100709233     50000  0xc9a8e17405097ed2  |  5488K  0.23438   0.6477    6.47s  |     18:04:49   0.04%  |
    |  Nov 01  01:51:25  | M100709233     60000  0xba8b8115d1b0e392  |  5488K  0.21875   0.6487    6.48s  |     18:05:19   0.05%  |
    |  Nov 01  01:51:31  | M100709233     70000  0x6f32a109753b2af0  |  5488K  0.21680   0.6490    6.49s  |     18:05:42   0.06%  |
    |  Nov 01  01:51:38  | M100709233     80000  0x390b740965708e6d  |  5488K  0.23438   0.6495    6.49s  |     18:06:04   0.07%  |
    |  Nov 01  01:51:44  | M100709233     90000  0x67d98743ef47ec14  |  5488K  0.22656   0.6497    6.49s  |     18:06:22   0.08%  |
    |  Nov 01  01:51:51  | M100709233    100000  0x84a6b8fc954d383b  |  5488K  0.21875   0.6500    6.50s  |     18:06:38   0.09%  |
  2. Quadro RTX 8000, actual clock rate 1890 MHz and power 210-215 Watt during the run
    Code:
    Using threads: square 256, splice 128.
    Starting M57885161 fft length = 3456K
    |   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
    |  Nov 01  01:37:11  |  M57885161     10000  0x76c27556683cd84d  |  3456K  0.01685   1.7461   17.46s  |   1:04:04:19   0.01%  |
    |  Nov 01  01:37:28  |  M57885161     20000  0xfd8e311d20ffe6ab  |  3456K  0.01611   1.7516   17.51s  |   1:04:06:40   0.03%  |
    |  Nov 01  01:37:46  |  M57885161     30000  0xce0d85ab0065a232  |  3456K  0.01758   1.7596   17.59s  |   1:04:09:50   0.05%  |
    [...]
    |  Nov 01  01:44:18  |  M57885161    250000  0x6d8473f6fd9a63c1  |  3456K  0.01660   1.7835   17.83s  |   1:04:28:52   0.43%  |
    |  Nov 01  01:44:36  |  M57885161    260000  0x214c6beab9d34d77  |  3456K  0.02051   1.7834   17.83s  |   1:04:28:44   0.44%  |
    |  Nov 01  01:44:54  |  M57885161    270000  0x1e4eb1f3c280c344  |  3456K  0.01743   1.7834   17.83s  |   1:04:28:35   0.46%  |
    |  Nov 01  01:45:11  |  M57885161    280000  0x678b32b35f88d73d  |  3456K  0.01758   1.7834   17.83s  |   1:04:28:26   0.48%  |
    |  Nov 01  01:45:29  |  M57885161    290000  0x4f51138c1f7f6301  |  3456K  0.01758   1.7834   17.83s  |   1:04:28:16   0.50%  |
    |  Nov 01  01:45:47  |  M57885161    300000  0x4332497b4eefcf79  |  3456K  0.01855   1.7834   17.83s  |   1:04:28:06   0.51%  |
    Code:
    Using threads: square 256, splice 128.
    Starting M100709233 fft length = 5760K
    |   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
    |  Nov 01  01:51:48  | M100709233     10000  0x65cca90366e670ef  |  5760K  0.06580   3.2809   32.80s  |   3:19:46:24   0.00%  |
    |  Nov 01  01:52:21  | M100709233     20000  0xb557cc85bc57b0bc  |  5760K  0.06250   3.3074   33.07s  |   3:20:08:07   0.01%  |
    |  Nov 01  01:52:55  | M100709233     30000  0xd758016f2a75aebd  |  5760K  0.06641   3.3288   33.28s  |   3:20:26:57   0.02%  |
    |  Nov 01  01:53:28  | M100709233     40000  0x769ac25053698341  |  5760K  0.06744   3.3346   33.34s  |   3:20:38:32   0.03%  |
    |  Nov 01  01:54:01  | M100709233     50000  0xc9a8e17405097ed2  |  5760K  0.06250   3.3346   33.34s  |   3:20:45:15   0.04%  |
    |  Nov 01  01:54:35  | M100709233     60000  0xba8b8115d1b0e392  |  5760K  0.06836   3.3346   33.34s  |   3:20:49:33   0.05%  |
    |  Nov 01  01:55:08  | M100709233     70000  0x6f32a109753b2af0  |  5760K  0.06348   3.3346   33.34s  |   3:20:52:29   0.06%  |
    |  Nov 01  01:55:41  | M100709233     80000  0x390b740965708e6d  |  5760K  0.06250   3.3347   33.34s  |   3:20:54:32   0.07%  |
    |  Nov 01  01:56:15  | M100709233     90000  0x67d98743ef47ec14  |  5760K  0.06641   3.3346   33.34s  |   3:20:56:00   0.08%  |
    |  Nov 01  01:56:48  | M100709233    100000  0x84a6b8fc954d383b  |  5760K  0.06641   3.3346   33.34s  |   3:20:57:04   0.09%  |
  3. GeForce RTX 3090, actual clock rate 1905-1935 MHz and power 330-335 Watt during the run
    Code:
    Using threads: square 256, splice 128.
    Starting M57885161 fft length = 3200K
    |   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
    |  Nov 01  01:37:16  |  M57885161     10000  0x76c27556683cd84d  |  3200K  0.10156   1.4418   14.41s  |     23:10:45   0.01%  |
    |  Nov 01  01:37:30  |  M57885161     20000  0xfd8e311d20ffe6ab  |  3200K  0.09766   1.4469   14.46s  |     23:13:00   0.03%  |
    |  Nov 01  01:37:45  |  M57885161     30000  0xce0d85ab0065a232  |  3200K  0.10938   1.4567   14.56s  |     23:16:44   0.05%  |
    [...]
    |  Nov 01  01:43:12  |  M57885161    250000  0x6d8473f6fd9a63c1  |  3200K  0.10156   1.4978   14.97s  |     23:45:11   0.43%  |
    |  Nov 01  01:43:27  |  M57885161    260000  0x214c6beab9d34d77  |  3200K  0.09946   1.4982   14.98s  |     23:45:28   0.44%  |
    |  Nov 01  01:43:42  |  M57885161    270000  0x1e4eb1f3c280c344  |  3200K  0.10156   1.4985   14.98s  |     23:45:44   0.46%  |
    |  Nov 01  01:43:57  |  M57885161    280000  0x678b32b35f88d73d  |  3200K  0.10156   1.4989   14.98s  |     23:45:58   0.48%  |
    |  Nov 01  01:44:12  |  M57885161    290000  0x4f51138c1f7f6301  |  3200K  0.10156   1.4985   14.98s  |     23:46:09   0.50%  |
    |  Nov 01  01:44:28  |  M57885161    300000  0x4332497b4eefcf79  |  3200K  0.09766   1.4990   14.99s  |     23:46:20   0.51%  |
    Code:
    Using threads: square 256, splice 128.
    Starting M100709233 fft length = 5760K
    |   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
    |  Nov 01  01:51:30  | M100709233     10000  0x65cca90366e670ef  |  5760K  0.06250   2.8093   28.09s  |   3:06:35:01   0.00%  |
    |  Nov 01  01:51:59  | M100709233     20000  0xb557cc85bc57b0bc  |  5760K  0.06348   2.8430   28.43s  |   3:07:02:50   0.01%  |
    |  Nov 01  01:52:27  | M100709233     30000  0xd758016f2a75aebd  |  5760K  0.06641   2.8559   28.55s  |   3:07:18:58   0.02%  |
    |  Nov 01  01:52:56  | M100709233     40000  0x769ac25053698341  |  5760K  0.06445   2.8562   28.56s  |   3:07:26:55   0.03%  |
    |  Nov 01  01:53:25  | M100709233     50000  0xc9a8e17405097ed2  |  5760K  0.06250   2.8767   28.76s  |   3:07:38:24   0.04%  |
    |  Nov 01  01:53:53  | M100709233     60000  0xba8b8115d1b0e392  |  5760K  0.06250   2.8821   28.82s  |   3:07:47:23   0.05%  |
    |  Nov 01  01:54:22  | M100709233     70000  0x6f32a109753b2af0  |  5760K  0.06250   2.8868   28.86s  |   3:07:54:48   0.06%  |
    |  Nov 01  01:54:51  | M100709233     80000  0x390b740965708e6d  |  5760K  0.06396   2.8909   28.90s  |   3:08:01:06   0.07%  |
    |  Nov 01  01:55:20  | M100709233     90000  0x67d98743ef47ec14  |  5760K  0.06250   2.8922   28.92s  |   3:08:06:08   0.08%  |
    |  Nov 01  01:55:49  | M100709233    100000  0x84a6b8fc954d383b  |  5760K  0.06250   2.8934   28.93s  |   3:08:10:17   0.09%  |

Oliver
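
For a rough comparison, the ms/It figures in the logs above can be turned into whole-test time and energy. This is only a back-of-the-envelope sketch: it assumes an LL test of M_p needs p - 2 squarings, that the last ms/It value logged in each run holds for the entire test, and that the quoted power (midpoint where a range was given) stays constant. The M100709233 runs cover only the first 100000 iterations and the timings were still creeping up, so those rows are probably optimistic. The function names are made up for illustration and are not part of CUDALucas.

```python
# Last logged ms/It per run and approximate board power, copied from the
# logs above. Power is the midpoint where the post gives a range.
RUNS = {
    "A100 PCIe":        {"watts": 250.0, "ms_per_iter": {57885161: 0.2892, 100709233: 0.6500}},
    "Quadro RTX 8000":  {"watts": 212.5, "ms_per_iter": {57885161: 1.7834, 100709233: 3.3346}},
    "GeForce RTX 3090": {"watts": 332.5, "ms_per_iter": {57885161: 1.4990, 100709233: 2.8934}},
}


def ll_iterations(p):
    """Number of squarings in a Lucas-Lehmer test of M_p (assumed: p - 2)."""
    return p - 2


def full_test_hours(p, ms_per_iter):
    """Hours for a complete test if ms_per_iter stayed constant (an assumption)."""
    return ll_iterations(p) * ms_per_iter / 1000.0 / 3600.0


def energy_kwh(hours, watts):
    """Energy in kWh at constant power draw (an assumption)."""
    return hours * watts / 1000.0


for gpu, run in RUNS.items():
    for p, ms in run["ms_per_iter"].items():
        hours = full_test_hours(p, ms)
        kwh = energy_kwh(hours, run["watts"])
        print(f"{gpu:17s} M{p}: {hours:6.2f} h, roughly {kwh:5.2f} kWh")
```

The whole-test hours will not match the ETA column exactly, since the ETA counts only the remaining iterations and is presumably based on averaged timings.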
Old 2020-11-01, 03:13   #2829
axn
 
Jun 2003

2×3²×269 Posts

Quote:
Originally Posted by TheJudger
CUDALucas 2.06, CUDA 11.1.1, Quick&Dirty run, did a "./CUDALucas -cufftbench 2048 32768 20" before the following runs on each GPU.
Those are impressive numbers from the A100. I suspect they would be even more impressive with gpuOwl. Is that something you can test?
Old 2020-11-01, 03:18   #2830
moebius
 
Jul 2009
Germany

1000100011₂ Posts

Quote:
Originally Posted by axn
Those are impressive numbers from the A100. I suspect they would be even more impressive with gpuOwl. Is that something you can test?
I have requested such values; they should be very similar.
Old 2020-11-01, 03:24   #2831
axn
 
Jun 2003

2·3²·269 Posts

Quote:
Originally Posted by moebius
I have requested such values; they should be very similar.
I suspect gpuOwl will complete these in about 3 hrs and 12 hrs respectively, compared to 4.5 hrs and 18 hrs for CUDALucas.

Of course, at USD 12k, these aren't exactly "affordable".
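
To put those guesses in the same units as the logs, here is a quick sketch that converts whole-test hours into an implied ms/It and a speedup ratio. The gpuOwl hours are guesses from this post, not measurements, and the conversion assumes roughly one squaring per bit of the exponent. The helper name is hypothetical.

```python
def implied_ms_per_iter(exponent, hours):
    """Per-iteration time implied by a whole-test duration.

    Assumes about one squaring per bit of the exponent, which is only
    an approximation of the real iteration count.
    """
    return hours * 3600.0 * 1000.0 / exponent


# exponent: (guessed gpuOwl hours, rounded CUDALucas hours), both from the post
ESTIMATES = {
    57885161: (3.0, 4.5),
    100709233: (12.0, 18.0),
}

for p, (owl_hours, cl_hours) in ESTIMATES.items():
    speedup = cl_hours / owl_hours
    print(
        f"M{p}: speedup {speedup:.2f}x, "
        f"gpuOwl about {implied_ms_per_iter(p, owl_hours):.3f} ms/it vs "
        f"CUDALucas about {implied_ms_per_iter(p, cl_hours):.3f} ms/it"
    )
```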
Old 2020-11-01, 03:41   #2832
moebius
 
Jul 2009
Germany

223₁₆ Posts

Quote:
Originally Posted by axn
I suspect gpuOwl will complete these in about 3 hrs and 12 hrs respectively, compared to 4.5 hrs and 18 hrs for CUDALucas.

Of course, at USD 12k, these aren't exactly "affordable".
But that is CUDALucas without proofs...

Last fiddled with by moebius on 2020-11-01 at 03:42
Old 2020-11-01, 14:16   #2833
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

29×167 Posts

Quote:
Originally Posted by moebius
But that is CUDALucas without proofs...
To my knowledge, no one has implemented a proof for LL yet.
Also, CUDALucas does not include the Jacobi symbol check, although Mlucas, prime95 and some versions of gpuowl do.
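
For readers who have not seen it, here is a toy sketch of how a Jacobi symbol check on LL residues can work. It is not code from CUDALucas, Mlucas, prime95 or gpuowl. It relies on the property that, with the usual starting value 4, each residue s_i (i >= 1) can be written as s_i - 2 = 12·U² mod M_p, so Jacobi(s_i - 2, M_p) should come out as -1 for odd p; any other value suggests a corrupted residue. The function names are made up, and checking every iteration as done here is only practical for tiny exponents.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, standard binary algorithm."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a:
        # Pull out factors of two; (2/n) is -1 exactly when n is 3 or 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: flip the sign if both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def ll_with_jacobi_check(p):
    """Lucas-Lehmer test of M_p = 2**p - 1 for an odd prime p.

    Raises RuntimeError if a residue fails the Jacobi check.
    Returns True if M_p is prime.
    """
    m = (1 << p) - 1
    s = 4
    for i in range(1, p - 1):  # p - 2 squarings in total
        s = (s * s - 2) % m
        if jacobi(s - 2, m) != -1:
            raise RuntimeError(f"Jacobi check failed at iteration {i}")
    return s == 0


for p in (5, 7, 11, 13):
    print(p, ll_with_jacobi_check(p))
```

The check only catches about half of random errors, since a corrupted residue still has a Jacobi symbol of -1 roughly half the time.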
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.