mersenneforum.org  

Old 2020-11-23, 20:39   #826
S485122
 
Sep 2006
Brussels, Belgium

3³·61 Posts

Quote:
Originally Posted by NookieN
Finally getting around to setting up my 10900X system. RAM is 3200C16.
...
This processor supports quad-channel memory. What is your memory configuration? You only specified the speed.

Jacob
Old 2020-11-24, 02:22   #827
NookieN
 
Aug 2002

2·29 Posts

Quote:
Originally Posted by S485122
This processor supports quad-channel memory. What is your memory configuration? You only specified the speed.
You're right, I didn't (I guess it's not obvious, though it would be rather silly to build a system like this without populating all channels).

More specific config is: Quad-channel (4x8GiB) 3200MT/s 16-18-18 2T single-rank

Dual rank would be interesting, but I don't have 8 sticks of the stuff lying around at the moment.

Quote:
Originally Posted by henryzz
Are you able to saturate memory bandwidth without AVX512? If so how does power consumption compare at the lowest frequency that maxes bandwidth?
Here's... some data. I don't think it fully answers your question because I don't know for certain yet what frequencies max out both FMA3 and AVX512F. The table below shows various benchmarks for both. (Again this is without any OC or alteration to profiles, other than enabling XMP.)

FMA3 runs at 3.8GHz with both 8 and 10 cores. AVX512 runs at 3.4GHz with both 8 and 10 cores, although with 8 cores active one or two of them would periodically reach 3.7GHz. Power is interesting (this is HWMonitor package power; I haven't dug out the Kill A Watt). FMA3 draws 140W with 8 cores and 155-160W with 10. AVX512 always stayed in the 135-140W range with both 8 and 10. I'll have to play with AVX offsets more to get a better idea of what that means.

Code:
FFT            FMA3 8 cores      FMA3 10 cores     AVX512F 8 cores   AVX512F 10 cores
4032K                                              1.63 ms.          1.46 ms.
4096K          2.22 ms.          1.87 ms.                    
4200K                                              1.78 ms.          1.61 ms.
4320K                                              1.88 ms.          1.71 ms.
4480K          2.46 ms.          2.12 ms.          1.99 ms.          1.82 ms.
4608K          2.51 ms.          2.15 ms.          2.08 ms.          1.91 ms.
4704K                                              2.13 ms.          1.96 ms.
4800K          2.73 ms.          2.37 ms.          2.49 ms.          2.26 ms.
5040K                                              2.24 ms.          2.09 ms.
5120K          2.91 ms.          2.51 ms.          2.30 ms.          2.15 ms.
5184K                                              2.35 ms.          2.22 ms.
5376K          3.04 ms.          2.67 ms.          2.50 ms.          2.36 ms.
5760K          3.44 ms.          3.02 ms.          3.12 ms.          2.93 ms.
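
One rough way to read the table against the power question is energy per iteration, i.e. iteration time multiplied by package power. The sketch below does that for the 10-core columns. The wattages are an assumption: they are simply the midpoints of the HWMonitor ranges quoted above (155-160W for FMA3, 135-140W for AVX512F), which is package power and not wall power, so treat the result as indicative only.

```python
# Iteration times in ms, copied from the 10-core columns of the table above
# (only FFT lengths that have both an FMA3 and an AVX512F entry).
TIMES_MS = {
    "4480K": {"FMA3": 2.12, "AVX512F": 1.82},
    "4608K": {"FMA3": 2.15, "AVX512F": 1.91},
    "4800K": {"FMA3": 2.37, "AVX512F": 2.26},
    "5120K": {"FMA3": 2.51, "AVX512F": 2.15},
    "5376K": {"FMA3": 2.67, "AVX512F": 2.36},
    "5760K": {"FMA3": 3.02, "AVX512F": 2.93},
}

# ASSUMPTION: midpoints of the quoted HWMonitor package-power ranges (10 cores).
POWER_W = {"FMA3": 157.5, "AVX512F": 137.5}


def energy_mj(time_ms, power_w):
    """Energy per iteration in millijoules (watts * milliseconds = mJ)."""
    return time_ms * power_w


def compare(times=TIMES_MS, power=POWER_W):
    """Return (fft, speedup, fma3_mJ, avx512_mJ) tuples, one per FFT length."""
    rows = []
    for fft, t in times.items():
        speedup = t["FMA3"] / t["AVX512F"]
        rows.append((
            fft,
            speedup,
            energy_mj(t["FMA3"], power["FMA3"]),
            energy_mj(t["AVX512F"], power["AVX512F"]),
        ))
    return rows


if __name__ == "__main__":
    for fft, speedup, e_fma3, e_avx in compare():
        print(f"{fft}: AVX512F is {speedup:.2f}x faster, "
              f"{e_fma3:.0f} mJ (FMA3) vs {e_avx:.0f} mJ (AVX512F) per iteration")
```

With these assumed wattages AVX512F comes out ahead on both time and energy at every listed FFT length, but a wall-power measurement would be needed to confirm that.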
Old 2020-11-27, 22:21   #828
wagner85
 
Aug 2020

5² Posts

Quote:
Originally Posted by NookieN
Finally getting around to setting up my 10900X system. RAM is 3200C16. Benchmark below is stock. I played around with various OCs and as expected they don't make any difference in throughput (but a lot in temperature!)--the stock speed with AVX512 (apparently 3.4GHz) easily saturates memory bandwidth.

Code:
Intel(R) Core(TM) i9-10900X CPU @ 3.70GHz
CPU speed: 4288.93 MHz, 10 hyperthreaded cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA, AVX512F
L1 cache size: 10x32 KB, L2 cache size: 10x1 MB, L3 cache size: 19712 KB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Prime95 64-bit version 30.3, RdtscTiming=1
Timings for 2048K FFT length (10 cores, 1 worker):  0.62 ms.  Throughput: 1621.33 iter/sec.
Timings for 2048K FFT length (10 cores, 2 workers):  1.45,  1.34 ms.  Throughput: 1433.85 iter/sec.
Timings for 2048K FFT length (10 cores, 10 workers): 11.78, 11.69, 11.66, 10.87, 11.68, 11.67, 11.67, 11.65, 11.69, 11.67 ms.  Throughput: 862.35 iter/sec.
Timings for 2100K FFT length (10 cores, 1 worker):  0.68 ms.  Throughput: 1477.05 iter/sec.
Timings for 2100K FFT length (10 cores, 2 workers):  1.48,  1.48 ms.  Throughput: 1350.56 iter/sec.
Timings for 2100K FFT length (10 cores, 10 workers): 11.30, 11.30, 11.30, 11.29, 11.30, 11.30, 11.30, 11.29, 11.30, 11.30 ms.  Throughput: 885.20 iter/sec.
Timings for 2160K FFT length (10 cores, 1 worker):  0.72 ms.  Throughput: 1396.96 iter/sec.
Timings for 2160K FFT length (10 cores, 2 workers):  1.62,  1.62 ms.  Throughput: 1234.74 iter/sec.
Timings for 2160K FFT length (10 cores, 10 workers): 11.60, 11.62, 11.62, 11.61, 11.62, 11.60, 11.62, 11.61, 11.61, 11.61 ms.  Throughput: 860.97 iter/sec.
Timings for 2240K FFT length (10 cores, 1 worker):  0.74 ms.  Throughput: 1347.42 iter/sec.
Timings for 2240K FFT length (10 cores, 2 workers):  1.76,  1.66 ms.  Throughput: 1170.89 iter/sec.
Timings for 2240K FFT length (10 cores, 10 workers): 12.91, 12.89, 12.95, 12.03, 12.92, 12.94, 12.91, 12.91, 12.92, 12.94 ms.  Throughput: 779.61 iter/sec.
Timings for 2304K FFT length (10 cores, 1 worker):  0.75 ms.  Throughput: 1340.50 iter/sec.
Timings for 2304K FFT length (10 cores, 2 workers):  1.90,  1.73 ms.  Throughput: 1105.43 iter/sec.
Timings for 2304K FFT length (10 cores, 10 workers): 13.29, 13.30, 13.40, 12.43, 13.31, 13.28, 13.29, 13.29, 13.34, 13.31 ms.  Throughput: 756.59 iter/sec.
Timings for 2400K FFT length (10 cores, 1 worker):  0.81 ms.  Throughput: 1230.45 iter/sec.
Timings for 2400K FFT length (10 cores, 2 workers):  1.89,  1.92 ms.  Throughput: 1048.94 iter/sec.
Timings for 2400K FFT length (10 cores, 10 workers): 13.32, 13.32, 13.31, 13.33, 13.30, 13.32, 13.31, 13.31, 13.32, 13.32 ms.  Throughput: 751.10 iter/sec.
Timings for 2520K FFT length (10 cores, 1 worker):  0.84 ms.  Throughput: 1196.41 iter/sec.
[Sun Nov 22 07:53:39 2020]
Timings for 2520K FFT length (10 cores, 2 workers):  2.03,  2.03 ms.  Throughput: 986.85 iter/sec.
Timings for 2520K FFT length (10 cores, 10 workers): 13.93, 13.94, 13.94, 13.94, 13.82, 13.94, 13.94, 13.93, 13.94, 13.94 ms.  Throughput: 718.09 iter/sec.
Timings for 2560K FFT length (10 cores, 1 worker):  0.85 ms.  Throughput: 1177.57 iter/sec.
Timings for 2560K FFT length (10 cores, 2 workers):  2.16,  2.16 ms.  Throughput: 924.33 iter/sec.
Timings for 2560K FFT length (10 cores, 10 workers): 14.32, 14.32, 14.32, 14.31, 14.32, 14.32, 14.31, 14.31, 14.31, 14.32 ms.  Throughput: 698.63 iter/sec.
Timings for 2592K FFT length (10 cores, 1 worker):  0.84 ms.  Throughput: 1193.89 iter/sec.
Timings for 2592K FFT length (10 cores, 2 workers):  2.18,  2.18 ms.  Throughput: 919.39 iter/sec.
Timings for 2592K FFT length (10 cores, 10 workers): 14.32, 14.12, 14.32, 14.34, 14.34, 14.34, 14.21, 14.23, 14.32, 14.34 ms.  Throughput: 699.92 iter/sec.
Timings for 2688K FFT length (10 cores, 1 worker):  0.86 ms.  Throughput: 1157.89 iter/sec.
Timings for 2688K FFT length (10 cores, 2 workers):  2.32,  2.32 ms.  Throughput: 863.79 iter/sec.
Timings for 2688K FFT length (10 cores, 10 workers): 15.28, 15.16, 15.10, 15.31, 15.30, 15.21, 15.17, 14.99, 15.19, 15.31 ms.  Throughput: 657.81 iter/sec.
Timings for 2880K FFT length (10 cores, 1 worker):  0.92 ms.  Throughput: 1089.76 iter/sec.
Timings for 2880K FFT length (10 cores, 2 workers):  2.63,  2.48 ms.  Throughput: 783.82 iter/sec.
Timings for 2880K FFT length (10 cores, 10 workers): 15.91, 16.01, 16.00, 15.59, 15.98, 15.93, 15.93, 15.90, 16.04, 15.95 ms.  Throughput: 628.01 iter/sec.
Timings for 2940K FFT length (10 cores, 1 worker):  0.98 ms.  Throughput: 1024.34 iter/sec.
Timings for 2940K FFT length (10 cores, 2 workers):  2.68,  2.68 ms.  Throughput: 745.01 iter/sec.
Timings for 2940K FFT length (10 cores, 10 workers): 15.86, 16.03, 15.84, 15.98, 15.94, 15.86, 15.86, 15.74, 15.94, 15.90 ms.  Throughput: 629.22 iter/sec.
Timings for 3000K FFT length (10 cores, 1 worker):  1.05 ms.  Throughput: 953.88 iter/sec.
Timings for 3000K FFT length (10 cores, 2 workers):  2.69,  2.69 ms.  Throughput: 743.37 iter/sec.
Timings for 3000K FFT length (10 cores, 10 workers): 16.84, 16.84, 16.84, 16.82, 16.84, 16.78, 16.78, 16.73, 16.84, 16.84 ms.  Throughput: 594.71 iter/sec.
[Sun Nov 22 07:58:45 2020]
Timings for 3072K FFT length (10 cores, 1 worker):  0.93 ms.  Throughput: 1075.16 iter/sec.
Timings for 3072K FFT length (10 cores, 2 workers):  2.90,  2.64 ms.  Throughput: 724.15 iter/sec.
Timings for 3072K FFT length (10 cores, 10 workers): 17.38, 17.45, 16.63, 17.42, 17.40, 17.40, 17.48, 17.38, 17.40, 17.44 ms.  Throughput: 576.91 iter/sec.
Timings for 3136K FFT length (10 cores, 1 worker):  1.07 ms.  Throughput: 938.49 iter/sec.
Timings for 3136K FFT length (10 cores, 2 workers):  3.14,  2.91 ms.  Throughput: 661.78 iter/sec.
Timings for 3136K FFT length (10 cores, 10 workers): 18.50, 18.50, 18.52, 18.53, 18.50, 18.54, 18.50, 17.18, 18.53, 18.60 ms.  Throughput: 544.06 iter/sec.
Timings for 3200K FFT length (10 cores, 1 worker):  1.15 ms.  Throughput: 867.95 iter/sec.
Timings for 3200K FFT length (10 cores, 2 workers):  3.01,  3.01 ms.  Throughput: 664.84 iter/sec.
Timings for 3200K FFT length (10 cores, 10 workers): 17.83, 17.97, 17.95, 17.80, 17.95, 17.95, 17.93, 17.80, 17.97, 17.97 ms.  Throughput: 558.36 iter/sec.
Timings for 3360K FFT length (10 cores, 1 worker):  1.12 ms.  Throughput: 893.80 iter/sec.
Timings for 3360K FFT length (10 cores, 2 workers):  3.30,  3.10 ms.  Throughput: 624.92 iter/sec.
Timings for 3360K FFT length (10 cores, 10 workers): 19.33, 19.42, 18.62, 19.35, 19.35, 19.33, 19.50, 19.33, 19.41, 19.39 ms.  Throughput: 518.15 iter/sec.
Timings for 3456K FFT length (10 cores, 1 worker):  1.19 ms.  Throughput: 842.58 iter/sec.
Timings for 3456K FFT length (10 cores, 2 workers):  3.41,  3.25 ms.  Throughput: 600.34 iter/sec.
Timings for 3456K FFT length (10 cores, 10 workers): 19.91, 19.94, 19.86, 19.25, 20.12, 19.86, 19.86, 19.94, 19.88, 19.89 ms.  Throughput: 503.81 iter/sec.
Timings for 3600K FFT length (10 cores, 1 worker):  1.30 ms.  Throughput: 770.75 iter/sec.
Timings for 3600K FFT length (10 cores, 2 workers):  3.50,  3.50 ms.  Throughput: 571.12 iter/sec.
Timings for 3600K FFT length (10 cores, 10 workers): 20.37, 20.37, 20.37, 20.34, 20.37, 20.37, 20.26, 20.26, 20.37, 20.37 ms.  Throughput: 491.45 iter/sec.
Timings for 3840K FFT length (10 cores, 1 worker):  1.41 ms.  Throughput: 709.48 iter/sec.
Timings for 3840K FFT length (10 cores, 2 workers):  3.83,  3.82 ms.  Throughput: 522.78 iter/sec.
[Sun Nov 22 08:03:53 2020]
Timings for 3840K FFT length (10 cores, 10 workers): 21.31, 21.52, 21.38, 21.38, 21.47, 21.37, 21.34, 21.25, 21.41, 21.47 ms.  Throughput: 467.54 iter/sec.
Timings for 3920K FFT length (10 cores, 1 worker):  1.46 ms.  Throughput: 686.86 iter/sec.
Timings for 3920K FFT length (10 cores, 2 workers):  4.04,  4.04 ms.  Throughput: 494.65 iter/sec.
Timings for 3920K FFT length (10 cores, 10 workers): 22.72, 22.57, 22.56, 22.49, 22.55, 22.57, 22.53, 22.49, 22.59, 22.54 ms.  Throughput: 443.24 iter/sec.
Timings for 4032K FFT length (10 cores, 1 worker):  1.47 ms.  Throughput: 679.19 iter/sec.
Timings for 4032K FFT length (10 cores, 2 workers):  4.09,  4.09 ms.  Throughput: 489.04 iter/sec.
Timings for 4032K FFT length (10 cores, 10 workers): 24.21, 24.37, 24.06, 23.69, 24.33, 24.09, 24.00, 23.90, 24.13, 24.33 ms.  Throughput: 414.76 iter/sec.
Timings for 4200K FFT length (10 cores, 1 worker):  1.58 ms.  Throughput: 632.69 iter/sec.
Timings for 4200K FFT length (10 cores, 2 workers):  4.26,  4.22 ms.  Throughput: 471.50 iter/sec.
Timings for 4200K FFT length (10 cores, 10 workers): 23.78, 23.70, 23.65, 23.59, 23.78, 23.60, 23.67, 23.67, 23.70, 23.78 ms.  Throughput: 422.08 iter/sec.
Timings for 4320K FFT length (10 cores, 1 worker):  1.69 ms.  Throughput: 590.92 iter/sec.
Timings for 4320K FFT length (10 cores, 2 workers):  4.42,  4.42 ms.  Throughput: 452.10 iter/sec.
Timings for 4320K FFT length (10 cores, 10 workers): 24.40, 24.39, 24.39, 24.27, 24.39, 24.40, 24.20, 24.32, 24.40, 24.40 ms.  Throughput: 410.59 iter/sec.
Timings for 4480K FFT length (10 cores, 1 worker):  1.81 ms.  Throughput: 553.65 iter/sec.
Timings for 4480K FFT length (10 cores, 2 workers):  4.68,  4.68 ms.  Throughput: 427.63 iter/sec.
Timings for 4480K FFT length (10 cores, 10 workers): 25.67, 25.67, 25.67, 25.50, 25.67, 25.67, 25.67, 25.56, 25.54, 25.67 ms.  Throughput: 390.21 iter/sec.
Timings for 4608K FFT length (10 cores, 1 worker):  1.91 ms.  Throughput: 522.41 iter/sec.
Timings for 4608K FFT length (10 cores, 2 workers):  4.75,  4.95 ms.  Throughput: 412.36 iter/sec.
Timings for 4608K FFT length (10 cores, 10 workers): 26.71, 26.79, 26.71, 26.77, 26.77, 26.77, 26.87, 25.83, 27.17, 26.78 ms.  Throughput: 374.38 iter/sec.
Timings for 4704K FFT length (10 cores, 1 worker):  1.96 ms.  Throughput: 509.37 iter/sec.
[Sun Nov 22 08:09:02 2020]
Timings for 4704K FFT length (10 cores, 2 workers):  4.92,  5.11 ms.  Throughput: 399.20 iter/sec.
Timings for 4704K FFT length (10 cores, 10 workers): 27.50, 27.50, 27.50, 26.91, 27.61, 27.50, 27.50, 27.55, 27.55, 27.55 ms.  Throughput: 364.11 iter/sec.
Timings for 4800K FFT length (10 cores, 1 worker):  2.24 ms.  Throughput: 446.78 iter/sec.
Timings for 4800K FFT length (10 cores, 2 workers):  5.45,  5.51 ms.  Throughput: 365.01 iter/sec.
Timings for 4800K FFT length (10 cores, 10 workers): 28.59, 27.35, 27.29, 27.39, 27.80, 27.60, 27.80, 27.35, 27.53, 28.05 ms.  Throughput: 361.40 iter/sec.
Timings for 5040K FFT length (10 cores, 1 worker):  2.10 ms.  Throughput: 476.41 iter/sec.
Timings for 5040K FFT length (10 cores, 2 workers):  5.42,  5.42 ms.  Throughput: 368.93 iter/sec.
Timings for 5040K FFT length (10 cores, 10 workers): 30.27, 30.09, 30.15, 29.77, 30.15, 29.98, 30.10, 29.96, 30.15, 30.27 ms.  Throughput: 332.35 iter/sec.
Timings for 5120K FFT length (10 cores, 1 worker):  2.19 ms.  Throughput: 456.28 iter/sec.
Timings for 5120K FFT length (10 cores, 2 workers):  5.71,  5.71 ms.  Throughput: 350.21 iter/sec.
Timings for 5120K FFT length (10 cores, 10 workers): 31.26, 30.96, 30.79, 30.68, 31.26, 30.87, 30.86, 30.86, 30.95, 31.26 ms.  Throughput: 322.87 iter/sec.
Timings for 5184K FFT length (10 cores, 1 worker):  2.24 ms.  Throughput: 446.51 iter/sec.
Timings for 5184K FFT length (10 cores, 2 workers):  5.70,  5.69 ms.  Throughput: 351.17 iter/sec.
Timings for 5184K FFT length (10 cores, 10 workers): 30.54, 30.53, 30.66, 30.38, 30.60, 30.67, 30.66, 30.41, 30.66, 30.67 ms.  Throughput: 327.02 iter/sec.
Timings for 5376K FFT length (10 cores, 1 worker):  2.39 ms.  Throughput: 417.94 iter/sec.
Timings for 5376K FFT length (10 cores, 2 workers):  5.91,  5.91 ms.  Throughput: 338.27 iter/sec.
Timings for 5376K FFT length (10 cores, 10 workers): 32.31, 32.52, 32.18, 31.93, 32.45, 32.23, 32.17, 32.11, 32.37, 32.47 ms.  Throughput: 309.86 iter/sec.
Timings for 5760K FFT length (10 cores, 1 worker):  2.90 ms.  Throughput: 344.81 iter/sec.
Timings for 5760K FFT length (10 cores, 2 workers):  6.85,  6.85 ms.  Throughput: 291.83 iter/sec.
Timings for 5760K FFT length (10 cores, 10 workers): 33.58, 33.37, 33.29, 32.72, 33.02, 32.89, 33.00, 32.95, 33.18, 33.25 ms.  Throughput: 301.91 iter/sec.
[Sun Nov 22 08:14:12 2020]
Timings for 6048K FFT length (10 cores, 1 worker):  2.84 ms.  Throughput: 351.93 iter/sec.
Timings for 6048K FFT length (10 cores, 2 workers):  6.77,  6.75 ms.  Throughput: 296.01 iter/sec.
Timings for 6048K FFT length (10 cores, 10 workers): 36.29, 36.30, 36.19, 36.49, 35.93, 36.21, 36.17, 36.02, 36.21, 36.37 ms.  Throughput: 276.11 iter/sec.
Timings for 6144K FFT length (10 cores, 1 worker):  2.95 ms.  Throughput: 339.04 iter/sec.
Timings for 6144K FFT length (10 cores, 2 workers):  7.05,  7.04 ms.  Throughput: 283.89 iter/sec.
Timings for 6144K FFT length (10 cores, 10 workers): 37.96, 37.94, 37.72, 37.39, 38.06, 37.91, 37.93, 37.67, 37.97, 38.25 ms.  Throughput: 264.01 iter/sec.
Timings for 6272K FFT length (10 cores, 1 worker):  3.00 ms.  Throughput: 333.52 iter/sec.
Timings for 6272K FFT length (10 cores, 2 workers):  7.15,  7.15 ms.  Throughput: 279.82 iter/sec.
Timings for 6272K FFT length (10 cores, 10 workers): 38.65, 38.95, 38.35, 38.06, 38.84, 38.48, 38.35, 38.27, 38.64, 38.85 ms.  Throughput: 259.46 iter/sec.
Timings for 6400K FFT length (10 cores, 1 worker):  3.18 ms.  Throughput: 314.58 iter/sec.
Timings for 6400K FFT length (10 cores, 2 workers):  7.34,  7.23 ms.  Throughput: 274.62 iter/sec.
Timings for 6400K FFT length (10 cores, 10 workers): 38.34, 38.45, 38.28, 37.93, 38.48, 38.55, 38.40, 38.32, 38.46, 38.38 ms.  Throughput: 260.70 iter/sec.
Timings for 6720K FFT length (10 cores, 1 worker):  3.36 ms.  Throughput: 298.00 iter/sec.
Timings for 6720K FFT length (10 cores, 2 workers):  7.70,  7.70 ms.  Throughput: 259.66 iter/sec.
Timings for 6720K FFT length (10 cores, 10 workers): 40.31, 40.22, 40.14, 39.95, 40.23, 40.18, 40.16, 40.03, 40.26, 40.45 ms.  Throughput: 248.81 iter/sec.
Timings for 7056K FFT length (10 cores, 1 worker):  3.54 ms.  Throughput: 282.21 iter/sec.
Timings for 7056K FFT length (10 cores, 2 workers):  8.07,  8.07 ms.  Throughput: 247.80 iter/sec.
Timings for 7056K FFT length (10 cores, 10 workers): 42.08, 42.20, 42.05, 41.65, 41.89, 42.04, 42.09, 41.86, 42.15, 42.55 ms.  Throughput: 237.79 iter/sec.
Timings for 7168K FFT length (10 cores, 1 worker):  3.64 ms.  Throughput: 274.72 iter/sec.
Timings for 7168K FFT length (10 cores, 2 workers):  8.38,  8.39 ms.  Throughput: 238.60 iter/sec.
[Sun Nov 22 08:19:26 2020]
Timings for 7168K FFT length (10 cores, 10 workers): 44.38, 44.51, 44.26, 43.75, 44.67, 44.25, 44.24, 44.04, 44.34, 44.78 ms.  Throughput: 225.62 iter/sec.
Timings for 7200K FFT length (10 cores, 1 worker):  3.60 ms.  Throughput: 277.47 iter/sec.
Timings for 7200K FFT length (10 cores, 2 workers):  8.31,  8.30 ms.  Throughput: 240.86 iter/sec.
Timings for 7200K FFT length (10 cores, 10 workers): 44.51, 44.91, 44.36, 43.83, 44.64, 44.38, 44.26, 44.19, 44.42, 44.78 ms.  Throughput: 225.09 iter/sec.
Timings for 7680K FFT length (10 cores, 1 worker):  3.96 ms.  Throughput: 252.74 iter/sec.
Timings for 7680K FFT length (10 cores, 2 workers):  8.94,  8.94 ms.  Throughput: 223.78 iter/sec.
Timings for 7680K FFT length (10 cores, 10 workers): 46.83, 46.98, 46.59, 45.89, 46.98, 46.65, 46.54, 46.39, 46.73, 46.99 ms.  Throughput: 214.34 iter/sec.
Timings for 8064K FFT length (10 cores, 1 worker):  4.37 ms.  Throughput: 228.87 iter/sec.
Timings for 8064K FFT length (10 cores, 2 workers):  9.34,  9.55 ms.  Throughput: 211.75 iter/sec.
Timings for 8064K FFT length (10 cores, 10 workers): 48.23, 48.30, 48.23, 48.34, 48.45, 48.14, 48.38, 48.04, 48.32, 48.64 ms.  Throughput: 207.01 iter/sec.
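
For anyone who wants to compare such listings programmatically, here is a minimal sketch that parses a "Timings for" line and recomputes the throughput. It assumes the throughput figure is the sum of each worker's iteration rate (1000 / ms); the recomputed value only approximately matches the printed one because the per-worker times are rounded to two decimals. The function names are made up for this example.

```python
import re

# Matches benchmark lines such as:
# "Timings for 2048K FFT length (10 cores, 2 workers):  1.45,  1.34 ms.  Throughput: 1433.85 iter/sec."
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length "
    r"\((\d+) cores?(?: hyperthreaded)?, (\d+) workers?\):"
    r"\s*([\d.,\s]+?) ms\.\s+Throughput: ([\d.]+) iter/sec\."
)


def parse_timing(line):
    """Parse one benchmark line into a dict, or return None if it is not one."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "fft_k": int(m.group(1)),
        "cores": int(m.group(2)),
        "workers": int(m.group(3)),
        "times_ms": [float(x) for x in m.group(4).split(",")],
        "throughput": float(m.group(5)),
    }


def recomputed_throughput(times_ms):
    """ASSUMPTION: total throughput = sum over workers of 1000 / (ms per iteration)."""
    return sum(1000.0 / t for t in times_ms)


if __name__ == "__main__":
    line = ("Timings for 2048K FFT length (10 cores, 2 workers):  1.45,  1.34 ms.  "
            "Throughput: 1433.85 iter/sec.")
    rec = parse_timing(line)
    print(rec["fft_k"], rec["workers"], rec["throughput"],
          round(recomputed_throughput(rec["times_ms"]), 2))
```

Grouping the parsed records by FFT length makes it easy to see that, in this listing, a single worker using all 10 cores gives the highest throughput at every size.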

I run a rig with dual CPUs (e-2690 v0)
MB: Asus Z9PR-D12
32 GB DDR3 1333MHz (quad channel)
I noticed that my machine's throughput is close to that of your machine.
Since I am new to this, I don't understand why.
I have 20MB of L3 cache per CPU, 40MB in total.
The price of the Intel 10900X (CPU alone) is more than I spent on my rig (about twice as much).

Does anyone have any ideas?

[Mon Nov 23 23:44:33 2020]
FFTlen=6144K, Type=3, Arch=3, Pass1=384, Pass2=16384, clm=4 (16 cores, 2 workers): 7.31, 7.08 ms. Throughput: 278.07 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=384, Pass2=16384, clm=2 (16 cores, 2 workers): 7.21, 6.99 ms. Throughput: 281.81 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=384, Pass2=16384, clm=1 (16 cores, 2 workers): 7.13, 6.91 ms. Throughput: 284.96 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=512, Pass2=12288, clm=4 (16 cores, 2 workers): 7.63, 7.13 ms. Throughput: 271.32 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=512, Pass2=12288, clm=2 (16 cores, 2 workers): 8.22, 6.91 ms. Throughput: 266.38 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=512, Pass2=12288, clm=1 (16 cores, 2 workers): 7.26, 7.04 ms. Throughput: 279.82 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=768, Pass2=8192, clm=4 (16 cores, 2 workers): 7.38, 7.13 ms. Throughput: 275.91 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=768, Pass2=8192, clm=2 (16 cores, 2 workers): 6.96, 6.81 ms. Throughput: 290.45 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=768, Pass2=8192, clm=1 (16 cores, 2 workers): 6.92, 6.76 ms. Throughput: 292.39 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=1024, Pass2=6144, clm=4 (16 cores, 2 workers): 7.70, 7.43 ms. Throughput: 264.47 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=1024, Pass2=6144, clm=2 (16 cores, 2 workers): 7.26, 7.03 ms. Throughput: 279.96 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=1024, Pass2=6144, clm=1 (16 cores, 2 workers): 7.09, 6.90 ms. Throughput: 286.00 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=1536, Pass2=4096, clm=4 (16 cores, 2 workers): 7.71, 7.37 ms. Throughput: 265.38 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=1536, Pass2=4096, clm=2 (16 cores, 2 workers): 7.76, 7.36 ms. Throughput: 264.74 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=1536, Pass2=4096, clm=1 (16 cores, 2 workers): 8.07, 7.71 ms. Throughput: 253.61 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=2048, Pass2=3072, clm=4 (16 cores, 2 workers): 8.03, 7.46 ms. Throughput: 258.62 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=2048, Pass2=3072, clm=2 (16 cores, 2 workers): 8.00, 7.68 ms. Throughput: 255.23 iter/sec.
FFTlen=6144K, Type=3, Arch=3, Pass1=2048, Pass2=3072, clm=1 (16 cores, 2 workers): 8.43, 8.10 ms. Throughput: 242.04 iter/sec.
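
A listing like this is easier to rank with a few lines of code. The sketch below parses the "FFTlen=..." lines and returns the implementation with the highest throughput; the three sample lines are copied from the log above, and the helper name is made up for this example.

```python
import re

# Matches per-implementation benchmark lines from the log above.
IMPL_RE = re.compile(
    r"FFTlen=(?P<fft_k>\d+)K, Type=(?P<type>\d+), Arch=(?P<arch>\d+), "
    r"Pass1=(?P<pass1>\d+), Pass2=(?P<pass2>\d+), clm=(?P<clm>\d+) "
    r"\((?P<setup>[^)]*)\): (?P<times>[\d., ]+) ms\.\s+"
    r"Throughput: (?P<throughput>[\d.]+) iter/sec\."
)

SAMPLE = [
    "FFTlen=6144K, Type=3, Arch=3, Pass1=384, Pass2=16384, clm=1 (16 cores, 2 workers): 7.13, 6.91 ms. Throughput: 284.96 iter/sec.",
    "FFTlen=6144K, Type=3, Arch=3, Pass1=768, Pass2=8192, clm=1 (16 cores, 2 workers): 6.92, 6.76 ms. Throughput: 292.39 iter/sec.",
    "FFTlen=6144K, Type=3, Arch=3, Pass1=2048, Pass2=3072, clm=1 (16 cores, 2 workers): 8.43, 8.10 ms. Throughput: 242.04 iter/sec.",
]


def best_implementation(lines):
    """Return the parsed entry with the highest throughput, or None if nothing parses."""
    best = None
    for line in lines:
        m = IMPL_RE.search(line)
        if m is None:
            continue  # skip timestamps and other non-benchmark lines
        entry = {
            "fft_k": int(m.group("fft_k")),
            "pass1": int(m.group("pass1")),
            "pass2": int(m.group("pass2")),
            "clm": int(m.group("clm")),
            "throughput": float(m.group("throughput")),
        }
        if best is None or entry["throughput"] > best["throughput"]:
            best = entry
    return best


if __name__ == "__main__":
    print(best_implementation(SAMPLE))
```

Fed the full 6144K listing, this picks Pass1=768, Pass2=8192, clm=1 at 292.39 iter/sec.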
Old 2020-11-27, 23:13   #829
NookieN
 
Aug 2002

2×29 Posts

Quote:
Originally Posted by wagner85
I run a rig with dual CPUs (e-2690 v0)
MB: Asus Z9PR-D12
32 GB DDR3 1333MHz (quad channel)
I noticed that my machine's throughput is close to that of your machine.
Since I am new to this, I don't understand why.
I have 20MB of L3 cache per CPU, 40MB in total.
The price of the Intel 10900X (CPU alone) is more than I spent on my rig (about twice as much).

Does anyone have any ideas?
I think the short answer is memory bandwidth. That dual-socket system is quad-channel per socket (assuming you have 8x4GB), so with DDR3-1333 that's ~85GB/s total theoretical system memory bandwidth, in the same ballpark as the ~100GB/s of the X299 board.
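
The arithmetic behind that comparison can be sketched as follows. It assumes the usual 64-bit (8-byte) DDR channel width, that every channel is populated and running at its rated speed, and (for the dual-socket box) that each socket works mostly out of its own local memory. These are theoretical peaks, not measured figures.

```python
def peak_bandwidth_gbs(mt_per_s, channels, sockets=1, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s.

    mt_per_s  : memory speed in megatransfers per second (e.g. 3200 for DDR4-3200)
    channels  : populated memory channels per socket
    sockets   : number of CPU sockets
    bus_bytes : bytes moved per transfer per channel (ASSUMPTION: 64-bit channel)
    """
    return mt_per_s * 1e6 * bus_bytes * channels * sockets / 1e9


if __name__ == "__main__":
    x299 = peak_bandwidth_gbs(3200, channels=4)                  # 10900X, DDR4-3200
    dual_xeon = peak_bandwidth_gbs(1333, channels=4, sockets=2)  # 2 sockets, DDR3-1333
    print(f"X299 quad-channel DDR4-3200:   {x299:.1f} GB/s")
    print(f"Dual-socket quad-ch DDR3-1333: {dual_xeon:.1f} GB/s")
```

The two systems land within roughly 20% of each other on paper, which fits the similar throughput seen at large FFT lengths.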
Old 2020-11-28, 00:50   #830
wagner85
 
Aug 2020

5² Posts
Thumbs up :)

Yes, I have almost all RAM slots populated.
8x4.
Interesting.
Old 2020-11-28, 16:40   #831
Ensigm
 
Aug 2020

2·3·19 Posts
Benchmarking for P-1 Stage 2

I observe that the FFT implementation that has the best performance in benchmark is usually the best at P-1 stage 1, but not always so at P-1 stage 2.

For example, on a certain single-core CPU, the mprime benchmark shows Pass1=512, Pass2=6K, clm=1, 2 threads to be slightly faster than Pass1=768, Pass2=4K, clm=2, 2 threads at FMA3 FFT length 3M.
Quote:
FFTlen=3072K, Type=3, Arch=4, Pass1=512, Pass2=6144, clm=1 (1 core hyperthreaded, 1 worker): 16.24 ms. Throughput: 61.56 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=768, Pass2=4096, clm=2 (1 core hyperthreaded, 1 worker): 17.24 ms. Throughput: 57.99 iter/sec.
Indeed, the Stage 1 P-1 performance is roughly consistent with the benchmark, although the gap between the two implementations is much smaller.
Quote:
[Work thread Nov 21 20:29] P-1 on M56054833 with B1=1870000, B2=48900000
[Work thread Nov 21 20:29] Chance of finding a factor is an estimated 5.53%
[Work thread Nov 21 20:29] Using FMA3 FFT length 3M, Pass1=512, Pass2=6K, clm=1, 2 threads
[Work thread Nov 21 20:29] M56054833 stage 1 is 7.88% complete.
[Work thread Nov 21 20:31] M56054833 stage 1 is 8.22% complete. Time: 153.159 sec.
[Work thread Nov 21 20:34] M56054833 stage 1 is 8.55% complete. Time: 153.846 sec.
[Work thread Nov 21 20:37] M56054833 stage 1 is 8.88% complete. Time: 159.332 sec.
[Work thread Nov 21 20:39] M56054833 stage 1 is 9.22% complete. Time: 156.763 sec.
[Work thread Nov 21 20:42] M56054833 stage 1 is 9.55% complete. Time: 154.971 sec.
[Work thread Nov 21 20:44] M56054833 stage 1 is 9.88% complete. Time: 154.494 sec.
excluding 1 warmup run, avg. = 155.881
Quote:
[Work thread Nov 21 20:08] P-1 on M56054833 with B1=1870000, B2=48900000
[Work thread Nov 21 20:08] Chance of finding a factor is an estimated 5.53%
[Work thread Nov 21 20:08] Using FMA3 FFT length 3M, Pass1=768, Pass2=4K, clm=2, 2 threads
[Work thread Nov 21 20:09] M56054833 stage 1 is 5.34% complete.
[Work thread Nov 21 20:11] M56054833 stage 1 is 5.67% complete. Time: 155.436 sec.
[Work thread Nov 21 20:14] M56054833 stage 1 is 6.00% complete. Time: 156.521 sec.
[Work thread Nov 21 20:16] M56054833 stage 1 is 6.34% complete. Time: 157.097 sec.
[Work thread Nov 21 20:19] M56054833 stage 1 is 6.67% complete. Time: 157.518 sec.
[Work thread Nov 21 20:22] M56054833 stage 1 is 7.00% complete. Time: 155.897 sec.
[Work thread Nov 21 20:24] M56054833 stage 1 is 7.34% complete. Time: 157.303 sec.
excluding 1 warmup run, avg. = 156.867

But when it comes to Stage 2, it is Pass1=768, Pass2=4K, clm=2 that clearly performs better.
Quote:
[Work thread Nov 21 21:17] P-1 on M55974101 with B1=1836000, B2=48000000
[Work thread Nov 21 21:17] Chance of finding a factor is an estimated 5.5%
[Work thread Nov 21 21:17] Using FMA3 FFT length 3M, Pass1=512, Pass2=6K, clm=1, 2 threads
[Work thread Nov 21 21:17] Using 9216MB of memory. Processing 363 relative primes (0 of 960 already processed).
[Work thread Nov 21 21:23] M55974101 stage 2 is 11.01% complete.
[Work thread Nov 21 21:26] M55974101 stage 2 is 11.38% complete. Time: 208.500 sec.
[Work thread Nov 21 21:30] M55974101 stage 2 is 11.74% complete. Time: 211.054 sec.
[Work thread Nov 21 21:33] M55974101 stage 2 is 12.11% complete. Time: 209.971 sec.
[Work thread Nov 21 21:37] M55974101 stage 2 is 12.47% complete. Time: 211.080 sec.
[Work thread Nov 21 21:40] M55974101 stage 2 is 12.83% complete. Time: 211.593 sec.
[Work thread Nov 21 21:44] M55974101 stage 2 is 13.20% complete. Time: 208.985 sec.
[Work thread Nov 21 21:47] M55974101 stage 2 is 13.56% complete. Time: 210.307 sec.
excluding 2 warmup runs, avg. = 210.387.
Quote:
[Work thread Nov 21 20:46] P-1 on M55974101 with B1=1836000, B2=48000000
[Work thread Nov 21 20:46] Chance of finding a factor is an estimated 5.5%
[Work thread Nov 21 20:46] Using FMA3 FFT length 3M, Pass1=768, Pass2=4K, clm=2, 2 threads
[Work thread Nov 21 20:46] Using 9211MB of memory. Processing 363 relative primes (0 of 960 already processed).
[Work thread Nov 21 20:51] M55974101 stage 2 is 8.19% complete.
[Work thread Nov 21 20:55] M55974101 stage 2 is 8.55% complete. Time: 193.932 sec.
[Work thread Nov 21 20:58] M55974101 stage 2 is 8.92% complete. Time: 196.700 sec.
[Work thread Nov 21 21:01] M55974101 stage 2 is 9.28% complete. Time: 194.192 sec.
[Work thread Nov 21 21:04] M55974101 stage 2 is 9.65% complete. Time: 195.121 sec.
[Work thread Nov 21 21:08] M55974101 stage 2 is 10.01% complete. Time: 196.390 sec.
[Work thread Nov 21 21:11] M55974101 stage 2 is 10.37% complete. Time: 193.121 sec.
[Work thread Nov 21 21:14] M55974101 stage 2 is 10.74% complete. Time: 193.990 sec.
excluding 2 warmup runs, avg. = 194.563.
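
The averages quoted in these logs can be reproduced with a short script. This is a minimal sketch: it pulls the "Time: ... sec." values out of the work-thread lines and averages them after dropping a chosen number of warm-up intervals. The function names are made up for this example, and the sample lines are the Pass1=768 stage 2 log from above.

```python
import re

TIME_RE = re.compile(r"Time: ([\d.]+) sec\.")

LOG = """\
[Work thread Nov 21 20:51] M55974101 stage 2 is 8.19% complete.
[Work thread Nov 21 20:55] M55974101 stage 2 is 8.55% complete. Time: 193.932 sec.
[Work thread Nov 21 20:58] M55974101 stage 2 is 8.92% complete. Time: 196.700 sec.
[Work thread Nov 21 21:01] M55974101 stage 2 is 9.28% complete. Time: 194.192 sec.
[Work thread Nov 21 21:04] M55974101 stage 2 is 9.65% complete. Time: 195.121 sec.
[Work thread Nov 21 21:08] M55974101 stage 2 is 10.01% complete. Time: 196.390 sec.
[Work thread Nov 21 21:11] M55974101 stage 2 is 10.37% complete. Time: 193.121 sec.
[Work thread Nov 21 21:14] M55974101 stage 2 is 10.74% complete. Time: 193.990 sec.
""".splitlines()


def interval_times(log_lines):
    """Extract the per-interval times (seconds) from work-thread log lines."""
    times = []
    for line in log_lines:
        m = TIME_RE.search(line)
        if m:  # lines without a "Time:" field are skipped
            times.append(float(m.group(1)))
    return times


def mean_excluding_warmup(times, warmup=1):
    """Average the interval times after dropping the first `warmup` entries."""
    kept = times[warmup:]
    if not kept:
        raise ValueError("no intervals left after dropping warm-up runs")
    return sum(kept) / len(kept)


if __name__ == "__main__":
    print(f"{mean_excluding_warmup(interval_times(LOG), warmup=2):.3f}")
```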

I don't know exactly what causes this, but I find it very interesting. Stage 2 of P-1 uses large amounts of memory, so it would not be surprising if memory bandwidth affects stage 1 and stage 2 differently. In practice, this suggests that P-1 could sometimes be made more efficient by using a different FFT implementation for each stage, choosing whichever is fastest for that stage.
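
As a sketch of what per-stage selection would look like, the snippet below takes the interval times from the four logs above (warm-up intervals already dropped) and picks the implementation with the lowest mean time for each stage. It only illustrates the selection logic; whether mprime can actually be told to switch FFT implementations between stages is not something these logs establish.

```python
from statistics import mean

# Interval times in seconds, copied from the four logs above with the
# warm-up intervals already dropped (1 for stage 1, 2 for stage 2).
RUNS = {
    "stage 1": {
        "Pass1=512, Pass2=6K, clm=1": [153.846, 159.332, 156.763, 154.971, 154.494],
        "Pass1=768, Pass2=4K, clm=2": [156.521, 157.097, 157.518, 155.897, 157.303],
    },
    "stage 2": {
        "Pass1=512, Pass2=6K, clm=1": [209.971, 211.080, 211.593, 208.985, 210.307],
        "Pass1=768, Pass2=4K, clm=2": [194.192, 195.121, 196.390, 193.121, 193.990],
    },
}


def fastest_per_stage(runs):
    """Return {stage: (implementation, mean seconds)} for the lowest mean time."""
    choice = {}
    for stage, impls in runs.items():
        name = min(impls, key=lambda n: mean(impls[n]))
        choice[stage] = (name, mean(impls[name]))
    return choice


if __name__ == "__main__":
    for stage, (name, secs) in fastest_per_stage(RUNS).items():
        print(f"{stage}: {name} ({secs:.3f} s per interval)")
```

With only five intervals per run the stage 1 gap is small enough that it could be noise, while the stage 2 gap is far larger than the spread within either run.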

Last fiddled with by Ensigm on 2020-11-28 at 16:48
Old 2020-11-30, 23:18   #832
Xyzzy
 
"Mike"
Aug 2002

2²·5·397 Posts

Here are basic timings for a 5600X at the default power level with DDR4-3200C14-DR memory.

If any particular FFT length is interesting to you just ask and we will run a more thorough timing for it.
Code:
AMD Ryzen 5 5600X 6-Core Processor             
 CPU speed: 4649.98 MHz, 6 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 6x32 KB, L2 cache size: 6x512 KB, L3 cache size: 32 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
 Machine#0 (total=30343520KB, Backend=Windows, hwlocVersion=2.0.4, ProcessName=prime95.exe)
  Package (total=30343520KB, CPUVendor=AuthenticAMD, CPUFamilyNumber=25, CPUModelNumber=33, CPUModel="AMD Ryzen 5 5600X 6-Core Processor             ", CPUStepping=0)
    L3 (size=32768KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000003)
            PU#0 (cpuset: 0x00000001)
            PU#1 (cpuset: 0x00000002)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000000c)
            PU#2 (cpuset: 0x00000004)
            PU#3 (cpuset: 0x00000008)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000030)
            PU#4 (cpuset: 0x00000010)
            PU#5 (cpuset: 0x00000020)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x000000c0)
            PU#6 (cpuset: 0x00000040)
            PU#7 (cpuset: 0x00000080)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000300)
            PU#8 (cpuset: 0x00000100)
            PU#9 (cpuset: 0x00000200)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000c00)
            PU#10 (cpuset: 0x00000400)
            PU#11 (cpuset: 0x00000800)
Prime95 64-bit version 29.8, RdtscTiming=1
Timings for 64K FFT length (6 cores, 1 worker):  0.05 ms.  Throughput: 21888.32 iter/sec.
Timings for 64K FFT length (6 cores hyperthreaded, 1 worker):  0.07 ms.  Throughput: 14558.48 iter/sec.
Timings for 72K FFT length (6 cores, 1 worker):  0.06 ms.  Throughput: 17938.48 iter/sec.
Timings for 72K FFT length (6 cores hyperthreaded, 1 worker):  0.11 ms.  Throughput: 9120.66 iter/sec.
Timings for 80K FFT length (6 cores, 1 worker):  0.06 ms.  Throughput: 17557.29 iter/sec.
Timings for 80K FFT length (6 cores hyperthreaded, 1 worker):  0.09 ms.  Throughput: 10748.22 iter/sec.
Timings for 84K FFT length (6 cores, 1 worker):  0.07 ms.  Throughput: 15355.83 iter/sec.
Timings for 84K FFT length (6 cores hyperthreaded, 1 worker):  0.12 ms.  Throughput: 8128.39 iter/sec.
Timings for 96K FFT length (6 cores, 1 worker):  0.07 ms.  Throughput: 15209.24 iter/sec.
Timings for 96K FFT length (6 cores hyperthreaded, 1 worker):  0.11 ms.  Throughput: 9420.69 iter/sec.
Timings for 100K FFT length (6 cores, 1 worker):  0.07 ms.  Throughput: 14941.21 iter/sec.
Timings for 100K FFT length (6 cores hyperthreaded, 1 worker):  0.09 ms.  Throughput: 11344.65 iter/sec.
Timings for 112K FFT length (6 cores, 1 worker):  0.07 ms.  Throughput: 13669.07 iter/sec.
Timings for 112K FFT length (6 cores hyperthreaded, 1 worker):  0.09 ms.  Throughput: 10705.88 iter/sec.
Timings for 120K FFT length (6 cores, 1 worker):  0.08 ms.  Throughput: 12768.55 iter/sec.
Timings for 120K FFT length (6 cores hyperthreaded, 1 worker):  0.10 ms.  Throughput: 9914.96 iter/sec.
Timings for 128K FFT length (6 cores, 1 worker):  0.08 ms.  Throughput: 12613.72 iter/sec.
Timings for 128K FFT length (6 cores hyperthreaded, 1 worker):  0.11 ms.  Throughput: 8964.43 iter/sec.
Timings for 140K FFT length (6 cores, 1 worker):  0.09 ms.  Throughput: 10637.14 iter/sec.
Timings for 140K FFT length (6 cores hyperthreaded, 1 worker):  0.12 ms.  Throughput: 8610.30 iter/sec.
Timings for 144K FFT length (6 cores, 1 worker):  0.09 ms.  Throughput: 10750.74 iter/sec.
Timings for 144K FFT length (6 cores hyperthreaded, 1 worker):  0.16 ms.  Throughput: 6323.37 iter/sec.
Timings for 160K FFT length (6 cores, 1 worker):  0.10 ms.  Throughput: 10149.33 iter/sec.
Timings for 160K FFT length (6 cores hyperthreaded, 1 worker):  0.12 ms.  Throughput: 8213.02 iter/sec.
Timings for 168K FFT length (6 cores, 1 worker):  0.11 ms.  Throughput: 8993.45 iter/sec.
Timings for 168K FFT length (6 cores hyperthreaded, 1 worker):  0.18 ms.  Throughput: 5451.80 iter/sec.
Timings for 192K FFT length (6 cores, 1 worker):  0.12 ms.  Throughput: 8635.49 iter/sec.
Timings for 192K FFT length (6 cores hyperthreaded, 1 worker):  0.14 ms.  Throughput: 7008.71 iter/sec.
Timings for 200K FFT length (6 cores, 1 worker):  0.12 ms.  Throughput: 8436.61 iter/sec.
Timings for 200K FFT length (6 cores hyperthreaded, 1 worker):  0.14 ms.  Throughput: 7111.67 iter/sec.
Timings for 224K FFT length (6 cores, 1 worker):  0.14 ms.  Throughput: 7225.68 iter/sec.
Timings for 224K FFT length (6 cores hyperthreaded, 1 worker):  0.19 ms.  Throughput: 5314.01 iter/sec.
Timings for 240K FFT length (6 cores, 1 worker):  0.14 ms.  Throughput: 7005.25 iter/sec.
Timings for 240K FFT length (6 cores hyperthreaded, 1 worker):  0.15 ms.  Throughput: 6547.29 iter/sec.
Timings for 256K FFT length (6 cores, 1 worker):  0.14 ms.  Throughput: 7213.37 iter/sec.
Timings for 256K FFT length (6 cores hyperthreaded, 1 worker):  0.15 ms.  Throughput: 6530.21 iter/sec.
Timings for 288K FFT length (6 cores, 1 worker):  0.17 ms.  Throughput: 5990.37 iter/sec.
Timings for 288K FFT length (6 cores hyperthreaded, 1 worker):  0.18 ms.  Throughput: 5449.48 iter/sec.
Timings for 320K FFT length (6 cores, 1 worker):  0.18 ms.  Throughput: 5642.96 iter/sec.
Timings for 320K FFT length (6 cores hyperthreaded, 1 worker):  0.20 ms.  Throughput: 4996.88 iter/sec.
Timings for 336K FFT length (6 cores, 1 worker):  0.20 ms.  Throughput: 4915.37 iter/sec.
Timings for 336K FFT length (6 cores hyperthreaded, 1 worker):  0.22 ms.  Throughput: 4574.27 iter/sec.
Timings for 384K FFT length (6 cores, 1 worker):  0.21 ms.  Throughput: 4752.30 iter/sec.
Timings for 384K FFT length (6 cores hyperthreaded, 1 worker):  0.24 ms.  Throughput: 4218.52 iter/sec.
Timings for 400K FFT length (6 cores, 1 worker):  0.22 ms.  Throughput: 4496.52 iter/sec.
Timings for 400K FFT length (6 cores hyperthreaded, 1 worker):  0.25 ms.  Throughput: 4027.93 iter/sec.
Timings for 448K FFT length (6 cores, 1 worker):  0.25 ms.  Throughput: 3972.51 iter/sec.
Timings for 448K FFT length (6 cores hyperthreaded, 1 worker):  0.28 ms.  Throughput: 3513.83 iter/sec.
Timings for 480K FFT length (6 cores, 1 worker):  0.26 ms.  Throughput: 3798.82 iter/sec.
Timings for 480K FFT length (6 cores hyperthreaded, 1 worker):  0.29 ms.  Throughput: 3416.21 iter/sec.
Timings for 512K FFT length (6 cores, 1 worker):  0.28 ms.  Throughput: 3575.94 iter/sec.
Timings for 512K FFT length (6 cores hyperthreaded, 1 worker):  0.29 ms.  Throughput: 3437.17 iter/sec.
Timings for 560K FFT length (6 cores, 1 worker):  0.32 ms.  Throughput: 3118.77 iter/sec.
Timings for 560K FFT length (6 cores hyperthreaded, 1 worker):  0.35 ms.  Throughput: 2848.29 iter/sec.
Timings for 576K FFT length (6 cores, 1 worker):  0.32 ms.  Throughput: 3102.93 iter/sec.
Timings for 576K FFT length (6 cores hyperthreaded, 1 worker):  0.33 ms.  Throughput: 2996.06 iter/sec.
Timings for 640K FFT length (6 cores, 1 worker):  0.36 ms.  Throughput: 2764.82 iter/sec.
Timings for 640K FFT length (6 cores hyperthreaded, 1 worker):  0.37 ms.  Throughput: 2673.04 iter/sec.
Timings for 672K FFT length (6 cores, 1 worker):  0.39 ms.  Throughput: 2568.59 iter/sec.
Timings for 672K FFT length (6 cores hyperthreaded, 1 worker):  0.41 ms.  Throughput: 2462.90 iter/sec.
Timings for 720K FFT length (6 cores, 1 worker):  0.41 ms.  Throughput: 2462.97 iter/sec.
Timings for 720K FFT length (6 cores hyperthreaded, 1 worker):  0.43 ms.  Throughput: 2323.71 iter/sec.
Timings for 768K FFT length (6 cores, 1 worker):  0.42 ms.  Throughput: 2368.56 iter/sec.
Timings for 768K FFT length (6 cores hyperthreaded, 1 worker):  0.43 ms.  Throughput: 2337.10 iter/sec.
Timings for 800K FFT length (6 cores, 1 worker):  0.44 ms.  Throughput: 2297.36 iter/sec.
Timings for 800K FFT length (6 cores hyperthreaded, 1 worker):  0.45 ms.  Throughput: 2222.44 iter/sec.
Timings for 864K FFT length (6 cores, 1 worker):  0.48 ms.  Throughput: 2099.57 iter/sec.
Timings for 864K FFT length (6 cores hyperthreaded, 1 worker):  0.50 ms.  Throughput: 2000.37 iter/sec.
Timings for 896K FFT length (6 cores, 1 worker):  0.51 ms.  Throughput: 1958.96 iter/sec.
Timings for 896K FFT length (6 cores hyperthreaded, 1 worker):  0.52 ms.  Throughput: 1927.95 iter/sec.
Timings for 960K FFT length (6 cores, 1 worker):  0.51 ms.  Throughput: 1945.51 iter/sec.
Timings for 960K FFT length (6 cores hyperthreaded, 1 worker):  0.53 ms.  Throughput: 1869.43 iter/sec.
Timings for 1024K FFT length (6 cores, 1 worker):  0.56 ms.  Throughput: 1787.88 iter/sec.
Timings for 1024K FFT length (6 cores hyperthreaded, 1 worker):  0.59 ms.  Throughput: 1707.65 iter/sec.
Timings for 1120K FFT length (6 cores, 1 worker):  0.63 ms.  Throughput: 1591.93 iter/sec.
Timings for 1120K FFT length (6 cores hyperthreaded, 1 worker):  0.65 ms.  Throughput: 1540.94 iter/sec.
Timings for 1152K FFT length (6 cores, 1 worker):  0.62 ms.  Throughput: 1604.97 iter/sec.
Timings for 1152K FFT length (6 cores hyperthreaded, 1 worker):  0.66 ms.  Throughput: 1519.62 iter/sec.
Timings for 1200K FFT length (6 cores, 1 worker):  0.67 ms.  Throughput: 1500.19 iter/sec.
Timings for 1200K FFT length (6 cores hyperthreaded, 1 worker):  0.70 ms.  Throughput: 1428.53 iter/sec.
Timings for 1280K FFT length (6 cores, 1 worker):  0.71 ms.  Throughput: 1412.05 iter/sec.
Timings for 1280K FFT length (6 cores hyperthreaded, 1 worker):  0.74 ms.  Throughput: 1355.82 iter/sec.
Timings for 1344K FFT length (6 cores, 1 worker):  0.76 ms.  Throughput: 1320.86 iter/sec.
Timings for 1344K FFT length (6 cores hyperthreaded, 1 worker):  0.80 ms.  Throughput: 1253.62 iter/sec.
Timings for 1440K FFT length (6 cores, 1 worker):  0.80 ms.  Throughput: 1253.99 iter/sec.
Timings for 1440K FFT length (6 cores hyperthreaded, 1 worker):  0.83 ms.  Throughput: 1200.22 iter/sec.
Timings for 1536K FFT length (6 cores, 1 worker):  0.84 ms.  Throughput: 1189.18 iter/sec.
Timings for 1536K FFT length (6 cores hyperthreaded, 1 worker):  0.87 ms.  Throughput: 1143.42 iter/sec.
Timings for 1600K FFT length (6 cores, 1 worker):  0.88 ms.  Throughput: 1140.09 iter/sec.
Timings for 1600K FFT length (6 cores hyperthreaded, 1 worker):  0.90 ms.  Throughput: 1114.65 iter/sec.
Timings for 1680K FFT length (6 cores, 1 worker):  0.97 ms.  Throughput: 1033.45 iter/sec.
Timings for 1680K FFT length (6 cores hyperthreaded, 1 worker):  1.00 ms.  Throughput: 1000.04 iter/sec.
Timings for 1728K FFT length (6 cores, 1 worker):  0.96 ms.  Throughput: 1039.41 iter/sec.
Timings for 1728K FFT length (6 cores hyperthreaded, 1 worker):  1.00 ms.  Throughput: 1000.16 iter/sec.
Timings for 1792K FFT length (6 cores, 1 worker):  1.04 ms.  Throughput: 964.05 iter/sec.
Timings for 1792K FFT length (6 cores hyperthreaded, 1 worker):  1.07 ms.  Throughput: 934.18 iter/sec.
Timings for 1920K FFT length (6 cores, 1 worker):  1.05 ms.  Throughput: 950.13 iter/sec.
Timings for 1920K FFT length (6 cores hyperthreaded, 1 worker):  1.10 ms.  Throughput: 905.62 iter/sec.
Timings for 2016K FFT length (6 cores, 1 worker):  1.16 ms.  Throughput: 864.23 iter/sec.
Timings for 2016K FFT length (6 cores hyperthreaded, 1 worker):  1.20 ms.  Throughput: 833.46 iter/sec.
Timings for 2048K FFT length (6 cores, 1 worker):  1.14 ms.  Throughput: 876.58 iter/sec.
Timings for 2048K FFT length (6 cores hyperthreaded, 1 worker):  1.18 ms.  Throughput: 847.42 iter/sec.
Timings for 2304K FFT length (6 cores, 1 worker):  1.25 ms.  Throughput: 799.02 iter/sec.
Timings for 2304K FFT length (6 cores hyperthreaded, 1 worker):  1.30 ms.  Throughput: 766.62 iter/sec.
Timings for 2400K FFT length (6 cores, 1 worker):  1.34 ms.  Throughput: 745.39 iter/sec.
Timings for 2400K FFT length (6 cores hyperthreaded, 1 worker):  1.41 ms.  Throughput: 707.19 iter/sec.
Timings for 2560K FFT length (6 cores, 1 worker):  1.44 ms.  Throughput: 692.29 iter/sec.
Timings for 2560K FFT length (6 cores hyperthreaded, 1 worker):  1.50 ms.  Throughput: 667.86 iter/sec.
Timings for 2688K FFT length (6 cores, 1 worker):  1.53 ms.  Throughput: 654.84 iter/sec.
Timings for 2688K FFT length (6 cores hyperthreaded, 1 worker):  1.60 ms.  Throughput: 624.67 iter/sec.
Timings for 2880K FFT length (6 cores, 1 worker):  1.59 ms.  Throughput: 627.20 iter/sec.
Timings for 2880K FFT length (6 cores hyperthreaded, 1 worker):  1.67 ms.  Throughput: 599.18 iter/sec.
Timings for 3072K FFT length (6 cores, 1 worker):  1.74 ms.  Throughput: 574.61 iter/sec.
Timings for 3072K FFT length (6 cores hyperthreaded, 1 worker):  1.79 ms.  Throughput: 558.26 iter/sec.
Timings for 3200K FFT length (6 cores, 1 worker):  1.86 ms.  Throughput: 538.18 iter/sec.
Timings for 3200K FFT length (6 cores hyperthreaded, 1 worker):  1.93 ms.  Throughput: 517.19 iter/sec.
Timings for 3360K FFT length (6 cores, 1 worker):  1.98 ms.  Throughput: 505.10 iter/sec.
Timings for 3360K FFT length (6 cores hyperthreaded, 1 worker):  2.06 ms.  Throughput: 485.65 iter/sec.
Timings for 3456K FFT length (6 cores, 1 worker):  1.95 ms.  Throughput: 512.51 iter/sec.
Timings for 3456K FFT length (6 cores hyperthreaded, 1 worker):  2.00 ms.  Throughput: 499.88 iter/sec.
Timings for 3584K FFT length (6 cores, 1 worker):  2.10 ms.  Throughput: 475.86 iter/sec.
Timings for 3584K FFT length (6 cores hyperthreaded, 1 worker):  2.19 ms.  Throughput: 455.68 iter/sec.
Timings for 3840K FFT length (6 cores, 1 worker):  2.18 ms.  Throughput: 458.94 iter/sec.
Timings for 3840K FFT length (6 cores hyperthreaded, 1 worker):  2.27 ms.  Throughput: 440.92 iter/sec.
Timings for 4096K FFT length (6 cores, 1 worker):  2.43 ms.  Throughput: 412.33 iter/sec.
Timings for 4096K FFT length (6 cores hyperthreaded, 1 worker):  2.50 ms.  Throughput: 399.58 iter/sec.
Timings for 4480K FFT length (6 cores, 1 worker):  2.75 ms.  Throughput: 363.60 iter/sec.
Timings for 4480K FFT length (6 cores hyperthreaded, 1 worker):  2.87 ms.  Throughput: 348.93 iter/sec.
Timings for 4608K FFT length (6 cores, 1 worker):  2.64 ms.  Throughput: 378.62 iter/sec.
Timings for 4608K FFT length (6 cores hyperthreaded, 1 worker):  2.77 ms.  Throughput: 360.94 iter/sec.
Timings for 4800K FFT length (6 cores, 1 worker):  2.84 ms.  Throughput: 352.34 iter/sec.
Timings for 4800K FFT length (6 cores hyperthreaded, 1 worker):  3.00 ms.  Throughput: 333.12 iter/sec.
Timings for 5120K FFT length (6 cores, 1 worker):  3.10 ms.  Throughput: 322.57 iter/sec.
Timings for 5120K FFT length (6 cores hyperthreaded, 1 worker):  3.26 ms.  Throughput: 306.88 iter/sec.
Timings for 5376K FFT length (6 cores, 1 worker):  3.25 ms.  Throughput: 307.40 iter/sec.
Timings for 5376K FFT length (6 cores hyperthreaded, 1 worker):  3.65 ms.  Throughput: 273.77 iter/sec.
Timings for 5760K FFT length (6 cores, 1 worker):  3.81 ms.  Throughput: 262.22 iter/sec.
Timings for 5760K FFT length (6 cores hyperthreaded, 1 worker):  4.43 ms.  Throughput: 225.98 iter/sec.
Timings for 6144K FFT length (6 cores, 1 worker):  4.09 ms.  Throughput: 244.22 iter/sec.
Timings for 6144K FFT length (6 cores hyperthreaded, 1 worker):  4.80 ms.  Throughput: 208.15 iter/sec.
Timings for 6400K FFT length (6 cores, 1 worker):  4.25 ms.  Throughput: 235.12 iter/sec.
Timings for 6400K FFT length (6 cores hyperthreaded, 1 worker):  4.77 ms.  Throughput: 209.50 iter/sec.
Timings for 6720K FFT length (6 cores, 1 worker):  4.43 ms.  Throughput: 225.80 iter/sec.
Timings for 6720K FFT length (6 cores hyperthreaded, 1 worker):  5.00 ms.  Throughput: 200.04 iter/sec.
Timings for 6912K FFT length (6 cores, 1 worker):  4.87 ms.  Throughput: 205.31 iter/sec.
Timings for 6912K FFT length (6 cores hyperthreaded, 1 worker):  5.58 ms.  Throughput: 179.07 iter/sec.
Timings for 7168K FFT length (6 cores, 1 worker):  4.76 ms.  Throughput: 210.20 iter/sec.
Timings for 7168K FFT length (6 cores hyperthreaded, 1 worker):  5.49 ms.  Throughput: 181.98 iter/sec.
Timings for 7680K FFT length (6 cores, 1 worker):  5.11 ms.  Throughput: 195.68 iter/sec.
Timings for 7680K FFT length (6 cores hyperthreaded, 1 worker):  5.71 ms.  Throughput: 175.23 iter/sec.
Timings for 8064K FFT length (6 cores, 1 worker):  5.55 ms.  Throughput: 180.31 iter/sec.
Timings for 8064K FFT length (6 cores hyperthreaded, 1 worker):  6.52 ms.  Throughput: 153.35 iter/sec.
Timings for 8192K FFT length (6 cores, 1 worker):  5.58 ms.  Throughput: 179.08 iter/sec.
Timings for 8192K FFT length (6 cores hyperthreaded, 1 worker):  6.87 ms.  Throughput: 145.56 iter/sec.
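
To compare the hyperthreaded and non-hyperthreaded rows above, one option is to parse the log and take the throughput ratio per FFT length. This is only a sketch: the regular expression is written against the line format shown in this log, and the names `parse_timings` and `ht_ratio` are made up for the example.

```python
import re

# Matches Prime95 throughput-benchmark lines of the form shown in the log, e.g.
# "Timings for 64K FFT length (6 cores, 1 worker):  0.05 ms.  Throughput: 21888.32 iter/sec."
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores( hyperthreaded)?, (\d+) workers?\):"
    r"\s+([\d.]+) ms\.\s+Throughput: ([\d.]+) iter/sec\."
)

def parse_timings(text):
    """Return {fft_len_k: {"plain": iter_per_sec, "ht": iter_per_sec}}."""
    results = {}
    for m in LINE_RE.finditer(text):
        fft_k = int(m.group(1))
        key = "ht" if m.group(3) else "plain"
        results.setdefault(fft_k, {})[key] = float(m.group(6))
    return results

def ht_ratio(results):
    """Hyperthreaded throughput divided by non-hyperthreaded throughput, per FFT length."""
    return {
        k: v["ht"] / v["plain"]
        for k, v in sorted(results.items())
        if "ht" in v and "plain" in v
    }

# Four rows copied from the log above.
sample = """\
Timings for 64K FFT length (6 cores, 1 worker):  0.05 ms.  Throughput: 21888.32 iter/sec.
Timings for 64K FFT length (6 cores hyperthreaded, 1 worker):  0.07 ms.  Throughput: 14558.48 iter/sec.
Timings for 8192K FFT length (6 cores, 1 worker):  5.58 ms.  Throughput: 179.08 iter/sec.
Timings for 8192K FFT length (6 cores hyperthreaded, 1 worker):  6.87 ms.  Throughput: 145.56 iter/sec.
"""

for fft_k, ratio in ht_ratio(parse_timings(sample)).items():
    print(f"{fft_k}K: hyperthreaded/plain = {ratio:.2f}")
```

A ratio below 1 means the hyperthreaded run was slower at that FFT length, which is the case for both sample rows here.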
Old 2020-12-19, 07:19   #833
scan80269
 
"Sam"
Jun 2019
California, USA

2×3×5 Posts
i9-7980XE Skylake-X stock with quad DDR4-3600 17-19-19-39 2R and ASRock X299 Taichi XE

Code:
[Fri Dec 18 22:48:03 2020]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
CPU speed: 3389.68 MHz, 18 hyperthreaded cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA, AVX512F
L1 cache size: 18x32 KB, L2 cache size: 18x1 MB, L3 cache size: 25344 KB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
 Machine#0 (total=64580260KB, Backend=Windows, hwlocVersion=2.2.0, ProcessName=prime95.exe)
  Package (total=64580260KB, CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=85, CPUModel="Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz", CPUStepping=4)
    L3 (size=25344KB, linesize=64, ways=11, Inclusive=0)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000003)
            PU#0 (cpuset: 0x00000001)
            PU#1 (cpuset: 0x00000002)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000000c)
            PU#2 (cpuset: 0x00000004)
            PU#3 (cpuset: 0x00000008)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000030)
            PU#4 (cpuset: 0x00000010)
            PU#5 (cpuset: 0x00000020)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x000000c0)
            PU#6 (cpuset: 0x00000040)
            PU#7 (cpuset: 0x00000080)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000300)
            PU#8 (cpuset: 0x00000100)
            PU#9 (cpuset: 0x00000200)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000c00)
            PU#10 (cpuset: 0x00000400)
            PU#11 (cpuset: 0x00000800)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00003000)
            PU#12 (cpuset: 0x00001000)
            PU#13 (cpuset: 0x00002000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000c000)
            PU#14 (cpuset: 0x00004000)
            PU#15 (cpuset: 0x00008000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00030000)
            PU#16 (cpuset: 0x00010000)
            PU#17 (cpuset: 0x00020000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x000c0000)
            PU#18 (cpuset: 0x00040000)
            PU#19 (cpuset: 0x00080000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00300000)
            PU#20 (cpuset: 0x00100000)
            PU#21 (cpuset: 0x00200000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00c00000)
            PU#22 (cpuset: 0x00400000)
            PU#23 (cpuset: 0x00800000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x03000000)
            PU#24 (cpuset: 0x01000000)
            PU#25 (cpuset: 0x02000000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0c000000)
            PU#26 (cpuset: 0x04000000)
            PU#27 (cpuset: 0x08000000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x30000000)
            PU#28 (cpuset: 0x10000000)
            PU#29 (cpuset: 0x20000000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0xc0000000)
            PU#30 (cpuset: 0x40000000)
            PU#31 (cpuset: 0x80000000)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000003,0x0)
            PU#32 (cpuset: 0x00000001,0x0)
            PU#33 (cpuset: 0x00000002,0x0)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000000c,0x0)
            PU#34 (cpuset: 0x00000004,0x0)
            PU#35 (cpuset: 0x00000008,0x0)
Prime95 64-bit version 30.3, RdtscTiming=1
Timings for 2048K FFT length (18 cores, 1 worker):  0.56 ms.  Throughput: 1774.64 iter/sec.
Timings for 2100K FFT length (18 cores, 1 worker):  0.54 ms.  Throughput: 1847.59 iter/sec.
Timings for 2160K FFT length (18 cores, 1 worker):  0.76 ms.  Throughput: 1311.03 iter/sec.
Timings for 2240K FFT length (18 cores, 1 worker):  0.89 ms.  Throughput: 1122.31 iter/sec.
Timings for 2304K FFT length (18 cores, 1 worker):  0.72 ms.  Throughput: 1383.84 iter/sec.
Timings for 2400K FFT length (18 cores, 1 worker):  0.83 ms.  Throughput: 1207.84 iter/sec.
Timings for 2520K FFT length (18 cores, 1 worker):  0.91 ms.  Throughput: 1097.36 iter/sec.
Timings for 2560K FFT length (18 cores, 1 worker):  0.94 ms.  Throughput: 1061.14 iter/sec.
Timings for 2592K FFT length (18 cores, 1 worker):  0.70 ms.  Throughput: 1429.10 iter/sec.
Timings for 2688K FFT length (18 cores, 1 worker):  0.96 ms.  Throughput: 1046.92 iter/sec.
Timings for 2880K FFT length (18 cores, 1 worker):  0.71 ms.  Throughput: 1408.59 iter/sec.
Timings for 2940K FFT length (18 cores, 1 worker):  0.72 ms.  Throughput: 1382.41 iter/sec.
Timings for 3000K FFT length (18 cores, 1 worker):  1.20 ms.  Throughput: 832.02 iter/sec.
Timings for 3072K FFT length (18 cores, 1 worker):  0.73 ms.  Throughput: 1365.16 iter/sec.
Timings for 3136K FFT length (18 cores, 1 worker):  0.95 ms.  Throughput: 1053.21 iter/sec.
Timings for 3200K FFT length (18 cores, 1 worker):  1.36 ms.  Throughput: 737.54 iter/sec.
Timings for 3360K FFT length (18 cores, 1 worker):  0.84 ms.  Throughput: 1183.59 iter/sec.
Timings for 3456K FFT length (18 cores, 1 worker):  0.88 ms.  Throughput: 1139.78 iter/sec.
Timings for 3600K FFT length (18 cores, 1 worker):  1.43 ms.  Throughput: 697.91 iter/sec.
[Fri Dec 18 22:53:13 2020]
Timings for 3840K FFT length (18 cores, 1 worker):  1.28 ms.  Throughput: 778.90 iter/sec.
Timings for 3920K FFT length (18 cores, 1 worker):  0.97 ms.  Throughput: 1030.34 iter/sec.
Timings for 4032K FFT length (18 cores, 1 worker):  0.93 ms.  Throughput: 1079.83 iter/sec.
Timings for 4200K FFT length (18 cores, 1 worker):  1.49 ms.  Throughput: 671.73 iter/sec.
Timings for 4320K FFT length (18 cores, 1 worker):  1.65 ms.  Throughput: 606.82 iter/sec.
Timings for 4480K FFT length (18 cores, 1 worker):  1.11 ms.  Throughput: 897.59 iter/sec.
Timings for 4608K FFT length (18 cores, 1 worker):  1.17 ms.  Throughput: 854.38 iter/sec.
Timings for 4704K FFT length (18 cores, 1 worker):  1.20 ms.  Throughput: 834.60 iter/sec.
Timings for 4800K FFT length (18 cores, 1 worker):  2.48 ms.  Throughput: 403.36 iter/sec.
Timings for 5040K FFT length (18 cores, 1 worker):  1.19 ms.  Throughput: 837.02 iter/sec.
Timings for 5120K FFT length (18 cores, 1 worker):  1.29 ms.  Throughput: 773.71 iter/sec.
Timings for 5184K FFT length (18 cores, 1 worker):  1.41 ms.  Throughput: 708.51 iter/sec.
Timings for 5376K FFT length (18 cores, 1 worker):  1.37 ms.  Throughput: 731.34 iter/sec.
Timings for 5760K FFT length (18 cores, 1 worker):  2.54 ms.  Throughput: 393.57 iter/sec.
Timings for 6048K FFT length (18 cores, 1 worker):  1.68 ms.  Throughput: 594.96 iter/sec.
Timings for 6144K FFT length (18 cores, 1 worker):  1.74 ms.  Throughput: 574.36 iter/sec.
Timings for 6272K FFT length (18 cores, 1 worker):  1.79 ms.  Throughput: 560.22 iter/sec.
Timings for 6400K FFT length (18 cores, 1 worker):  1.96 ms.  Throughput: 511.06 iter/sec.
Timings for 6720K FFT length (18 cores, 1 worker):  2.13 ms.  Throughput: 468.99 iter/sec.
Timings for 7056K FFT length (18 cores, 1 worker):  2.28 ms.  Throughput: 439.02 iter/sec.
[Fri Dec 18 22:58:28 2020]
Timings for 7168K FFT length (18 cores, 1 worker):  2.31 ms.  Throughput: 433.28 iter/sec.
Timings for 7200K FFT length (18 cores, 1 worker):  2.26 ms.  Throughput: 441.97 iter/sec.
Timings for 7680K FFT length (18 cores, 1 worker):  2.53 ms.  Throughput: 394.55 iter/sec.
Timings for 8064K FFT length (18 cores, 1 worker):  2.94 ms.  Throughput: 339.67 iter/sec.
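
For reading the hwloc topology listing above, the cpuset masks can be decoded into PU numbers. A minimal sketch, assuming hwloc prints the mask as comma-separated 32-bit hex words with the most significant word first (this is consistent with PU#32 appearing as `0x00000001,0x0` in the listing); `cpuset_to_pus` is a hypothetical helper name.

```python
def cpuset_to_pus(cpuset):
    """Convert a cpuset string such as '0x00000003,0x0' to a sorted list of PU indices."""
    mask = 0
    # Assumption: each comma-separated chunk is one 32-bit word, most significant first.
    for word in cpuset.split(","):
        mask = (mask << 32) | int(word, 16)
    # Collect the positions of all set bits, lowest first.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]

print(cpuset_to_pus("0x00000c00"))      # [10, 11]
print(cpuset_to_pus("0x00000003,0x0"))  # [32, 33]
```

Each `Core` line in the listing decodes to the two PUs printed beneath it, i.e. the two hyperthreads of that core.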
Old 2020-12-28, 03:18   #834
moebius
 
 
Jul 2009
Germany

547 Posts

FFT timings for an AMD Ryzen 7 3700X with hyperthreading enabled. Win 10 Pro, X470 board, 2x8 GB single-rank DDR4-3000 CL16 @ 3200 MHz. Unfortunately, the CPU clocks up and down with the stock cooler.
Code:
 AMD Ryzen 7 3700X 8-Core Processor             
CPU speed: 3600.02 MHz, 8 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 8x32 KB, L2 cache size: 8x512 KB, L3 cache size: 2x16 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
 Machine#0 (total=13605644KB, Backend=Windows, hwlocVersion=2.2.0, ProcessName=prime95.exe)
  Package (total=13605644KB, CPUVendor=AuthenticAMD, CPUFamilyNumber=23, CPUModelNumber=113, CPUModel="AMD Ryzen 7 3700X 8-Core Processor             ", CPUStepping=0)
    L3 (size=16384KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000003)
            PU#0 (cpuset: 0x00000001)
            PU#1 (cpuset: 0x00000002)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000000c)
            PU#2 (cpuset: 0x00000004)
            PU#3 (cpuset: 0x00000008)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000030)
            PU#4 (cpuset: 0x00000010)
            PU#5 (cpuset: 0x00000020)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x000000c0)
            PU#6 (cpuset: 0x00000040)
            PU#7 (cpuset: 0x00000080)
    L3 (size=16384KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000300)
            PU#8 (cpuset: 0x00000100)
            PU#9 (cpuset: 0x00000200)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000c00)
            PU#10 (cpuset: 0x00000400)
            PU#11 (cpuset: 0x00000800)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00003000)
            PU#12 (cpuset: 0x00001000)
            PU#13 (cpuset: 0x00002000)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000c000)
            PU#14 (cpuset: 0x00004000)
            PU#15 (cpuset: 0x00008000)
Prime95 64-bit version 30.3, RdtscTiming=1
Timing FFTs using 8 threads on 8 cores.
Best time for 2048K FFT length: 1.748 ms., avg: 1.797 ms.
Best time for 2240K FFT length: 1.971 ms., avg: 1.992 ms.
Best time for 2304K FFT length: 1.974 ms., avg: 2.077 ms.
Best time for 2400K FFT length: 2.058 ms., avg: 2.136 ms.
Best time for 2560K FFT length: 2.206 ms., avg: 2.272 ms.
Best time for 2688K FFT length: 2.393 ms., avg: 2.411 ms.
Best time for 2800K FFT length: 2.422 ms., avg: 2.581 ms.
Best time for 2880K FFT length: 2.497 ms., avg: 2.517 ms.
Best time for 3072K FFT length: 2.572 ms., avg: 2.629 ms.
Best time for 3200K FFT length: 2.762 ms., avg: 2.906 ms.
Best time for 3360K FFT length: 3.019 ms., avg: 3.106 ms.
Best time for 3584K FFT length: 3.134 ms., avg: 3.192 ms.
Best time for 3840K FFT length: 3.436 ms., avg: 3.611 ms.
Best time for 4096K FFT length: 3.683 ms., avg: 3.784 ms.
Best time for 4480K FFT length: 4.356 ms., avg: 4.521 ms.
Best time for 4608K FFT length: 4.552 ms., avg: 4.764 ms.
Best time for 4800K FFT length: 4.967 ms., avg: 5.041 ms.
Best time for 5120K FFT length: 5.135 ms., avg: 5.389 ms.
Best time for 5376K FFT length: 5.867 ms., avg: 5.988 ms.
Best time for 5600K FFT length: 6.350 ms., avg: 6.570 ms.
Best time for 5760K FFT length: 6.872 ms., avg: 6.979 ms.
Best time for 6144K FFT length: 7.081 ms., avg: 7.357 ms.
Best time for 6400K FFT length: 7.724 ms., avg: 7.874 ms.
Best time for 6720K FFT length: 8.106 ms., avg: 8.591 ms.
Best time for 7168K FFT length: 8.676 ms., avg: 9.031 ms.
Best time for 7680K FFT length: 9.668 ms., avg: 9.909 ms.
Best time for 8000K FFT length: 10.079 ms., avg: 10.279 ms.
Best time for 8064K FFT length: 10.226 ms., avg: 10.401 ms.
Best time for 8192K FFT length: 10.477 ms., avg: 10.644 ms.
Timing FFTs using 16 threads on 8 cores.
Best time for 2048K FFT length: 1.778 ms., avg: 1.802 ms.
Best time for 2240K FFT length: 1.643 ms., avg: 1.679 ms.
Best time for 2304K FFT length: 1.645 ms., avg: 1.674 ms.
Best time for 2400K FFT length: 1.648 ms., avg: 1.693 ms.
Best time for 2560K FFT length: 1.982 ms., avg: 2.055 ms.
Best time for 2688K FFT length: 2.128 ms., avg: 2.248 ms.
Best time for 2800K FFT length: 2.354 ms., avg: 2.422 ms.
Best time for 2880K FFT length: 2.170 ms., avg: 2.402 ms.
Best time for 3072K FFT length: 1.978 ms., avg: 2.114 ms.
Best time for 3200K FFT length: 1.926 ms., avg: 1.994 ms.
Best time for 3360K FFT length: 2.056 ms., avg: 2.117 ms.
Best time for 3584K FFT length: 2.163 ms., avg: 2.384 ms.
Best time for 3840K FFT length: 3.293 ms., avg: 3.457 ms.
Best time for 4096K FFT length: 3.616 ms., avg: 3.752 ms.
Best time for 4480K FFT length: 4.620 ms., avg: 4.736 ms.
Best time for 4608K FFT length: 4.527 ms., avg: 4.663 ms.
Best time for 4800K FFT length: 4.781 ms., avg: 4.920 ms.
Best time for 5120K FFT length: 5.354 ms., avg: 5.992 ms.
Best time for 5376K FFT length: 6.026 ms., avg: 6.234 ms.
Best time for 5600K FFT length: 6.640 ms., avg: 6.830 ms.
Best time for 5760K FFT length: 7.244 ms., avg: 7.376 ms.
Best time for 6144K FFT length: 7.134 ms., avg: 7.384 ms.
Best time for 6400K FFT length: 7.724 ms., avg: 8.102 ms.
Best time for 6720K FFT length: 8.659 ms., avg: 8.936 ms.
Best time for 7168K FFT length: 9.029 ms., avg: 9.647 ms.
Best time for 7680K FFT length: 10.654 ms., avg: 10.879 ms.
Best time for 8000K FFT length: 10.372 ms., avg: 10.710 ms.
Best time for 8064K FFT length: 10.859 ms., avg: 11.032 ms.
Best time for 8192K FFT length: 11.111 ms., avg: 11.335 ms.
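
The two sections above (8 threads and 16 threads on 8 cores) can be compared per FFT length by dividing the best times. A minimal sketch, assuming the line formats shown in this log; `parse_best_times` and `speedup` are names invented for the example.

```python
import re

# Line formats as they appear in the log above.
SECTION_RE = re.compile(r"Timing FFTs using (\d+) threads on (\d+) cores\.")
BEST_RE = re.compile(r"Best time for (\d+)K FFT length: ([\d.]+) ms\., avg: ([\d.]+) ms\.")

def parse_best_times(text):
    """Return {threads: {fft_len_k: best_ms}} from 'Timing FFTs' sections."""
    sections = {}
    current = None
    for line in text.splitlines():
        header = SECTION_RE.search(line)
        if header:
            current = sections.setdefault(int(header.group(1)), {})
            continue
        m = BEST_RE.search(line)
        if m and current is not None:
            current[int(m.group(1))] = float(m.group(2))
    return sections

def speedup(sections, base_threads, other_threads):
    """Best time with base_threads divided by best time with other_threads (>1 means other is faster)."""
    base, other = sections[base_threads], sections[other_threads]
    return {k: base[k] / other[k] for k in sorted(base) if k in other}

# A few rows copied from the log above.
sample = """\
Timing FFTs using 8 threads on 8 cores.
Best time for 2048K FFT length: 1.748 ms., avg: 1.797 ms.
Best time for 3200K FFT length: 2.762 ms., avg: 2.906 ms.
Timing FFTs using 16 threads on 8 cores.
Best time for 2048K FFT length: 1.778 ms., avg: 1.802 ms.
Best time for 3200K FFT length: 1.926 ms., avg: 1.994 ms.
"""

for fft_k, s in speedup(parse_best_times(sample), 8, 16).items():
    print(f"{fft_k}K: 8-thread time / 16-thread time = {s:.2f}")
```

Since the CPU clock was fluctuating during this run, small differences between the two sections are probably within the noise.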

Last fiddled with by moebius on 2020-12-28 at 03:47
Old 2020-12-29, 03:28   #835
moebius
 
 
Jul 2009
Germany

547₁₀ Posts

With the MSI BIOS update, version 7B79vHB1 (beta version), dated 2020-12-20:
- AMD AGESA ComboAm4v2PI 1.1.0.0 Patch D
- Support Re-size BAR function to enhance GPU performance. Full memory access to the VRAM of an Nvidia GPU can be activated.

The values are a little better than before.

Throughput benchmark, best FFT implementation:
Code:
AMD Ryzen 7 3700X 8-Core Processor             
CPU speed: 4309.08 MHz, 8 hyperthreaded cores

FFTlen=2048K all-complex, Type=3, Arch=4, Pass1=512, Pass2=4096, clm=1 (8 cores hyperthreaded, 1 worker):  0.99 ms.  Throughput: 1010.46 iter/sec.
FFTlen=2304K all-complex, Type=3, Arch=4, Pass1=384, Pass2=6144, clm=2 (8 cores hyperthreaded, 1 worker):  1.08 ms.  Throughput: 923.27 iter/sec.
FFTlen=2400K all-complex, Type=3, Arch=4, Pass1=384, Pass2=6400, clm=2 (8 cores hyperthreaded, 1 worker):  1.17 ms.  Throughput: 852.04 iter/sec.
FFTlen=2560K all-complex, Type=3, Arch=4, Pass1=640, Pass2=4096, clm=2 (8 cores hyperthreaded, 1 worker):  1.23 ms.  Throughput: 811.99 iter/sec.
FFTlen=2880K all-complex, Type=3, Arch=4, Pass1=640, Pass2=4608, clm=2 (8 cores hyperthreaded, 1 worker):  1.42 ms.  Throughput: 702.34 iter/sec.
FFTlen=3072K all-complex, Type=3, Arch=4, Pass1=512, Pass2=6144, clm=2 (8 cores hyperthreaded, 1 worker):  1.47 ms.  Throughput: 679.50 iter/sec.
FFTlen=3200K all-complex, Type=3, Arch=4, Pass1=640, Pass2=5120, clm=2 (8 cores hyperthreaded, 1 worker):  1.64 ms.  Throughput: 610.42 iter/sec.
FFTlen=3456K all-complex, Type=3, Arch=4, Pass1=1536, Pass2=2304, clm=1 (8 cores, 1 worker):  1.98 ms.  Throughput: 504.67 iter/sec.
FFTlen=3840K all-complex, Type=3, Arch=4, Pass1=640, Pass2=6144, clm=4 (8 cores, 1 worker):  2.38 ms.  Throughput: 419.45 iter/sec.
FFTlen=4000K all-complex, Type=3, Arch=4, Pass1=640, Pass2=6400, clm=2 (8 cores, 1 worker):  2.68 ms.  Throughput: 373.08 iter/sec.
FFTlen=4608K all-complex, Type=3, Arch=4, Pass1=768, Pass2=6144, clm=2 (8 cores, 1 worker):  3.93 ms.  Throughput: 254.30 iter/sec.
FFTlen=4800K all-complex, Type=3, Arch=4, Pass1=768, Pass2=6400, clm=2 (8 cores, 1 worker):  4.34 ms.  Throughput: 230.33 iter/sec.
FFTlen=5120K all-complex, Type=3, Arch=4, Pass1=1280, Pass2=4096, clm=2 (8 cores, 1 worker):  4.97 ms.  Throughput: 201.36 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=896, Pass2=6400, clm=2 (8 cores, 1 worker):  6.12 ms.  Throughput: 163.33 iter/sec.
FFTlen=5760K all-complex, Type=3, Arch=4, Pass1=640, Pass2=9216, clm=2 (8 cores, 1 worker):  6.16 ms.  Throughput: 162.21 iter/sec.
FFTlen=6144K all-complex, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=2 (8 cores, 1 worker):  6.90 ms.  Throughput: 144.89 iter/sec.
FFTlen=6400K all-complex, Type=3, Arch=4, Pass1=1280, Pass2=5120, clm=2 (8 cores, 1 worker):  7.42 ms.  Throughput: 134.75 iter/sec.
FFTlen=6912K all-complex, Type=3, Arch=4, Pass1=768, Pass2=9216, clm=2 (8 cores, 1 worker):  8.07 ms.  Throughput: 123.95 iter/sec.
FFTlen=7680K all-complex, Type=3, Arch=4, Pass1=1024, Pass2=7680, clm=2 (8 cores, 1 worker):  9.48 ms.  Throughput: 105.46 iter/sec.
FFTlen=8000K all-complex, Type=3, Arch=4, Pass1=1280, Pass2=6400, clm=2 (8 cores, 1 worker): 10.06 ms.  Throughput: 99.42 iter/sec.
FFTlen=8192K all-complex, Type=3, Arch=4, Pass1=2048, Pass2=4096, clm=2 (8 cores, 1 worker): 10.32 ms.  Throughput: 96.92 iter/sec.
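
In these single-worker rows the throughput figure is consistent with simply being 1000 divided by the per-iteration time in milliseconds, up to the rounding of the printed time. A small sketch for sanity-checking a row; the helper names and the assumption that all workers run at equal speed are mine, not something taken from Prime95's documentation.

```python
def implied_ms(throughput_iter_per_sec, workers=1):
    """Per-iteration time in ms implied by a throughput figure.

    Assumption: every worker runs at the same speed, so total throughput
    is workers * (1000 / ms).
    """
    return 1000.0 * workers / throughput_iter_per_sec

def is_consistent(ms, throughput_iter_per_sec, workers=1, tolerance_ms=0.005):
    """True if the printed time matches the throughput within rounding.

    The time is printed with two decimals, so the default tolerance is half
    of the last printed digit (plus a tiny float-comparison margin).
    """
    return abs(implied_ms(throughput_iter_per_sec, workers) - ms) <= tolerance_ms + 1e-9

# Rows taken from the benchmark above: (FFT length in K, ms, iter/sec).
rows = [(2048, 0.99, 1010.46), (3456, 1.98, 504.67), (8192, 10.32, 96.92)]
for fft_k, ms, throughput in rows:
    print(f"{fft_k}K: implied {implied_ms(throughput):.4f} ms, consistent={is_consistent(ms, throughput)}")
```

A row that fails this check most likely has a transcription error in one of the two numbers.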

Last fiddled with by moebius on 2020-12-29 at 03:33
Old 2021-01-13, 08:38   #836
dcheuk
Jan 2019
Pittsburgh, PA

247₁₀ Posts
Default

Ryzen 9 3950X, voltage offset -0.0625 V, 240mm cooling
Asus Strix X570-I, BIOS v2802; I'm upgrading to v3202 and will post updated results later
G.Skill 2x16GB 3600MHz CL16-16-16-34; see the attached screenshot for detailed timings and CPU temperature

Note that some numbers could be off due to active/background programs, and 1 instance of mfaktc is running at all times.

Code:
FFTlen=3072K, Type=3, Arch=4, Pass1=256, Pass2=12288, clm=4 (16 cores, 1 worker):  1.35 ms.  Throughput: 739.17 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=256, Pass2=12288, clm=2 (16 cores, 1 worker):  1.52 ms.  Throughput: 656.06 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=256, Pass2=12288, clm=1 (16 cores, 1 worker):  2.10 ms.  Throughput: 475.76 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=384, Pass2=8192, clm=4 (16 cores, 1 worker):  1.40 ms.  Throughput: 712.17 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=384, Pass2=8192, clm=2 (16 cores, 1 worker):  1.40 ms.  Throughput: 712.07 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=384, Pass2=8192, clm=1 (16 cores, 1 worker):  1.91 ms.  Throughput: 522.51 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=512, Pass2=6144, clm=4 (16 cores, 1 worker):  1.39 ms.  Throughput: 720.02 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=512, Pass2=6144, clm=2 (16 cores, 1 worker):  1.34 ms.  Throughput: 745.85 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=512, Pass2=6144, clm=1 (16 cores, 1 worker):  1.41 ms.  Throughput: 707.44 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=768, Pass2=4096, clm=4 (16 cores, 1 worker):  1.38 ms.  Throughput: 724.46 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=768, Pass2=4096, clm=2 (16 cores, 1 worker):  1.32 ms.  Throughput: 755.88 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=768, Pass2=4096, clm=1 (16 cores, 1 worker):  1.94 ms.  Throughput: 516.73 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=1024, Pass2=3072, clm=4 (16 cores, 1 worker):  1.42 ms.  Throughput: 703.73 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=1024, Pass2=3072, clm=2 (16 cores, 1 worker):  1.34 ms.  Throughput: 743.64 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=1024, Pass2=3072, clm=1 (16 cores, 1 worker):  1.31 ms.  Throughput: 765.73 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=1536, Pass2=2048, clm=4 (16 cores, 1 worker):  1.56 ms.  Throughput: 640.96 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=1536, Pass2=2048, clm=2 (16 cores, 1 worker):  1.40 ms.  Throughput: 711.88 iter/sec.
FFTlen=3072K, Type=3, Arch=4, Pass1=1536, Pass2=2048, clm=1 (16 cores, 1 worker):  1.34 ms.  Throughput: 745.46 iter/sec.
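
In every line of the block above, Pass1 × Pass2 equals the FFT length (for example 256 × 12288 = 3072 × 1024). Here is a rough Python sketch that parses such lines, checks that relation, and picks the fastest configuration. The regex, the dictionary keys and the function name `parse` are my own choices, and only three of the lines above are pasted in as a sample.

```python
import re

FIELDS = re.compile(
    r"FFTlen=(\d+)K.*?Pass1=(\d+), Pass2=(\d+), clm=(\d+).*?Throughput:\s*([\d.]+)"
)

def parse(line):
    """Turn one benchmark line into a dict, or None if it does not match."""
    m = FIELDS.search(line)
    if m is None:
        return None
    fft_k, pass1, pass2, clm = (int(g) for g in m.groups()[:4])
    return {"fft_k": fft_k, "pass1": pass1, "pass2": pass2,
            "clm": clm, "ips": float(m.group(5))}

lines = [
    "FFTlen=3072K, Type=3, Arch=4, Pass1=256, Pass2=12288, clm=4 (16 cores, 1 worker):  1.35 ms.  Throughput: 739.17 iter/sec.",
    "FFTlen=3072K, Type=3, Arch=4, Pass1=768, Pass2=4096, clm=2 (16 cores, 1 worker):  1.32 ms.  Throughput: 755.88 iter/sec.",
    "FFTlen=3072K, Type=3, Arch=4, Pass1=1024, Pass2=3072, clm=1 (16 cores, 1 worker):  1.31 ms.  Throughput: 765.73 iter/sec.",
]
rows = [r for r in map(parse, lines) if r is not None]

# Sanity check: the two pass sizes multiply out to the FFT length in elements.
for r in rows:
    assert r["pass1"] * r["pass2"] == r["fft_k"] * 1024

best = max(rows, key=lambda r: r["ips"])
print(best)
```

Run over the full 3072K block, the same selection gives Pass1=1024, Pass2=3072, clm=1 at 765.73 iter/sec.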
Code:
FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=4 (16 cores, 1 worker):  2.30 ms.  Throughput: 435.03 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=2 (16 cores, 1 worker):  2.23 ms.  Throughput: 448.33 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=1 (16 cores, 1 worker):  2.51 ms.  Throughput: 398.43 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=4 (16 cores, 1 worker):  2.35 ms.  Throughput: 425.98 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=2 (16 cores, 1 worker):  2.31 ms.  Throughput: 433.15 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=1 (16 cores, 1 worker):  2.26 ms.  Throughput: 443.45 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=4 (16 cores, 1 worker):  2.55 ms.  Throughput: 392.54 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=2 (16 cores, 1 worker):  2.25 ms.  Throughput: 444.32 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=1 (16 cores, 1 worker):  2.12 ms.  Throughput: 471.97 iter/sec.

FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=4 (16 cores, 1 worker):  2.36 ms.  Throughput: 423.58 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=2 (16 cores, 1 worker):  2.34 ms.  Throughput: 427.96 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=1 (16 cores, 1 worker):  2.46 ms.  Throughput: 406.60 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=4 (16 cores, 1 worker):  2.34 ms.  Throughput: 427.43 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=2 (16 cores, 1 worker):  2.28 ms.  Throughput: 439.00 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=1 (16 cores, 1 worker):  2.24 ms.  Throughput: 446.91 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=4 (16 cores, 1 worker):  2.58 ms.  Throughput: 387.83 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=2 (16 cores, 1 worker):  2.37 ms.  Throughput: 422.35 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=1 (16 cores, 1 worker):  2.15 ms.  Throughput: 464.47 iter/sec.

FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=4 (16 cores, 1 worker):  2.30 ms.  Throughput: 434.12 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=2 (16 cores, 1 worker):  2.26 ms.  Throughput: 442.74 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=448, Pass2=12288, clm=1 (16 cores, 1 worker):  2.50 ms.  Throughput: 400.21 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=4 (16 cores, 1 worker):  2.37 ms.  Throughput: 421.63 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=2 (16 cores, 1 worker):  2.35 ms.  Throughput: 426.03 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=896, Pass2=6144, clm=1 (16 cores, 1 worker):  2.32 ms.  Throughput: 431.93 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=4 (16 cores, 1 worker):  2.56 ms.  Throughput: 390.73 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=2 (16 cores, 1 worker):  2.28 ms.  Throughput: 437.85 iter/sec.
FFTlen=5376K, Type=3, Arch=4, Pass1=1792, Pass2=3072, clm=1 (16 cores, 1 worker):  2.14 ms.  Throughput: 467.36 iter/sec.
Code:
FFTlen=5600K, Type=3, Arch=4, Pass1=448, Pass2=12800, clm=4 (16 cores, 1 worker):  2.58 ms.  Throughput: 387.51 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=448, Pass2=12800, clm=2 (16 cores, 1 worker):  2.53 ms.  Throughput: 395.56 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=448, Pass2=12800, clm=1 (16 cores, 1 worker):  2.75 ms.  Throughput: 363.71 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=896, Pass2=6400, clm=4 (16 cores, 1 worker):  2.47 ms.  Throughput: 405.61 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=896, Pass2=6400, clm=2 (16 cores, 1 worker):  2.37 ms.  Throughput: 422.47 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=896, Pass2=6400, clm=1 (16 cores, 1 worker):  2.31 ms.  Throughput: 432.25 iter/sec.

FFTlen=5600K, Type=3, Arch=4, Pass1=448, Pass2=12800, clm=4 (16 cores, 1 worker):  2.97 ms.  Throughput: 336.80 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=448, Pass2=12800, clm=2 (16 cores, 1 worker):  2.78 ms.  Throughput: 359.98 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=448, Pass2=12800, clm=1 (16 cores, 1 worker):  2.91 ms.  Throughput: 343.56 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=896, Pass2=6400, clm=4 (16 cores, 1 worker):  2.67 ms.  Throughput: 375.05 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=896, Pass2=6400, clm=2 (16 cores, 1 worker):  2.54 ms.  Throughput: 393.93 iter/sec.
FFTlen=5600K, Type=3, Arch=4, Pass1=896, Pass2=6400, clm=1 (16 cores, 1 worker):  2.44 ms.  Throughput: 410.66 iter/sec.
Code:
FFTlen=5760K, Type=3, Arch=4, Pass1=384, Pass2=15360, clm=4 (16 cores, 1 worker):  2.65 ms.  Throughput: 377.55 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=384, Pass2=15360, clm=2 (16 cores, 1 worker):  2.70 ms.  Throughput: 369.88 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=384, Pass2=15360, clm=1 (16 cores, 1 worker):  3.57 ms.  Throughput: 280.47 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=640, Pass2=9216, clm=4 (16 cores, 1 worker):  3.58 ms.  Throughput: 279.33 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=640, Pass2=9216, clm=2 (16 cores, 1 worker):  3.43 ms.  Throughput: 291.74 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=640, Pass2=9216, clm=1 (16 cores, 1 worker):  3.20 ms.  Throughput: 312.95 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=768, Pass2=7680, clm=4 (16 cores, 1 worker):  2.49 ms.  Throughput: 402.06 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=768, Pass2=7680, clm=2 (16 cores, 1 worker):  2.38 ms.  Throughput: 419.31 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=768, Pass2=7680, clm=1 (16 cores, 1 worker):  2.35 ms.  Throughput: 425.18 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1280, Pass2=4608, clm=4 (16 cores, 1 worker):  2.68 ms.  Throughput: 373.68 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1280, Pass2=4608, clm=2 (16 cores, 1 worker):  2.46 ms.  Throughput: 407.07 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1280, Pass2=4608, clm=1 (16 cores, 1 worker):  2.36 ms.  Throughput: 424.38 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1536, Pass2=3840, clm=4 (16 cores, 1 worker):  2.77 ms.  Throughput: 361.27 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1536, Pass2=3840, clm=2 (16 cores, 1 worker):  2.51 ms.  Throughput: 397.72 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1536, Pass2=3840, clm=1 (16 cores, 1 worker):  2.34 ms.  Throughput: 427.92 iter/sec.

FFTlen=5760K, Type=3, Arch=4, Pass1=384, Pass2=15360, clm=4 (16 cores, 1 worker):  2.81 ms.  Throughput: 356.15 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=384, Pass2=15360, clm=2 (16 cores, 1 worker):  2.78 ms.  Throughput: 359.45 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=384, Pass2=15360, clm=1 (16 cores, 1 worker):  3.28 ms.  Throughput: 304.68 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=640, Pass2=9216, clm=4 (16 cores, 1 worker):  3.70 ms.  Throughput: 270.63 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=640, Pass2=9216, clm=2 (16 cores, 1 worker):  3.65 ms.  Throughput: 274.20 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=640, Pass2=9216, clm=1 (16 cores, 1 worker):  3.41 ms.  Throughput: 293.49 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=768, Pass2=7680, clm=4 (16 cores, 1 worker):  2.72 ms.  Throughput: 368.14 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=768, Pass2=7680, clm=2 (16 cores, 1 worker):  2.61 ms.  Throughput: 383.85 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=768, Pass2=7680, clm=1 (16 cores, 1 worker):  2.51 ms.  Throughput: 398.08 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1280, Pass2=4608, clm=4 (16 cores, 1 worker):  2.87 ms.  Throughput: 349.00 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1280, Pass2=4608, clm=2 (16 cores, 1 worker):  2.69 ms.  Throughput: 371.12 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1280, Pass2=4608, clm=1 (16 cores, 1 worker):  2.53 ms.  Throughput: 395.96 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1536, Pass2=3840, clm=4 (16 cores, 1 worker):  3.12 ms.  Throughput: 320.17 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1536, Pass2=3840, clm=2 (16 cores, 1 worker):  2.70 ms.  Throughput: 370.55 iter/sec.
FFTlen=5760K, Type=3, Arch=4, Pass1=1536, Pass2=3840, clm=1 (16 cores, 1 worker):  2.53 ms.  Throughput: 395.42 iter/sec.
Code:
FFTlen=6144K, Type=3, Arch=4, Pass1=384, Pass2=16384, clm=4 (16 cores, 1 worker):  2.98 ms.  Throughput: 335.10 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=384, Pass2=16384, clm=2 (16 cores, 1 worker):  2.94 ms.  Throughput: 340.15 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=384, Pass2=16384, clm=1 (16 cores, 1 worker):  3.77 ms.  Throughput: 265.36 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=512, Pass2=12288, clm=4 (16 cores, 1 worker):  2.65 ms.  Throughput: 377.33 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=512, Pass2=12288, clm=2 (16 cores, 1 worker):  2.60 ms.  Throughput: 384.64 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=512, Pass2=12288, clm=1 (16 cores, 1 worker):  2.68 ms.  Throughput: 372.74 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=768, Pass2=8192, clm=4 (16 cores, 1 worker):  2.68 ms.  Throughput: 373.81 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=768, Pass2=8192, clm=2 (16 cores, 1 worker):  2.59 ms.  Throughput: 385.89 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=768, Pass2=8192, clm=1 (16 cores, 1 worker):  2.73 ms.  Throughput: 366.74 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1024, Pass2=6144, clm=4 (16 cores, 1 worker):  2.85 ms.  Throughput: 351.49 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1024, Pass2=6144, clm=2 (16 cores, 1 worker):  2.90 ms.  Throughput: 344.60 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1024, Pass2=6144, clm=1 (16 cores, 1 worker):  2.86 ms.  Throughput: 349.73 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=4 (16 cores, 1 worker):  2.80 ms.  Throughput: 357.13 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=2 (16 cores, 1 worker):  2.63 ms.  Throughput: 380.13 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=1 (16 cores, 1 worker):  2.44 ms.  Throughput: 409.28 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=2048, Pass2=3072, clm=4 (16 cores, 1 worker):  3.16 ms.  Throughput: 316.27 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=2048, Pass2=3072, clm=2 (16 cores, 1 worker):  2.76 ms.  Throughput: 361.81 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=2048, Pass2=3072, clm=1 (16 cores, 1 worker):  2.67 ms.  Throughput: 375.03 iter/sec.

FFTlen=6144K, Type=3, Arch=4, Pass1=384, Pass2=16384, clm=4 (16 cores, 1 worker):  3.09 ms.  Throughput: 323.89 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=384, Pass2=16384, clm=2 (16 cores, 1 worker):  3.17 ms.  Throughput: 315.16 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=384, Pass2=16384, clm=1 (16 cores, 1 worker):  3.71 ms.  Throughput: 269.64 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=512, Pass2=12288, clm=4 (16 cores, 1 worker):  2.94 ms.  Throughput: 340.53 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=512, Pass2=12288, clm=2 (16 cores, 1 worker):  2.76 ms.  Throughput: 362.49 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=512, Pass2=12288, clm=1 (16 cores, 1 worker):  2.75 ms.  Throughput: 363.29 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=768, Pass2=8192, clm=4 (16 cores, 1 worker):  2.90 ms.  Throughput: 344.39 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=768, Pass2=8192, clm=2 (16 cores, 1 worker):  2.72 ms.  Throughput: 367.57 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=768, Pass2=8192, clm=1 (16 cores, 1 worker):  2.93 ms.  Throughput: 341.41 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1024, Pass2=6144, clm=4 (16 cores, 1 worker):  2.93 ms.  Throughput: 341.39 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1024, Pass2=6144, clm=2 (16 cores, 1 worker):  2.84 ms.  Throughput: 352.48 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1024, Pass2=6144, clm=1 (16 cores, 1 worker):  2.76 ms.  Throughput: 361.77 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=4 (16 cores, 1 worker):  2.93 ms.  Throughput: 341.32 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=2 (16 cores, 1 worker):  2.72 ms.  Throughput: 368.26 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=1 (16 cores, 1 worker):  2.45 ms.  Throughput: 407.62 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=2048, Pass2=3072, clm=4 (16 cores, 1 worker):  3.52 ms.  Throughput: 284.45 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=2048, Pass2=3072, clm=2 (16 cores, 1 worker):  3.24 ms.  Throughput: 308.98 iter/sec.
FFTlen=6144K, Type=3, Arch=4, Pass1=2048, Pass2=3072, clm=1 (16 cores, 1 worker):  3.09 ms.  Throughput: 323.66 iter/sec.
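
Since background programs can skew the numbers, the repeated runs are easier to compare once averaged per configuration. Here is a small Python sketch of that. The grouping key (Pass1, Pass2, clm) and the name `average_runs` are my own choices, and only three configurations from the two 6144K runs are pasted in as a sample.

```python
import re
from collections import defaultdict
from statistics import mean

ROW = re.compile(r"Pass1=(\d+), Pass2=(\d+), clm=(\d+).*?Throughput:\s*([\d.]+)")

def average_runs(lines):
    """Group lines by (Pass1, Pass2, clm) and return the mean iter/sec for each."""
    samples = defaultdict(list)
    for line in lines:
        m = ROW.search(line)
        if m:
            key = tuple(int(g) for g in m.groups()[:3])
            samples[key].append(float(m.group(4)))
    return {key: mean(vals) for key, vals in samples.items()}

lines = [
    # first run
    "FFTlen=6144K, Type=3, Arch=4, Pass1=512, Pass2=12288, clm=2 (16 cores, 1 worker):  2.60 ms.  Throughput: 384.64 iter/sec.",
    "FFTlen=6144K, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=1 (16 cores, 1 worker):  2.44 ms.  Throughput: 409.28 iter/sec.",
    "FFTlen=6144K, Type=3, Arch=4, Pass1=2048, Pass2=3072, clm=1 (16 cores, 1 worker):  2.67 ms.  Throughput: 375.03 iter/sec.",
    # second run
    "FFTlen=6144K, Type=3, Arch=4, Pass1=512, Pass2=12288, clm=2 (16 cores, 1 worker):  2.76 ms.  Throughput: 362.49 iter/sec.",
    "FFTlen=6144K, Type=3, Arch=4, Pass1=1536, Pass2=4096, clm=1 (16 cores, 1 worker):  2.45 ms.  Throughput: 407.62 iter/sec.",
    "FFTlen=6144K, Type=3, Arch=4, Pass1=2048, Pass2=3072, clm=1 (16 cores, 1 worker):  3.09 ms.  Throughput: 323.66 iter/sec.",
]
averages = average_runs(lines)
best = max(averages, key=averages.get)
for key, ips in sorted(averages.items(), key=lambda kv: -kv[1]):
    print(key, f"{ips:.2f} iter/sec")
```

In both 6144K runs above, Pass1=1536, Pass2=4096, clm=1 is the fastest configuration, so the average does not change the ranking at the top, but it does show how much the slower configurations move between runs.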
Attached Thumbnails
Capture.PNG


Last fiddled with by dcheuk on 2021-01-13 at 08:45 Reason: highlight
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.