2022-04-20, 18:48  #870
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
12141_{8} Posts 
Interesting that 300K, which is so much worse with 8 cores / 1 worker, has much more consistent times for 8 cores / 8 workers.
Timings for 280K FFT length (8 cores, 1 worker): 0.16 ms. Throughput: 6246.39 iter/sec.
Timings for 280K FFT length (8 cores, 2 workers): 0.23, 0.23 ms. Throughput: 8765.58 iter/sec.
Timings for 280K FFT length (8 cores, 4 workers): 0.39, 0.39, 0.39, 0.39 ms. Throughput: 10241.03 iter/sec.
Timings for 280K FFT length (8 cores, 8 workers): 0.67, 0.67, 0.66, 0.66, 0.67, 0.66, 0.66, 0.67 ms. Throughput: 12039.18 iter/sec.
Timings for 300K FFT length (8 cores, 1 worker): 0.36 ms. Throughput: 2795.59 iter/sec.
Timings for 300K FFT length (8 cores, 2 workers): 0.34, 0.34 ms. Throughput: 5864.26 iter/sec.
Timings for 300K FFT length (8 cores, 4 workers): 0.51, 0.52, 0.52, 0.50 ms. Throughput: 7814.25 iter/sec.
Timings for 300K FFT length (8 cores, 8 workers): 0.73, 0.72, 0.72, 0.70, 0.71, 0.71, 0.71, 0.71 ms. Throughput: 11245.56 iter/sec.
Timings for 320K FFT length (8 cores, 1 worker): 0.17 ms. Throughput: 5715.57 iter/sec.
Timings for 320K FFT length (8 cores, 2 workers): 0.26, 0.26 ms. Throughput: 7777.33 iter/sec.
Timings for 320K FFT length (8 cores, 4 workers): 0.45, 0.44, 0.45, 0.45 ms. Throughput: 8965.12 iter/sec.
Timings for 320K FFT length (8 cores, 8 workers): 1.00, 0.83, 0.85, 0.86, 0.84, 0.82, 0.84, 0.83 ms. Throughput: 9329.29 iter/sec.
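To put rough numbers on that comparison, here is a minimal Python sketch that parses the benchmark entries and reports, per FFT size, how much throughput the 8-worker run gains over the 1-worker run and how tightly the per-worker timings cluster. The helper names (`parse_benchmark`, `multi_worker_gain`, `relative_spread`) and the regular expression are my own illustration, not anything shipped with Prime95; the sample rows are copied from the timings above.

```python
import re

# Matches one throughput-benchmark entry in the form it is posted in this thread.
ENTRY_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores?, (\d+) workers?\):\s*"
    r"([\d., ]+?) ms\.\s*Throughput: ([\d.]+) iter/sec\."
)

# 1-worker and 8-worker rows copied from the post above.
SAMPLE = """
Timings for 280K FFT length (8 cores, 1 worker): 0.16 ms. Throughput: 6246.39 iter/sec.
Timings for 280K FFT length (8 cores, 8 workers): 0.67, 0.67, 0.66, 0.66, 0.67, 0.66, 0.66, 0.67 ms. Throughput: 12039.18 iter/sec.
Timings for 300K FFT length (8 cores, 1 worker): 0.36 ms. Throughput: 2795.59 iter/sec.
Timings for 300K FFT length (8 cores, 8 workers): 0.73, 0.72, 0.72, 0.70, 0.71, 0.71, 0.71, 0.71 ms. Throughput: 11245.56 iter/sec.
Timings for 320K FFT length (8 cores, 1 worker): 0.17 ms. Throughput: 5715.57 iter/sec.
Timings for 320K FFT length (8 cores, 8 workers): 1.00, 0.83, 0.85, 0.86, 0.84, 0.82, 0.84, 0.83 ms. Throughput: 9329.29 iter/sec.
"""


def parse_benchmark(text):
    """Return one dict per benchmark entry found in the text."""
    rows = []
    for match in ENTRY_RE.finditer(text):
        fft_k, cores, workers, times, throughput = match.groups()
        rows.append({
            "fft_k": int(fft_k),
            "cores": int(cores),
            "workers": int(workers),
            "times_ms": [float(t) for t in times.split(",")],
            "throughput": float(throughput),
        })
    return rows


def multi_worker_gain(rows, workers=8):
    """Ratio of N-worker throughput to 1-worker throughput, keyed by FFT size in K."""
    single = {r["fft_k"]: r["throughput"] for r in rows if r["workers"] == 1}
    multi = {r["fft_k"]: r["throughput"] for r in rows if r["workers"] == workers}
    return {k: multi[k] / single[k] for k in single if k in multi}


def relative_spread(times_ms):
    """(max - min) / mean of the per-worker iteration times."""
    mean = sum(times_ms) / len(times_ms)
    return (max(times_ms) - min(times_ms)) / mean


rows = parse_benchmark(SAMPLE)
gain = multi_worker_gain(rows)
for row in rows:
    if row["workers"] == 8:
        k = row["fft_k"]
        print(f"{k}K: 8 workers = {gain[k]:.2f}x the 1-worker throughput, "
              f"timing spread {relative_spread(row['times_ms']):.1%}")
```

On these rows the 300K size gains by far the most from going to 8 workers, which is another way of saying its 1-worker result is the outlier.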
2022-05-06, 06:06  #871
Aug 2002
North San Diego County
743 Posts 
5800X3D 1024K to 8192K throughput benchmark.
Non-optimized, non-OC'd CPU, 4000 MHz DDR4 RAM using AXMS stock settings.
Nothing unexpected or outstanding at first glance.
Code:
CPU speed: 3400.12 MHz, 8 hyperthreaded cores CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA L1 cache size: 8x32 KB, L2 cache size: 8x512 KB, L3 cache size: 96 MB L1 cache line size: 64 bytes, L2 cache line size: 64 bytes Machine topology as determined by hwloc library: Machine#0 (total=13447548KB, Backend=Windows, OSName=Windows, WindowsBuildEnvironment=MinGW, OSRelease=10, OSVersion=10.0.18362, Hostname=5800X3D, Architecture=x86_64, hwlocVersion=2.4.1, ProcessName=prime95.exe) Package (total=13447548KB, CPUVendor=AuthenticAMD, CPUFamilyNumber=25, CPUModelNumber=33, CPUModel="AMD Ryzen 7 5800X3D 8Core Processor ", CPUStepping=2) L3 (size=98304KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000003) PU#0 (cpuset: 0x00000001) PU#1 (cpuset: 0x00000002) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000000c) PU#2 (cpuset: 0x00000004) PU#3 (cpuset: 0x00000008) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000030) PU#4 (cpuset: 0x00000010) PU#5 (cpuset: 0x00000020) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000000c0) PU#6 (cpuset: 0x00000040) PU#7 (cpuset: 0x00000080) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000300) PU#8 (cpuset: 0x00000100) PU#9 (cpuset: 0x00000200) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000c00) PU#10 (cpuset: 0x00000400) PU#11 (cpuset: 0x00000800) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00003000) PU#12 (cpuset: 0x00001000) PU#13 (cpuset: 0x00002000) L2 (size=512KB, linesize=64, ways=8, 
Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000c000) PU#14 (cpuset: 0x00004000) PU#15 (cpuset: 0x00008000) Prime95 64bit version 30.7, RdtscTiming=1 Timings for 1024K FFT length (8 cores, 1 worker): 0.55 ms. Throughput: 1821.51 iter/sec. Timings for 1024K FFT length (8 cores, 2 workers): 1.04, 1.04 ms. Throughput: 1931.97 iter/sec. Timings for 1024K FFT length (8 cores, 4 workers): 2.00, 1.99, 2.00, 2.00 ms. Throughput: 2001.40 iter/sec. Timings for 1024K FFT length (8 cores, 8 workers): 3.82, 3.82, 3.82, 3.80, 3.82, 3.83, 3.84, 3.82 ms. Throughput: 2093.63 iter/sec. Timings for 1120K FFT length (8 cores, 1 worker): 0.60 ms. Throughput: 1653.34 iter/sec. Timings for 1120K FFT length (8 cores, 2 workers): 1.15, 1.15 ms. Throughput: 1735.32 iter/sec. Timings for 1120K FFT length (8 cores, 4 workers): 2.22, 2.24, 2.21, 2.20 ms. Throughput: 1804.09 iter/sec. Timings for 1120K FFT length (8 cores, 8 workers): 4.21, 4.23, 4.21, 4.21, 4.21, 4.23, 4.22, 4.21 ms. Throughput: 1897.55 iter/sec. Timings for 1152K FFT length (8 cores, 1 worker): 0.59 ms. Throughput: 1691.24 iter/sec. Timings for 1152K FFT length (8 cores, 2 workers): 1.13, 1.13 ms. Throughput: 1769.93 iter/sec. Timings for 1152K FFT length (8 cores, 4 workers): 2.22, 2.20, 2.20, 2.20 ms. Throughput: 1814.40 iter/sec. Timings for 1152K FFT length (8 cores, 8 workers): 4.29, 4.26, 4.26, 4.23, 4.22, 4.25, 4.24, 4.23 ms. Throughput: 1883.85 iter/sec. [Thu May 5 22:25:18 2022] Timings for 1280K FFT length (8 cores, 1 worker): 0.68 ms. Throughput: 1464.10 iter/sec. Timings for 1280K FFT length (8 cores, 2 workers): 1.30, 1.30 ms. Throughput: 1535.99 iter/sec. Timings for 1280K FFT length (8 cores, 4 workers): 2.52, 2.57, 2.51, 2.52 ms. Throughput: 1581.03 iter/sec. Timings for 1280K FFT length (8 cores, 8 workers): 4.85, 4.90, 4.84, 4.86, 4.85, 4.88, 4.86, 4.86 ms. Throughput: 1645.09 iter/sec. Timings for 1344K FFT length (8 cores, 1 worker): 0.71 ms. Throughput: 1401.51 iter/sec. 
Timings for 1344K FFT length (8 cores, 2 workers): 1.37, 1.37 ms. Throughput: 1456.32 iter/sec. Timings for 1344K FFT length (8 cores, 4 workers): 2.70, 2.68, 2.69, 2.69 ms. Throughput: 1486.34 iter/sec. Timings for 1344K FFT length (8 cores, 8 workers): 5.22, 5.23, 5.22, 5.22, 5.22, 5.24, 5.27, 5.22 ms. Throughput: 1529.46 iter/sec. Timings for 1440K FFT length (8 cores, 1 worker): 0.75 ms. Throughput: 1339.61 iter/sec. Timings for 1440K FFT length (8 cores, 2 workers): 1.46, 1.44 ms. Throughput: 1376.79 iter/sec. Timings for 1440K FFT length (8 cores, 4 workers): 2.83, 2.81, 2.86, 2.84 ms. Throughput: 1411.41 iter/sec. Timings for 1440K FFT length (8 cores, 8 workers): 5.53, 5.56, 5.46, 5.47, 5.49, 5.53, 5.49, 5.48 ms. Throughput: 1454.27 iter/sec. Timings for 1536K FFT length (8 cores, 1 worker): 0.84 ms. Throughput: 1189.28 iter/sec. Timings for 1536K FFT length (8 cores, 2 workers): 1.59, 1.61 ms. Throughput: 1250.85 iter/sec. Timings for 1536K FFT length (8 cores, 4 workers): 3.04, 3.03, 3.04, 3.05 ms. Throughput: 1315.83 iter/sec. Timings for 1536K FFT length (8 cores, 8 workers): 5.86, 5.84, 5.81, 5.84, 5.84, 5.86, 5.83, 5.85 ms. Throughput: 1369.74 iter/sec. Timings for 1600K FFT length (8 cores, 1 worker): 0.82 ms. Throughput: 1214.51 iter/sec. Timings for 1600K FFT length (8 cores, 2 workers): 1.61, 1.59 ms. Throughput: 1250.17 iter/sec. Timings for 1600K FFT length (8 cores, 4 workers): 3.11, 3.14, 3.12, 3.11 ms. Throughput: 1281.28 iter/sec. Timings for 1600K FFT length (8 cores, 8 workers): 6.15, 6.13, 6.11, 6.16, 6.11, 6.15, 6.17, 6.13 ms. Throughput: 1303.17 iter/sec. Timings for 1680K FFT length (8 cores, 1 worker): 0.91 ms. Throughput: 1104.91 iter/sec. Timings for 1680K FFT length (8 cores, 2 workers): 1.77, 1.76 ms. Throughput: 1133.54 iter/sec. Timings for 1680K FFT length (8 cores, 4 workers): 3.45, 3.44, 3.45, 3.45 ms. Throughput: 1160.00 iter/sec. 
Timings for 1680K FFT length (8 cores, 8 workers): 6.85, 6.84, 6.83, 6.81, 6.82, 6.85, 6.84, 6.84 ms. Throughput: 1170.61 iter/sec. Timings for 1792K FFT length (8 cores, 1 worker): 1.01 ms. Throughput: 991.35 iter/sec. Timings for 1792K FFT length (8 cores, 2 workers): 1.90, 1.91 ms. Throughput: 1049.05 iter/sec. Timings for 1792K FFT length (8 cores, 4 workers): 3.74, 3.70, 3.72, 3.71 ms. Throughput: 1076.18 iter/sec. Timings for 1792K FFT length (8 cores, 8 workers): 7.31, 7.34, 7.28, 7.28, 7.29, 7.33, 7.29, 7.29 ms. Throughput: 1095.94 iter/sec. Timings for 1920K FFT length (8 cores, 1 worker): 0.98 ms. Throughput: 1024.47 iter/sec. Timings for 1920K FFT length (8 cores, 2 workers): 1.92, 1.89 ms. Throughput: 1049.13 iter/sec. [Thu May 5 22:30:23 2022] Timings for 1920K FFT length (8 cores, 4 workers): 3.74, 3.74, 3.73, 3.73 ms. Throughput: 1071.95 iter/sec. Timings for 1920K FFT length (8 cores, 8 workers): 8.26, 8.22, 8.20, 8.18, 8.30, 8.15, 8.25, 8.29 ms. Throughput: 972.00 iter/sec. Timings for 2048K FFT length (8 cores, 1 worker): 1.15 ms. Throughput: 866.43 iter/sec. Timings for 2048K FFT length (8 cores, 2 workers): 2.18, 2.23 ms. Throughput: 906.57 iter/sec. Timings for 2048K FFT length (8 cores, 4 workers): 4.27, 4.19, 4.27, 4.18 ms. Throughput: 945.90 iter/sec. Timings for 2048K FFT length (8 cores, 8 workers): 9.26, 9.18, 9.12, 9.11, 9.12, 9.15, 9.11, 9.11 ms. Throughput: 874.90 iter/sec. Timings for 2240K FFT length (8 cores, 1 worker): 1.20 ms. Throughput: 829.97 iter/sec. Timings for 2240K FFT length (8 cores, 2 workers): 2.33, 2.35 ms. Throughput: 854.92 iter/sec. Timings for 2240K FFT length (8 cores, 4 workers): 4.63, 4.60, 4.58, 4.59 ms. Throughput: 869.64 iter/sec. Timings for 2240K FFT length (8 cores, 8 workers): 10.89, 11.01, 10.80, 10.97, 10.78, 10.83, 10.66, 10.81 ms. Throughput: 737.80 iter/sec. Timings for 2304K FFT length (8 cores, 1 worker): 1.20 ms. Throughput: 832.67 iter/sec. 
Timings for 2304K FFT length (8 cores, 2 workers): 2.32, 2.33 ms. Throughput: 861.15 iter/sec. Timings for 2304K FFT length (8 cores, 4 workers): 4.58, 4.57, 4.64, 4.57 ms. Throughput: 871.49 iter/sec. Timings for 2304K FFT length (8 cores, 8 workers): 11.58, 11.70, 11.63, 11.65, 11.44, 11.10, 11.56, 11.17 ms. Throughput: 697.18 iter/sec. Timings for 2400K FFT length (8 cores, 1 worker): 1.27 ms. Throughput: 785.73 iter/sec. Timings for 2400K FFT length (8 cores, 2 workers): 2.49, 2.52 ms. Throughput: 797.82 iter/sec. Timings for 2400K FFT length (8 cores, 4 workers): 4.91, 4.90, 4.90, 4.91 ms. Throughput: 815.68 iter/sec. Timings for 2400K FFT length (8 cores, 8 workers): 12.80, 12.72, 12.76, 12.91, 12.73, 12.74, 12.44, 12.75 ms. Throughput: 628.38 iter/sec. Timings for 2560K FFT length (8 cores, 1 worker): 1.41 ms. Throughput: 707.11 iter/sec. Timings for 2560K FFT length (8 cores, 2 workers): 2.76, 2.80 ms. Throughput: 719.15 iter/sec. Timings for 2560K FFT length (8 cores, 4 workers): 5.47, 5.40, 5.40, 5.40 ms. Throughput: 738.28 iter/sec. Timings for 2560K FFT length (8 cores, 8 workers): 13.82, 13.76, 13.51, 14.81, 13.80, 13.47, 13.88, 13.96 ms. Throughput: 576.91 iter/sec. Timings for 2688K FFT length (8 cores, 1 worker): 1.46 ms. Throughput: 687.16 iter/sec. Timings for 2688K FFT length (8 cores, 2 workers): 2.83, 2.83 ms. Throughput: 706.48 iter/sec. Timings for 2688K FFT length (8 cores, 4 workers): 5.61, 5.61, 5.60, 5.61 ms. Throughput: 713.56 iter/sec. Timings for 2688K FFT length (8 cores, 8 workers): 15.67, 15.25, 15.30, 15.49, 15.74, 14.73, 15.14, 15.62 ms. Throughput: 520.82 iter/sec. Timings for 2800K FFT length (8 cores, 1 worker): 1.58 ms. Throughput: 632.00 iter/sec. Timings for 2800K FFT length (8 cores, 2 workers): 3.08, 3.08 ms. Throughput: 649.40 iter/sec. Timings for 2800K FFT length (8 cores, 4 workers): 6.01, 5.98, 5.99, 5.96 ms. Throughput: 668.72 iter/sec. 
Timings for 2800K FFT length (8 cores, 8 workers): 16.97, 17.62, 16.81, 16.83, 16.77, 17.51, 16.85, 16.85 ms. Throughput: 470.01 iter/sec. [Thu May 5 22:35:31 2022] Timings for 2880K FFT length (8 cores, 1 worker): 1.52 ms. Throughput: 659.39 iter/sec. Timings for 2880K FFT length (8 cores, 2 workers): 2.97, 3.01 ms. Throughput: 668.70 iter/sec. Timings for 2880K FFT length (8 cores, 4 workers): 5.87, 5.85, 5.88, 5.87 ms. Throughput: 681.73 iter/sec. Timings for 2880K FFT length (8 cores, 8 workers): 17.62, 17.43, 17.26, 17.36, 17.56, 16.80, 19.11, 17.55 ms. Throughput: 455.48 iter/sec. Timings for 3072K FFT length (8 cores, 1 worker): 1.60 ms. Throughput: 626.87 iter/sec. Timings for 3072K FFT length (8 cores, 2 workers): 3.13, 3.13 ms. Throughput: 639.33 iter/sec. Timings for 3072K FFT length (8 cores, 4 workers): 6.21, 6.24, 6.20, 6.20 ms. Throughput: 643.85 iter/sec. Timings for 3072K FFT length (8 cores, 8 workers): 19.10, 19.58, 19.56, 19.61, 19.72, 18.80, 19.31, 19.82 ms. Throughput: 411.66 iter/sec. Timings for 3200K FFT length (8 cores, 1 worker): 1.74 ms. Throughput: 575.10 iter/sec. Timings for 3200K FFT length (8 cores, 2 workers): 3.40, 3.38 ms. Throughput: 589.81 iter/sec. Timings for 3200K FFT length (8 cores, 4 workers): 6.75, 6.74, 6.75, 6.69 ms. Throughput: 594.29 iter/sec. Timings for 3200K FFT length (8 cores, 8 workers): 20.36, 19.59, 20.58, 20.57, 20.59, 21.98, 21.27, 21.03 ms. Throughput: 385.95 iter/sec. Timings for 3360K FFT length (8 cores, 1 worker): 1.84 ms. Throughput: 543.23 iter/sec. Timings for 3360K FFT length (8 cores, 2 workers): 3.61, 3.66 ms. Throughput: 550.28 iter/sec. Timings for 3360K FFT length (8 cores, 4 workers): 7.23, 7.20, 7.20, 7.21 ms. Throughput: 554.67 iter/sec. Timings for 3360K FFT length (8 cores, 8 workers): 23.46, 21.81, 22.16, 21.76, 24.26, 23.28, 22.52, 22.61 ms. Throughput: 352.37 iter/sec. Timings for 3584K FFT length (8 cores, 1 worker): 1.96 ms. Throughput: 510.57 iter/sec. 
Timings for 3584K FFT length (8 cores, 2 workers): 3.84, 3.83 ms. Throughput: 521.48 iter/sec. Timings for 3584K FFT length (8 cores, 4 workers): 7.78, 7.69, 7.70, 7.72 ms. Throughput: 518.06 iter/sec. Timings for 3584K FFT length (8 cores, 8 workers): 24.69, 25.31, 25.16, 25.43, 25.25, 24.63, 24.97, 26.01 ms. Throughput: 317.77 iter/sec. Timings for 3840K FFT length (8 cores, 1 worker): 2.09 ms. Throughput: 477.87 iter/sec. Timings for 3840K FFT length (8 cores, 2 workers): 4.18, 4.11 ms. Throughput: 482.30 iter/sec. Timings for 3840K FFT length (8 cores, 4 workers): 8.51, 8.47, 8.50, 8.48 ms. Throughput: 471.03 iter/sec. Timings for 3840K FFT length (8 cores, 8 workers): 28.31, 26.02, 26.73, 27.02, 29.30, 26.07, 27.45, 28.35 ms. Throughput: 292.38 iter/sec. Timings for 4096K FFT length (8 cores, 1 worker): 2.26 ms. Throughput: 442.60 iter/sec. Timings for 4096K FFT length (8 cores, 2 workers): 4.44, 4.44 ms. Throughput: 450.26 iter/sec. Timings for 4096K FFT length (8 cores, 4 workers): 9.77, 9.66, 9.77, 9.65 ms. Throughput: 411.85 iter/sec. Timings for 4096K FFT length (8 cores, 8 workers): 30.87, 31.18, 30.41, 31.13, 31.77, 29.69, 31.02, 30.04 ms. Throughput: 260.16 iter/sec. Timings for 4480K FFT length (8 cores, 1 worker): 2.53 ms. Throughput: 395.06 iter/sec. Timings for 4480K FFT length (8 cores, 2 workers): 4.97, 4.93 ms. Throughput: 403.79 iter/sec. [Thu May 5 22:40:42 2022] Timings for 4480K FFT length (8 cores, 4 workers): 13.07, 13.07, 13.07, 13.07 ms. Throughput: 305.98 iter/sec. Timings for 4480K FFT length (8 cores, 8 workers): 40.23, 40.22, 40.21, 40.21, 40.22, 40.23, 40.22, 40.22 ms. Throughput: 198.91 iter/sec. Timings for 4608K FFT length (8 cores, 1 worker): 2.53 ms. Throughput: 395.21 iter/sec. Timings for 4608K FFT length (8 cores, 2 workers): 4.93, 4.93 ms. Throughput: 405.86 iter/sec. Timings for 4608K FFT length (8 cores, 4 workers): 11.80, 11.61, 11.60, 11.85 ms. Throughput: 341.44 iter/sec. 
Timings for 4608K FFT length (8 cores, 8 workers): 34.89, 35.49, 38.28, 36.01, 36.50, 34.16, 36.19, 36.05 ms. Throughput: 222.77 iter/sec. Timings for 4800K FFT length (8 cores, 1 worker): 2.72 ms. Throughput: 368.13 iter/sec. Timings for 4800K FFT length (8 cores, 2 workers): 5.35, 5.29 ms. Throughput: 376.09 iter/sec. Timings for 4800K FFT length (8 cores, 4 workers): 12.64, 12.68, 12.87, 12.71 ms. Throughput: 314.33 iter/sec. Timings for 4800K FFT length (8 cores, 8 workers): 38.85, 36.54, 38.12, 37.28, 37.95, 36.28, 38.16, 38.37 ms. Throughput: 212.35 iter/sec. Timings for 5120K FFT length (8 cores, 1 worker): 2.86 ms. Throughput: 349.51 iter/sec. Timings for 5120K FFT length (8 cores, 2 workers): 5.63, 5.59 ms. Throughput: 356.46 iter/sec. Timings for 5120K FFT length (8 cores, 4 workers): 14.58, 14.30, 14.26, 14.32 ms. Throughput: 278.44 iter/sec. Timings for 5120K FFT length (8 cores, 8 workers): 41.60, 40.58, 41.96, 41.19, 41.50, 39.44, 43.41, 41.01 ms. Throughput: 193.66 iter/sec. Timings for 5376K FFT length (8 cores, 1 worker): 3.04 ms. Throughput: 328.71 iter/sec. Timings for 5376K FFT length (8 cores, 2 workers): 6.00, 6.02 ms. Throughput: 332.87 iter/sec. Timings for 5376K FFT length (8 cores, 4 workers): 15.87, 15.97, 16.18, 16.25 ms. Throughput: 248.94 iter/sec. Timings for 5376K FFT length (8 cores, 8 workers): 44.97, 44.45, 45.45, 45.07, 44.54, 43.04, 45.48, 46.42 ms. Throughput: 178.14 iter/sec. Timings for 5600K FFT length (8 cores, 1 worker): 3.12 ms. Throughput: 320.55 iter/sec. Timings for 5600K FFT length (8 cores, 2 workers): 6.15, 6.15 ms. Throughput: 325.13 iter/sec. Timings for 5600K FFT length (8 cores, 4 workers): 17.49, 17.93, 17.49, 17.80 ms. Throughput: 226.28 iter/sec. Timings for 5600K FFT length (8 cores, 8 workers): 47.96, 48.18, 48.50, 47.76, 48.44, 47.26, 47.98, 48.89 ms. Throughput: 166.27 iter/sec. Timings for 5760K FFT length (8 cores, 1 worker): 3.13 ms. Throughput: 319.52 iter/sec. 
Timings for 5760K FFT length (8 cores, 2 workers): 6.24, 6.24 ms. Throughput: 320.58 iter/sec. Timings for 5760K FFT length (8 cores, 4 workers): 21.12, 21.17, 21.27, 21.12 ms. Throughput: 188.92 iter/sec. Timings for 5760K FFT length (8 cores, 8 workers): 56.75, 55.82, 55.68, 56.87, 56.59, 55.87, 56.83, 55.72 ms. Throughput: 142.19 iter/sec. Timings for 6144K FFT length (8 cores, 1 worker): 3.40 ms. Throughput: 294.26 iter/sec. Timings for 6144K FFT length (8 cores, 2 workers): 6.76, 6.85 ms. Throughput: 293.91 iter/sec. Timings for 6144K FFT length (8 cores, 4 workers): 20.67, 19.86, 19.70, 19.92 ms. Throughput: 199.72 iter/sec. [Thu May 5 22:45:47 2022] Timings for 6144K FFT length (8 cores, 8 workers): 53.06, 52.40, 52.69, 52.98, 52.99, 50.68, 52.47, 53.26 ms. Throughput: 152.22 iter/sec. Timings for 6400K FFT length (8 cores, 1 worker): 3.56 ms. Throughput: 281.20 iter/sec. Timings for 6400K FFT length (8 cores, 2 workers): 7.04, 7.00 ms. Throughput: 284.79 iter/sec. Timings for 6400K FFT length (8 cores, 4 workers): 21.02, 21.89, 20.41, 21.51 ms. Throughput: 188.73 iter/sec. Timings for 6400K FFT length (8 cores, 8 workers): 55.57, 54.56, 57.04, 55.12, 55.78, 55.46, 55.44, 56.50 ms. Throughput: 143.69 iter/sec. Timings for 6720K FFT length (8 cores, 1 worker): 3.71 ms. Throughput: 269.24 iter/sec. Timings for 6720K FFT length (8 cores, 2 workers): 7.54, 7.54 ms. Throughput: 265.09 iter/sec. Timings for 6720K FFT length (8 cores, 4 workers): 24.47, 24.38, 24.25, 24.64 ms. Throughput: 163.71 iter/sec. Timings for 6720K FFT length (8 cores, 8 workers): 62.69, 61.28, 61.24, 62.39, 61.28, 60.91, 62.22, 62.84 ms. Throughput: 129.35 iter/sec. Timings for 7168K FFT length (8 cores, 1 worker): 4.17 ms. Throughput: 239.98 iter/sec. Timings for 7168K FFT length (8 cores, 2 workers): 8.40, 8.40 ms. Throughput: 238.18 iter/sec. Timings for 7168K FFT length (8 cores, 4 workers): 26.18, 26.75, 25.78, 25.71 ms. Throughput: 153.26 iter/sec. 
Timings for 7168K FFT length (8 cores, 8 workers): 63.65, 66.52, 64.48, 62.92, 66.19, 62.65, 65.37, 64.88 ms. Throughput: 123.92 iter/sec. Timings for 7680K FFT length (8 cores, 1 worker): 4.34 ms. Throughput: 230.49 iter/sec. Timings for 7680K FFT length (8 cores, 2 workers): 10.17, 10.23 ms. Throughput: 196.08 iter/sec. Timings for 7680K FFT length (8 cores, 4 workers): 34.98, 35.15, 35.16, 35.32 ms. Throughput: 113.78 iter/sec. Timings for 7680K FFT length (8 cores, 8 workers): 81.23, 80.78, 80.40, 80.60, 79.95, 81.31, 79.75, 81.04 ms. Throughput: 99.22 iter/sec. Timings for 8000K FFT length (8 cores, 1 worker): 4.57 ms. Throughput: 219.06 iter/sec. Timings for 8000K FFT length (8 cores, 2 workers): 9.63, 9.49 ms. Throughput: 209.18 iter/sec. Timings for 8000K FFT length (8 cores, 4 workers): 30.01, 30.62, 31.70, 30.39 ms. Throughput: 130.44 iter/sec. Timings for 8000K FFT length (8 cores, 8 workers): 74.16, 76.71, 73.04, 73.07, 74.12, 74.25, 74.86, 74.15 ms. Throughput: 107.70 iter/sec. Timings for 8064K FFT length (8 cores, 1 worker): 4.64 ms. Throughput: 215.61 iter/sec. Timings for 8064K FFT length (8 cores, 2 workers): 9.82, 9.79 ms. Throughput: 203.99 iter/sec. Timings for 8064K FFT length (8 cores, 4 workers): 30.86, 30.88, 30.86, 32.85 ms. Throughput: 127.63 iter/sec. Timings for 8064K FFT length (8 cores, 8 workers): 74.85, 73.24, 73.21, 73.78, 77.11, 71.62, 74.85, 76.06 ms. Throughput: 107.66 iter/sec. Timings for 8192K FFT length (8 cores, 1 worker): 4.79 ms. Throughput: 208.95 iter/sec. Timings for 8192K FFT length (8 cores, 2 workers): 10.10, 10.09 ms. Throughput: 198.12 iter/sec. Timings for 8192K FFT length (8 cores, 4 workers): 31.64, 32.78, 32.29, 32.07 ms. Throughput: 124.26 iter/sec. Timings for 8192K FFT length (8 cores, 8 workers): 78.81, 75.92, 78.91, 76.51, 77.04, 74.21, 77.16, 76.64 ms. Throughput: 104.07 iter/sec. 
2022-05-06, 07:38  #872
"/X\(‘‘)/X\"
Jan 2013
13×227 Posts 
Quote:
Code:
Timings for 4096K FFT length (8 cores, 1 worker): 2.26 ms. Throughput: 442.60 iter/sec.
Timings for 4480K FFT length (8 cores, 1 worker): 2.53 ms. Throughput: 395.06 iter/sec.
Timings for 4608K FFT length (8 cores, 1 worker): 2.53 ms. Throughput: 395.21 iter/sec.
Timings for 4800K FFT length (8 cores, 1 worker): 2.72 ms. Throughput: 368.13 iter/sec.
Timings for 5120K FFT length (8 cores, 1 worker): 2.86 ms. Throughput: 349.51 iter/sec.
Timings for 5376K FFT length (8 cores, 1 worker): 3.04 ms. Throughput: 328.71 iter/sec.
Timings for 5600K FFT length (8 cores, 1 worker): 3.12 ms. Throughput: 320.55 iter/sec.
Timings for 5760K FFT length (8 cores, 1 worker): 3.13 ms. Throughput: 319.52 iter/sec.
Timings for 6144K FFT length (8 cores, 1 worker): 3.40 ms. Throughput: 294.26 iter/sec.
Timings for 6400K FFT length (8 cores, 1 worker): 3.56 ms. Throughput: 281.20 iter/sec.
Timings for 6720K FFT length (8 cores, 1 worker): 3.71 ms. Throughput: 269.24 iter/sec.
Timings for 7168K FFT length (8 cores, 1 worker): 4.17 ms. Throughput: 239.98 iter/sec.
Timings for 7680K FFT length (8 cores, 1 worker): 4.34 ms. Throughput: 230.49 iter/sec.
Timings for 8000K FFT length (8 cores, 1 worker): 4.57 ms. Throughput: 219.06 iter/sec.
Timings for 8064K FFT length (8 cores, 1 worker): 4.64 ms. Throughput: 215.61 iter/sec.
Timings for 8192K FFT length (8 cores, 1 worker): 4.79 ms. Throughput: 208.95 iter/sec.
Code:
Best time for 4096K FFT length: 1.936 ms., avg: 1.962 ms.
Best time for 4480K FFT length: 2.416 ms., avg: 2.520 ms.
Best time for 4608K FFT length: 2.298 ms., avg: 2.382 ms.
Best time for 4800K FFT length: 2.299 ms., avg: 2.351 ms.
Best time for 5120K FFT length: 2.536 ms., avg: 2.655 ms.
Best time for 5376K FFT length: 3.022 ms., avg: 3.265 ms.
Best time for 5600K FFT length: 3.337 ms., avg: 3.511 ms.
Best time for 5760K FFT length: 4.065 ms., avg: 4.131 ms.
Best time for 6144K FFT length: 3.502 ms., avg: 3.672 ms.
Best time for 6400K FFT length: 3.539 ms., avg: 3.646 ms.
Best time for 6720K FFT length: 4.854 ms., avg: 4.948 ms.
Best time for 7168K FFT length: 5.050 ms., avg: 5.149 ms.
Best time for 7680K FFT length: 6.219 ms., avg: 6.283 ms.
Best time for 8000K FFT length: 5.612 ms., avg: 5.763 ms.
Best time for 8064K FFT length: 6.222 ms., avg: 6.392 ms.
Best time for 8192K FFT length: 6.397 ms., avg: 6.476 ms.
Last fiddled with by Mark Rose on 2022-05-06 at 07:38

2022-05-06, 14:37  #873
"Composite as Heck"
Oct 2017
17·53 Posts 
Quote:


2022-05-07, 19:17  #874
"/X\(‘‘)/X\"
Jan 2013
101110000111_{2} Posts 
Quote:


2022-05-08, 05:52  #875
"Composite as Heck"
Oct 2017
385_{16} Posts 
The non-SMT results are in the same post you linked, above the SMT results.

2022-05-11, 19:33  #876
"Viliam Furík"
Jul 2018
Martin, Slovakia
1402_{8} Posts 
I was thinking about the effect of memory bandwidth as a bottleneck for the performance of a GPU or a CPU. I tried to calculate the amount of bandwidth needed for 1 TFLOPS of FP64 to be fully used, but my result was about 130 GB/s. That seems too little in the context of the Radeon VII, which houses roughly 3 TFLOPS of FP64 throughput, yet the actual performance in PRP tests differs.
I used the conversion 1 TFLOPS = 500 GHzD/D. 500 GHzD is one test with an exponent around 113,500,000, which needs a 6144K FFT size, and that requires about 48 MiB of FFT data to be transferred per iteration; thus 113,500,000 * 48 MiB in one day is about 130 GB/s. Could someone explain to me how the memory bandwidth affects the performance, and what could be used as a rule-of-thumb conversion for the bandwidth required for 1 TFLOPS of FP64 to be fully used?
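As a rough check of the arithmetic above, here is a small Python sketch of the same estimate. The function name and the `passes_per_iteration` parameter are my own illustration. One thing it makes visible: 113,500,000 iterations times 48 MiB in a day comes to about 66 GB/s if the FFT data crosses the memory bus once per iteration, and the quoted figure of about 130 GB/s appears when the data is counted twice (one read plus one write). That factor of two is my assumption about how the number was reached, and a real implementation may make more passes over the data or keep part of it in cache, so treat the result as a back-of-the-envelope figure only.

```python
MIB = 1024 ** 2
SECONDS_PER_DAY = 86_400


def required_bandwidth_gb_s(iterations, fft_bytes, passes_per_iteration):
    """Average bandwidth (GB/s, 1 GB = 1e9 bytes) to finish the test in one day."""
    total_bytes = iterations * fft_bytes * passes_per_iteration
    return total_bytes / SECONDS_PER_DAY / 1e9


# Figures from the post: a 6144K FFT of 8-byte doubles and ~113.5M iterations.
fft_bytes = 6144 * 1024 * 8          # 48 MiB of FFT data
iterations = 113_500_000             # roughly one iteration per bit of the exponent

# Assumption: the data is moved once per iteration.
one_pass = required_bandwidth_gb_s(iterations, fft_bytes, 1)
# Assumption: the data is read once and written once per iteration.
read_and_write = required_bandwidth_gb_s(iterations, fft_bytes, 2)

print(f"FFT data size: {fft_bytes / MIB:.0f} MiB")
print(f"One pass per iteration:   {one_pass:.1f} GB/s")
print(f"Read + write per iteration: {read_and_write:.1f} GB/s")
```

Changing `passes_per_iteration` shows how quickly the requirement grows if each iteration has to stream the data through memory more than once.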
2022-08-01, 14:16  #877
Oct 2008
2·3·11 Posts 
Does anyone have a Ryzen 5700G and is willing to post benchmark results? I'd be curious to see what impact the 16 MB of L3 has on wavefront PRP throughput.
Thanks in advance 👍 