2022-01-14, 09:30  #859
"Jorge Coveiro"
Nov 2006
Moura, Portugal
2^{4}·3 Posts 
3990X (PBO disabled)
64 cores, 8 workers  win11 pro
Code:
AMD Ryzen Threadripper 3990X 64Core Processor CPU speed: 4217.15 MHz, 64 hyperthreaded cores CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA L1 cache size: 64x32 KB, L2 cache size: 64x512 KB, L3 cache size: 16x16 MB L1 cache line size: 64 bytes, L2 cache line size: 64 bytes Machine topology as determined by hwloc library: Machine#0 (total=107111704KB, Backend=Windows, OSName=Windows, WindowsBuildEnvironment=MinGW, OSRelease=10, OSVersion=10.0.22000, Hostname=QUBITS, Architecture=x86_64, hwlocVersion=2.4.1, ProcessName=prime95.exe) Package (total=107111704KB, CPUVendor=AuthenticAMD, CPUFamilyNumber=23, CPUModelNumber=49, CPUModel="AMD Ryzen Threadripper 3990X 64Core Processor ", CPUStepping=0) Group0#0 (total=107111704KB) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000003) PU#0 (cpuset: 0x00000001) PU#1 (cpuset: 0x00000002) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000000c) PU#2 (cpuset: 0x00000004) PU#3 (cpuset: 0x00000008) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000030) PU#4 (cpuset: 0x00000010) PU#5 (cpuset: 0x00000020) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000000c0) PU#6 (cpuset: 0x00000040) PU#7 (cpuset: 0x00000080) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000300) PU#8 (cpuset: 0x00000100) PU#9 (cpuset: 0x00000200) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000c00) PU#10 (cpuset: 0x00000400) PU#11 (cpuset: 0x00000800) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, 
linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00003000) PU#12 (cpuset: 0x00001000) PU#13 (cpuset: 0x00002000) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000c000) PU#14 (cpuset: 0x00004000) PU#15 (cpuset: 0x00008000) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00030000) PU#16 (cpuset: 0x00010000) PU#17 (cpuset: 0x00020000) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000c0000) PU#18 (cpuset: 0x00040000) PU#19 (cpuset: 0x00080000) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00300000) PU#20 (cpuset: 0x00100000) PU#21 (cpuset: 0x00200000) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00c00000) PU#22 (cpuset: 0x00400000) PU#23 (cpuset: 0x00800000) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x03000000) PU#24 (cpuset: 0x01000000) PU#25 (cpuset: 0x02000000) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0c000000) PU#26 (cpuset: 0x04000000) PU#27 (cpuset: 0x08000000) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x30000000) PU#28 (cpuset: 0x10000000) PU#29 (cpuset: 0x20000000) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0xc0000000) PU#30 (cpuset: 0x40000000) PU#31 (cpuset: 0x80000000) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core 
(cpuset: 0x00000003,0x0) PU#32 (cpuset: 0x00000001,0x0) PU#33 (cpuset: 0x00000002,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000000c,0x0) PU#34 (cpuset: 0x00000004,0x0) PU#35 (cpuset: 0x00000008,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000030,0x0) PU#36 (cpuset: 0x00000010,0x0) PU#37 (cpuset: 0x00000020,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000000c0,0x0) PU#38 (cpuset: 0x00000040,0x0) PU#39 (cpuset: 0x00000080,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000300,0x0) PU#40 (cpuset: 0x00000100,0x0) PU#41 (cpuset: 0x00000200,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000c00,0x0) PU#42 (cpuset: 0x00000400,0x0) PU#43 (cpuset: 0x00000800,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00003000,0x0) PU#44 (cpuset: 0x00001000,0x0) PU#45 (cpuset: 0x00002000,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000c000,0x0) PU#46 (cpuset: 0x00004000,0x0) PU#47 (cpuset: 0x00008000,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00030000,0x0) PU#48 (cpuset: 0x00010000,0x0) PU#49 (cpuset: 0x00020000,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000c0000,0x0) PU#50 (cpuset: 0x00040000,0x0) PU#51 (cpuset: 0x00080000,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, 
ways=8, Inclusive=0) Core (cpuset: 0x00300000,0x0) PU#52 (cpuset: 0x00100000,0x0) PU#53 (cpuset: 0x00200000,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00c00000,0x0) PU#54 (cpuset: 0x00400000,0x0) PU#55 (cpuset: 0x00800000,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x03000000,0x0) PU#56 (cpuset: 0x01000000,0x0) PU#57 (cpuset: 0x02000000,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0c000000,0x0) PU#58 (cpuset: 0x04000000,0x0) PU#59 (cpuset: 0x08000000,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x30000000,0x0) PU#60 (cpuset: 0x10000000,0x0) PU#61 (cpuset: 0x20000000,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0xc0000000,0x0) PU#62 (cpuset: 0x40000000,0x0) PU#63 (cpuset: 0x80000000,0x0) Group0#1 L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000003,,0x0) PU#64 (cpuset: 0x00000001,,0x0) PU#65 (cpuset: 0x00000002,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000000c,,0x0) PU#66 (cpuset: 0x00000004,,0x0) PU#67 (cpuset: 0x00000008,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000030,,0x0) PU#68 (cpuset: 0x00000010,,0x0) PU#69 (cpuset: 0x00000020,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000000c0,,0x0) PU#70 (cpuset: 0x00000040,,0x0) PU#71 (cpuset: 0x00000080,,0x0) L3 (size=16384KB, linesize=64, 
ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000300,,0x0) PU#72 (cpuset: 0x00000100,,0x0) PU#73 (cpuset: 0x00000200,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000c00,,0x0) PU#74 (cpuset: 0x00000400,,0x0) PU#75 (cpuset: 0x00000800,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00003000,,0x0) PU#76 (cpuset: 0x00001000,,0x0) PU#77 (cpuset: 0x00002000,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000c000,,0x0) PU#78 (cpuset: 0x00004000,,0x0) PU#79 (cpuset: 0x00008000,,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00030000,,0x0) PU#80 (cpuset: 0x00010000,,0x0) PU#81 (cpuset: 0x00020000,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000c0000,,0x0) PU#82 (cpuset: 0x00040000,,0x0) PU#83 (cpuset: 0x00080000,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00300000,,0x0) PU#84 (cpuset: 0x00100000,,0x0) PU#85 (cpuset: 0x00200000,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00c00000,,0x0) PU#86 (cpuset: 0x00400000,,0x0) PU#87 (cpuset: 0x00800000,,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x03000000,,0x0) PU#88 (cpuset: 0x01000000,,0x0) PU#89 (cpuset: 0x02000000,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 
0x0c000000,,0x0) PU#90 (cpuset: 0x04000000,,0x0) PU#91 (cpuset: 0x08000000,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x30000000,,0x0) PU#92 (cpuset: 0x10000000,,0x0) PU#93 (cpuset: 0x20000000,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0xc0000000,,0x0) PU#94 (cpuset: 0x40000000,,0x0) PU#95 (cpuset: 0x80000000,,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000003,,,0x0) PU#96 (cpuset: 0x00000001,,,0x0) PU#97 (cpuset: 0x00000002,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000000c,,,0x0) PU#98 (cpuset: 0x00000004,,,0x0) PU#99 (cpuset: 0x00000008,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000030,,,0x0) PU#100 (cpuset: 0x00000010,,,0x0) PU#101 (cpuset: 0x00000020,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000000c0,,,0x0) PU#102 (cpuset: 0x00000040,,,0x0) PU#103 (cpuset: 0x00000080,,,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000300,,,0x0) PU#104 (cpuset: 0x00000100,,,0x0) PU#105 (cpuset: 0x00000200,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00000c00,,,0x0) PU#106 (cpuset: 0x00000400,,,0x0) PU#107 (cpuset: 0x00000800,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00003000,,,0x0) PU#108 (cpuset: 0x00001000,,,0x0) PU#109 (cpuset: 0x00002000,,,0x0) L2 (size=512KB, 
linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0000c000,,,0x0) PU#110 (cpuset: 0x00004000,,,0x0) PU#111 (cpuset: 0x00008000,,,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00030000,,,0x0) PU#112 (cpuset: 0x00010000,,,0x0) PU#113 (cpuset: 0x00020000,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x000c0000,,,0x0) PU#114 (cpuset: 0x00040000,,,0x0) PU#115 (cpuset: 0x00080000,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00300000,,,0x0) PU#116 (cpuset: 0x00100000,,,0x0) PU#117 (cpuset: 0x00200000,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x00c00000,,,0x0) PU#118 (cpuset: 0x00400000,,,0x0) PU#119 (cpuset: 0x00800000,,,0x0) L3 (size=16384KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x03000000,,,0x0) PU#120 (cpuset: 0x01000000,,,0x0) PU#121 (cpuset: 0x02000000,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x0c000000,,,0x0) PU#122 (cpuset: 0x04000000,,,0x0) PU#123 (cpuset: 0x08000000,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0x30000000,,,0x0) PU#124 (cpuset: 0x10000000,,,0x0) PU#125 (cpuset: 0x20000000,,,0x0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core (cpuset: 0xc0000000,,,0x0) PU#126 (cpuset: 0x40000000,,,0x0) PU#127 (cpuset: 0x80000000,,,0x0) Prime95 64bit version 30.7, RdtscTiming=1 Timings for 2048K FFT length (64 cores, 8 workers): 2.52, 2.58, 
2.51, 2.48, 2.72, 2.79, 2.37, 2.45 ms. Throughput: 3142.62 iter/sec. Timings for 2240K FFT length (64 cores, 8 workers): 2.73, 2.77, 3.26, 3.31, 2.90, 2.96, 2.93, 2.87 ms. Throughput: 2709.76 iter/sec. Timings for 2304K FFT length (64 cores, 8 workers): 2.91, 2.95, 3.37, 3.44, 3.26, 3.09, 3.08, 3.25 ms. Throughput: 2532.59 iter/sec. Timings for 2400K FFT length (64 cores, 8 workers): 2.99, 3.24, 3.64, 3.70, 3.38, 3.34, 3.36, 3.44 ms. Throughput: 2372.03 iter/sec. Timings for 2560K FFT length (64 cores, 8 workers): 3.36, 3.44, 3.31, 3.34, 3.73, 3.83, 3.10, 3.36 ms. Throughput: 2338.84 iter/sec. Timings for 2688K FFT length (64 cores, 8 workers): 4.75, 5.07, 3.20, 3.24, 5.44, 5.34, 3.49, 3.57 ms. Throughput: 1966.47 iter/sec. Timings for 2800K FFT length (64 cores, 8 workers): 3.99, 4.00, 5.98, 6.09, 3.95, 4.05, 5.78, 5.41 ms. Throughput: 1689.72 iter/sec. Timings for 2880K FFT length (64 cores, 8 workers): 4.48, 4.20, 4.42, 4.51, 4.77, 4.91, 4.05, 4.19 ms. Throughput: 1807.78 iter/sec. Timings for 3072K FFT length (64 cores, 8 workers): 4.64, 4.70, 4.78, 4.97, 5.26, 5.11, 4.57, 4.66 ms. Throughput: 1658.00 iter/sec. Timings for 3200K FFT length (64 cores, 8 workers): 4.96, 5.28, 4.98, 5.11, 5.62, 5.70, 4.80, 4.82 ms. Throughput: 1557.18 iter/sec. Timings for 3360K FFT length (64 cores, 8 workers): 4.94, 5.10, 7.09, 7.10, 5.05, 5.00, 6.30, 6.27 ms. Throughput: 1396.58 iter/sec. Timings for 3584K FFT length (64 cores, 8 workers): 6.02, 5.91, 7.04, 7.35, 6.32, 6.42, 6.63, 6.66 ms. Throughput: 1228.36 iter/sec. Timings for 3840K FFT length (64 cores, 8 workers): 7.29, 7.32, 7.91, 8.03, 7.86, 7.90, 7.33, 7.64 ms. Throughput: 1045.79 iter/sec. Timings for 4096K FFT length (64 cores, 8 workers): 8.39, 8.50, 10.12, 10.08, 8.85, 8.74, 9.38, 9.68 ms. Throughput: 872.34 iter/sec. Timings for 4480K FFT length (64 cores, 8 workers): 11.37, 11.91, 11.76, 12.06, 12.54, 12.54, 11.14, 11.43 ms. Throughput: 676.55 iter/sec. 
Timings for 4608K FFT length (64 cores, 8 workers): 12.66, 13.07, 11.65, 12.46, 14.16, 13.75, 11.47, 11.92 ms. Throughput: 635.95 iter/sec. Timings for 4800K FFT length (64 cores, 8 workers): 13.12, 13.29, 12.67, 13.13, 14.36, 14.31, 12.60, 12.48 ms. Throughput: 605.56 iter/sec. Timings for 5120K FFT length (64 cores, 8 workers): 14.35, 14.28, 14.27, 14.61, 15.44, 15.22, 13.61, 13.94 ms. Throughput: 553.91 iter/sec. Timings for 5376K FFT length (64 cores, 8 workers): 15.54, 15.82, 17.01, 17.81, 16.38, 16.37, 15.95, 16.30 ms. Throughput: 488.70 iter/sec. [Fri Jan 14 09:18:07 2022] Timings for 5600K FFT length (64 cores, 8 workers): 16.73, 16.57, 17.51, 17.84, 17.71, 17.59, 16.58, 16.89 ms. Throughput: 466.13 iter/sec. Timings for 5760K FFT length (64 cores, 8 workers): 17.71, 17.96, 18.26, 18.51, 18.75, 18.79, 17.70, 17.75 ms. Throughput: 440.36 iter/sec. Timings for 6144K FFT length (64 cores, 8 workers): 18.78, 18.97, 19.46, 20.00, 20.33, 20.06, 18.81, 18.94 ms. Throughput: 412.37 iter/sec. Timings for 6400K FFT length (64 cores, 8 workers): 19.66, 19.81, 20.82, 21.23, 21.32, 21.07, 19.65, 19.83 ms. Throughput: 392.15 iter/sec. Timings for 6720K FFT length (64 cores, 8 workers): 21.10, 21.46, 22.37, 22.75, 22.55, 22.77, 21.24, 21.46 ms. Throughput: 364.61 iter/sec. Timings for 7168K FFT length (64 cores, 8 workers): 22.56, 22.87, 23.99, 24.52, 24.05, 24.01, 23.00, 23.16 ms. Throughput: 340.42 iter/sec. Timings for 7680K FFT length (64 cores, 8 workers): 26.23, 26.30, 27.43, 28.05, 27.48, 27.88, 25.93, 26.09 ms. Throughput: 297.42 iter/sec. Timings for 8000K FFT length (64 cores, 8 workers): 26.51, 26.77, 27.62, 28.23, 28.25, 27.93, 26.14, 26.49 ms. Throughput: 293.91 iter/sec. Timings for 8064K FFT length (64 cores, 8 workers): 26.88, 27.27, 28.44, 29.03, 28.55, 28.59, 26.97, 27.39 ms. Throughput: 287.07 iter/sec. Timings for 8192K FFT length (64 cores, 8 workers): 27.47, 27.62, 28.56, 28.99, 28.95, 29.09, 28.09, 27.77 ms. Throughput: 282.66 iter/sec. Code:
Prime95 64-bit version 30.7, RdtscTiming=1
Timing FFTs using 64 cores.
Best time for 2048K FFT length: 1.615 ms., avg: 1.727 ms.
Best time for 2240K FFT length: 1.714 ms., avg: 1.820 ms.
Best time for 2304K FFT length: 1.634 ms., avg: 1.724 ms.
Best time for 2400K FFT length: 1.731 ms., avg: 1.900 ms.
Best time for 2560K FFT length: 1.493 ms., avg: 1.702 ms.
Best time for 2688K FFT length: 2.217 ms., avg: 2.331 ms.
Best time for 2800K FFT length: 2.345 ms., avg: 2.511 ms.
Best time for 2880K FFT length: 1.819 ms., avg: 1.960 ms.
Best time for 3072K FFT length: 1.585 ms., avg: 1.683 ms.
Best time for 3200K FFT length: 1.729 ms., avg: 1.849 ms.
Best time for 3360K FFT length: 2.043 ms., avg: 2.127 ms.
Best time for 3584K FFT length: 1.644 ms., avg: 1.760 ms.
Best time for 3840K FFT length: 2.020 ms., avg: 2.134 ms.
Best time for 4096K FFT length: 1.901 ms., avg: 2.008 ms.
Best time for 4480K FFT length: 2.316 ms., avg: 2.458 ms.
Best time for 4608K FFT length: 2.689 ms., avg: 2.812 ms.
Best time for 4800K FFT length: 2.204 ms., avg: 2.311 ms.
Best time for 5120K FFT length: 2.329 ms., avg: 2.430 ms.
Best time for 5376K FFT length: 2.453 ms., avg: 2.634 ms.
Best time for 5600K FFT length: 2.324 ms., avg: 2.437 ms.
Best time for 5760K FFT length: 2.340 ms., avg: 2.460 ms.
Best time for 6144K FFT length: 2.479 ms., avg: 2.582 ms.
Best time for 6400K FFT length: 3.294 ms., avg: 3.472 ms.
Best time for 6720K FFT length: 3.058 ms., avg: 3.223 ms.
Best time for 7168K FFT length: 2.676 ms., avg: 2.974 ms.
Best time for 7680K FFT length: 2.934 ms., avg: 3.083 ms.
Best time for 8000K FFT length: 3.238 ms., avg: 3.360 ms.
Best time for 8064K FFT length: 2.915 ms., avg: 3.100 ms.
Best time for 8192K FFT length: 2.948 ms., avg: 3.136 ms.
Code:
Prime95 64-bit version 30.7, RdtscTiming=1
Timing FFTs using 8 cores.
Best time for 2048K FFT length: 1.477 ms., avg: 1.718 ms.
Best time for 2240K FFT length: 1.242 ms., avg: 1.303 ms.
Best time for 2304K FFT length: 1.237 ms., avg: 1.351 ms.
Best time for 2400K FFT length: 1.333 ms., avg: 1.816 ms.
Best time for 2560K FFT length: 1.705 ms., avg: 1.916 ms.
Best time for 2688K FFT length: 1.906 ms., avg: 2.126 ms.
Best time for 2800K FFT length: 1.982 ms., avg: 2.574 ms.
Best time for 2880K FFT length: 1.649 ms., avg: 2.028 ms.
Best time for 3072K FFT length: 2.293 ms., avg: 2.657 ms.
Best time for 3200K FFT length: 2.136 ms., avg: 2.413 ms.
Best time for 3360K FFT length: 1.958 ms., avg: 2.143 ms.
Best time for 3584K FFT length: 2.103 ms., avg: 2.266 ms.
Best time for 3840K FFT length: 2.554 ms., avg: 3.050 ms.
Best time for 4096K FFT length: 2.629 ms., avg: 3.013 ms.
Best time for 4480K FFT length: 3.257 ms., avg: 3.545 ms.
Best time for 4608K FFT length: 3.282 ms., avg: 4.088 ms.
Best time for 4800K FFT length: 3.485 ms., avg: 3.890 ms.
Best time for 5120K FFT length: 3.685 ms., avg: 4.153 ms.
Best time for 5376K FFT length: 4.291 ms., avg: 4.914 ms.
Best time for 5600K FFT length: 3.993 ms., avg: 4.686 ms.
Best time for 5760K FFT length: 4.137 ms., avg: 5.228 ms.
Best time for 6144K FFT length: 4.289 ms., avg: 4.767 ms.
Best time for 6400K FFT length: 4.632 ms., avg: 5.075 ms.
Best time for 6720K FFT length: 4.921 ms., avg: 5.532 ms.
Best time for 7168K FFT length: 5.229 ms., avg: 6.021 ms.
Best time for 7680K FFT length: 5.528 ms., avg: 6.291 ms.
Best time for 8000K FFT length: 6.174 ms., avg: 6.698 ms.
Best time for 8064K FFT length: 5.890 ms., avg: 6.439 ms.
Best time for 8192K FFT length: 6.073 ms., avg: 6.835 ms.
Last fiddled with by JCoveiro on 2022-01-14 at 09:38
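The "Throughput" figure in the multi-worker benchmark above appears to be the sum of the per-worker iteration rates. The sketch below checks that reading against the 2048K line; the function name is made up for illustration, and the formula is an assumption inferred from the posted numbers, not taken from the Prime95 source.

```python
def throughput_iter_per_sec(timings_ms):
    """Sum of per-worker rates: a worker taking t ms per iteration does 1000/t iter/sec.

    Assumption: this is how the reported throughput relates to the timings.
    """
    return sum(1000.0 / t for t in timings_ms)


# 2048K FFT, 64 cores, 8 workers, copied from the benchmark in this post.
timings_2048k = [2.52, 2.58, 2.51, 2.48, 2.72, 2.79, 2.37, 2.45]
reported = 3142.62

estimate = throughput_iter_per_sec(timings_2048k)
print(f"estimated {estimate:.2f} iter/sec, reported {reported:.2f} iter/sec")
```

The estimate lands within about one iter/sec of the reported value; the small gap is what you would expect from the timings being rounded to two decimals.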
2022-01-14, 11:17  #860
Jun 2003
12473_{8} Posts 
Can you also run 4w (16t per worker)? For wavefront exponents, 8 workers will not fit entirely within L3, but 4 workers will.
Also, what is your RAM speed? EDIT: 4096K FFT and higher would suffice. Smaller ones will not see any benefit.
Last fiddled with by axn on 2022-01-14 at 11:18
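A rough way to see the L3 argument is sketched below. It assumes an FFT of length N occupies about N × 8 bytes (one double per element) and ignores any auxiliary tables, which is a simplification and not a statement about Prime95 internals. The cache figures come from the hwloc dump in post #859 (16 MB of L3 shared by each group of four cores); which FFT length wavefront exponents actually use is not stated in the thread, so the lengths here are just picked from the benchmark list.

```python
BYTES_PER_ELEMENT = 8   # assumption: one 8-byte double per FFT element
L3_PER_CCX_MB = 16      # hwloc dump: L3 (size=16384KB)
CORES_PER_CCX = 4       # hwloc dump: four cores listed under each L3
TOTAL_CORES = 64


def fft_footprint_mb(fft_length_k):
    """Approximate FFT data size in MB for a length given in K elements."""
    return fft_length_k * 1024 * BYTES_PER_ELEMENT / (1024 * 1024)


def l3_per_worker_mb(workers):
    """L3 available to one worker when the cores are split evenly."""
    cores_per_worker = TOTAL_CORES // workers
    return (cores_per_worker // CORES_PER_CCX) * L3_PER_CCX_MB


def fits_in_l3(fft_length_k, workers):
    return fft_footprint_mb(fft_length_k) <= l3_per_worker_mb(workers)


for workers in (8, 4):
    for fft_k in (4096, 4480, 6144):
        print(
            f"{workers} workers, {fft_k}K FFT: "
            f"{fft_footprint_mb(fft_k):.0f} MB data vs "
            f"{l3_per_worker_mb(workers)} MB L3 per worker, "
            f"fits={fits_in_l3(fft_k, workers)}"
        )
```

Under these assumptions a 4096K FFT sits right at the 32 MB that each of 8 workers gets, which lines up with the suggestion that only 4096K and larger are worth retesting with 4 workers.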
2022-01-14, 17:30  #861
"Jorge Coveiro"
Nov 2006
Moura, Portugal
2^{4}×3 Posts 
3990x (PBO disabled) + 128GB@3200MHz RAM
Quote:
Quote:
Quote:
Last fiddled with by JCoveiro on 2022-01-14 at 17:34

2022-01-15, 02:48  #862
Jun 2003
5×1,087 Posts 
OK. These definitely show better throughput than the 8w tests.
There is a chance that you could get slightly higher throughput by optimizing the Infinity Fabric speed. By default, IF runs at RAM speed, so 3200 RAM gives an IF speed of 3200 (or is it 1600? doesn't matter). Either running 3600 RAM or explicitly setting a higher IF speed in the BIOS may improve throughput slightly in this case.
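On the "3200 or 1600" question: DDR memory transfers data twice per clock, so a module rated at 3200 runs a 1600 MHz clock. The sketch below just does that arithmetic. It assumes the RAM is DDR4 and that the Infinity Fabric clock is coupled 1:1 to the memory clock, which is the usual default but is not confirmed for this particular system.

```python
def memory_clock_mhz(rated_speed):
    """Real memory clock for a DDR module: half the rated transfer speed."""
    return rated_speed / 2


for rated in (3200, 3600):
    mclk = memory_clock_mhz(rated)
    # Assumption: Infinity Fabric clock equals the memory clock (1:1 coupling).
    print(f"DDR4-{rated}: memory clock {mclk:.0f} MHz, IF clock {mclk:.0f} MHz if coupled 1:1")
```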
2022-01-15, 09:54  #863
Oct 2008
2×3×11 Posts 
Dell Optiplex 9020  i7 4770S & Dual channel 8GB 1600MHz DDR3 RAM
Code:
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz
CPU speed: 3249.62 MHz, 4 cores
CPU features: Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 4x32 KB, L2 cache size: 4x256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
Machine#0 (total=8049448KB, DMIProductName="OptiPlex 9020", DMIProductVersion=00, DMIBoardVendor="Dell Inc.", DMIBoardName=0XCR8D, DMIBoardVersion=A02, DMIBoardAssetTag=, DMIChassisVendor="Dell Inc.", DMIChassisType=15, DMIChassisVersion=, DMIChassisAssetTag=, DMIBIOSVendor="Dell Inc.", DMIBIOSVersion=A25, DMIBIOSDate=05/30/2019, DMISysVendor="Dell Inc.", Backend=Linux, LinuxCgroup=/, OSName=Linux, OSRelease=5.11.0-46-generic, OSVersion="#51~20.04.1-Ubuntu SMP Fri Jan 7 06:51:40 UTC 2022", HostName=optiplex, Architecture=x86_64, hwlocVersion=2.4.1, ProcessName=mprime)
  Package#0 (total=8049448KB, CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=60, CPUModel="Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz", CPUStepping=3)
    L3 (size=8192KB, linesize=64, ways=16, Inclusive=1)
      L2 (size=256KB, linesize=64, ways=8, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#0 (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
      L2 (size=256KB, linesize=64, ways=8, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#1 (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
      L2 (size=256KB, linesize=64, ways=8, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#2 (cpuset: 0x00000004)
            PU#2 (cpuset: 0x00000004)
      L2 (size=256KB, linesize=64, ways=8, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#3 (cpuset: 0x00000008)
            PU#3 (cpuset: 0x00000008)
Prime95 64-bit version 30.7, RdtscTiming=1
Timings for 2048K FFT length (4 cores, 1 worker): 3.77 ms. Throughput: 265.46 iter/sec.
Timings for 2048K FFT length (4 cores, 4 workers): 15.90, 16.26, 16.06, 16.34 ms. Throughput: 247.87 iter/sec.
Timings for 2304K FFT length (4 cores, 1 worker): 4.17 ms. Throughput: 239.79 iter/sec.
Timings for 2304K FFT length (4 cores, 4 workers): 17.41, 17.63, 17.62, 17.32 ms. Throughput: 228.68 iter/sec.
Timings for 2400K FFT length (4 cores, 1 worker): 4.42 ms. Throughput: 225.99 iter/sec.
[Sat Jan 15 20:29:15 2022]
Timings for 2400K FFT length (4 cores, 4 workers): 18.32, 18.71, 18.26, 18.29 ms. Throughput: 217.46 iter/sec.
Timings for 2560K FFT length (4 cores, 1 worker): 4.89 ms. Throughput: 204.31 iter/sec.
Timings for 2560K FFT length (4 cores, 4 workers): 20.08, 20.33, 20.46, 20.01 ms. Throughput: 197.85 iter/sec.
Timings for 2688K FFT length (4 cores, 1 worker): 4.98 ms. Throughput: 201.00 iter/sec.
Timings for 2688K FFT length (4 cores, 4 workers): 20.93, 20.74, 20.41, 20.97 ms. Throughput: 192.66 iter/sec.
Timings for 2880K FFT length (4 cores, 1 worker): 5.37 ms. Throughput: 186.09 iter/sec.
Timings for 2880K FFT length (4 cores, 4 workers): 22.26, 22.14, 21.79, 21.78 ms. Throughput: 181.88 iter/sec.
Timings for 3072K FFT length (4 cores, 1 worker): 6.00 ms. Throughput: 166.72 iter/sec.
Timings for 3072K FFT length (4 cores, 4 workers): 24.97, 24.33, 24.28, 24.22 ms. Throughput: 163.63 iter/sec.
Timings for 3200K FFT length (4 cores, 1 worker): 6.17 ms. Throughput: 162.12 iter/sec.
Timings for 3200K FFT length (4 cores, 4 workers): 24.67, 25.42, 24.61, 24.57 ms. Throughput: 161.21 iter/sec.
Timings for 3360K FFT length (4 cores, 1 worker): 6.67 ms. Throughput: 150.03 iter/sec.
Timings for 3360K FFT length (4 cores, 4 workers): 27.21, 27.14, 26.57, 26.59 ms. Throughput: 148.84 iter/sec.
Timings for 3456K FFT length (4 cores, 1 worker): 6.79 ms. Throughput: 147.33 iter/sec.
Timings for 3456K FFT length (4 cores, 4 workers): 27.87, 27.70, 27.14, 27.22 ms. Throughput: 145.56 iter/sec.
Timings for 3584K FFT length (4 cores, 1 worker): 7.09 ms. Throughput: 141.12 iter/sec.
Timings for 3584K FFT length (4 cores, 4 workers): 28.41, 28.91, 28.29, 28.54 ms. Throughput: 140.18 iter/sec.
Timings for 3840K FFT length (4 cores, 1 worker): 8.08 ms. Throughput: 123.84 iter/sec.
Timings for 3840K FFT length (4 cores, 4 workers): 31.00, 30.80, 30.25, 30.64 ms. Throughput: 130.42 iter/sec.
[Sat Jan 15 20:34:27 2022]
Timings for 4096K FFT length (4 cores, 1 worker): 8.81 ms. Throughput: 113.50 iter/sec.
Timings for 4096K FFT length (4 cores, 4 workers): 32.70, 33.73, 32.77, 32.57 ms. Throughput: 121.44 iter/sec.
Timings for 4480K FFT length (4 cores, 1 worker): 14.65 ms. Throughput: 68.24 iter/sec.
Timings for 4480K FFT length (4 cores, 4 workers): 34.71, 35.12, 34.58, 35.13 ms. Throughput: 114.67 iter/sec.
Timings for 4608K FFT length (4 cores, 1 worker): 13.56 ms. Throughput: 73.73 iter/sec.
Timings for 4608K FFT length (4 cores, 4 workers): 37.04, 38.02, 36.77, 37.02 ms. Throughput: 107.51 iter/sec.
Timings for 4800K FFT length (4 cores, 1 worker): 9.41 ms. Throughput: 106.24 iter/sec.
Timings for 4800K FFT length (4 cores, 4 workers): 37.53, 37.92, 37.33, 38.16 ms. Throughput: 106.01 iter/sec.
Timings for 5120K FFT length (4 cores, 1 worker): 10.37 ms. Throughput: 96.43 iter/sec.
Timings for 5120K FFT length (4 cores, 4 workers): 40.89, 41.97, 40.59, 40.64 ms. Throughput: 97.52 iter/sec.
Timings for 5376K FFT length (4 cores, 1 worker): 10.59 ms. Throughput: 94.46 iter/sec.
Timings for 5376K FFT length (4 cores, 4 workers): 42.21, 42.70, 42.66, 43.45 ms. Throughput: 93.56 iter/sec.
Timings for 5760K FFT length (4 cores, 1 worker): 14.74 ms. Throughput: 67.87 iter/sec.
Timings for 5760K FFT length (4 cores, 4 workers): 44.97, 45.84, 45.02, 46.07 ms. Throughput: 87.96 iter/sec.
Timings for 6144K FFT length (4 cores, 1 worker): 12.22 ms. Throughput: 81.80 iter/sec.
Timings for 6144K FFT length (4 cores, 4 workers): 48.41, 48.83, 48.20, 49.13 ms. Throughput: 82.24 iter/sec.
Timings for 6400K FFT length (4 cores, 1 worker): 12.68 ms. Throughput: 78.87 iter/sec.
Timings for 6400K FFT length (4 cores, 4 workers): 51.00, 50.75, 50.45, 50.41 ms. Throughput: 78.98 iter/sec.
[Sat Jan 15 20:39:40 2022]
Timings for 6720K FFT length (4 cores, 1 worker): 13.77 ms. Throughput: 72.61 iter/sec.
Timings for 6720K FFT length (4 cores, 4 workers): 55.15, 55.34, 54.19, 54.16 ms. Throughput: 73.12 iter/sec.
Timings for 6912K FFT length (4 cores, 1 worker): 13.71 ms. Throughput: 72.96 iter/sec.
Timings for 6912K FFT length (4 cores, 4 workers): 54.72, 55.26, 53.69, 54.40 ms. Throughput: 73.38 iter/sec.
Timings for 7168K FFT length (4 cores, 1 worker): 14.85 ms. Throughput: 67.36 iter/sec.
Timings for 7168K FFT length (4 cores, 4 workers): 58.60, 59.39, 58.57, 58.74 ms. Throughput: 68.00 iter/sec.
Timings for 7680K FFT length (4 cores, 1 worker): 15.65 ms. Throughput: 63.89 iter/sec.
Timings for 7680K FFT length (4 cores, 4 workers): 62.30, 62.28, 61.19, 61.77 ms. Throughput: 64.64 iter/sec.
Timings for 8064K FFT length (4 cores, 1 worker): 16.54 ms. Throughput: 60.48 iter/sec.
Timings for 8064K FFT length (4 cores, 4 workers): 66.25, 66.41, 65.35, 65.21 ms. Throughput: 60.79 iter/sec.
Timings for 8192K FFT length (4 cores, 1 worker): 17.86 ms. Throughput: 55.98 iter/sec.
Timings for 8192K FFT length (4 cores, 4 workers): 69.37, 70.79, 68.96, 69.17 ms. Throughput: 57.50 iter/sec.
2022-04-18, 18:45  #864
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
5^{2}·211 Posts 
Are these inconsistencies typical or "odd"
=== Small FFTs: 8/2 slightly better
Timings for 56K FFT length (8 cores, 1 worker): 0.10 ms. Throughput: 9976.49 iter/sec.
Timings for 56K FFT length (8 cores, 2 workers): 0.08, 0.08 ms. Throughput: 25139.32 iter/sec.
Timings for 84K FFT length (8 cores, 1 worker): 0.12 ms. Throughput: 8448.03 iter/sec.
Timings for 84K FFT length (8 cores, 2 workers): 0.10, 0.11 ms. Throughput: 18867.62 iter/sec.
Timings for 200K FFT length (8 cores, 1 worker): 0.26 ms. Throughput: 3881.58 iter/sec.
Timings for 200K FFT length (8 cores, 2 workers): 0.24, 0.23 ms. Throughput: 8408.37 iter/sec.
Timings for 240K FFT length (8 cores, 1 worker): 0.27 ms. Throughput: 3741.51 iter/sec.
Timings for 240K FFT length (8 cores, 2 workers): 0.25, 0.25 ms. Throughput: 8023.39 iter/sec.
=== 3 of the next 4 FFTs show 8/1 MUCH better; the other is more than 2x worse
Timings for 280K FFT length (8 cores, 1 worker): 0.16 ms. Throughput: 6246.39 iter/sec.
Timings for 280K FFT length (8 cores, 2 workers): 0.23, 0.23 ms. Throughput: 8765.58 iter/sec.
Timings for 288K FFT length (8 cores, 1 worker): 0.16 ms. Throughput: 6282.95 iter/sec.
Timings for 288K FFT length (8 cores, 2 workers): 0.23, 0.23 ms. Throughput: 8608.45 iter/sec.
Timings for 300K FFT length (8 cores, 1 worker): 0.36 ms. Throughput: 2795.59 iter/sec.
Timings for 300K FFT length (8 cores, 2 workers): 0.34, 0.34 ms. Throughput: 5864.26 iter/sec.
Timings for 320K FFT length (8 cores, 1 worker): 0.17 ms. Throughput: 5715.57 iter/sec.
Timings for 320K FFT length (8 cores, 2 workers): 0.26, 0.26 ms. Throughput: 7777.33 iter/sec.
=== These are very close
Timings for 336K FFT length (8 cores, 1 worker): 0.36 ms. Throughput: 2807.85 iter/sec.
Timings for 336K FFT length (8 cores, 2 workers): 0.34, 0.35 ms. Throughput: 5855.84 iter/sec.
Timings for 360K FFT length (8 cores, 1 worker): 0.37 ms. Throughput: 2675.34 iter/sec.
Timings for 360K FFT length (8 cores, 2 workers): 0.38, 0.36 ms. Throughput: 5397.72 iter/sec.
Timings for 384K FFT length (8 cores, 1 worker): 0.42 ms. Throughput: 2393.08 iter/sec.
Timings for 384K FFT length (8 cores, 2 workers): 0.39, 0.39 ms. Throughput: 5190.01 iter/sec.
Timings for 392K FFT length (8 cores, 1 worker): 0.36 ms. Throughput: 2788.58 iter/sec.
Timings for 392K FFT length (8 cores, 2 workers): 0.37, 0.39 ms. Throughput: 5314.79 iter/sec.
=== Except here 8/1 better again
Timings for 400K FFT length (8 cores, 1 worker): 0.28 ms. Throughput: 3591.48 iter/sec.
Timings for 400K FFT length (8 cores, 2 workers): 0.35, 0.35 ms. Throughput: 5767.16 iter/sec.
=== Even again
Timings for 420K FFT length (8 cores, 1 worker): 0.38 ms. Throughput: 2627.43 iter/sec.
Timings for 420K FFT length (8 cores, 2 workers): 0.39, 0.41 ms. Throughput: 4998.83 iter/sec.
=== 8/2 better again
Timings for 432K FFT length (8 cores, 1 worker): 0.45 ms. Throughput: 2231.70 iter/sec.
Timings for 432K FFT length (8 cores, 2 workers): 0.43, 0.43 ms. Throughput: 4698.37 iter/sec.
Timings for 448K FFT length (8 cores, 1 worker): 0.48 ms. Throughput: 2069.15 iter/sec.
Timings for 448K FFT length (8 cores, 2 workers): 0.43, 0.43 ms. Throughput: 4605.74 iter/sec.
Timings for 480K FFT length (8 cores, 1 worker): 0.47 ms. Throughput: 2136.79 iter/sec.
Timings for 480K FFT length (8 cores, 2 workers): 0.43, 0.45 ms. Throughput: 4569.94 iter/sec.
Timings for 504K FFT length (8 cores, 1 worker): 0.56 ms. Throughput: 1778.19 iter/sec.
Timings for 504K FFT length (8 cores, 2 workers): 0.51, 0.48 ms. Throughput: 4021.12 iter/sec.
Timings for 512K FFT length (8 cores, 1 worker): 0.46 ms. Throughput: 2168.60 iter/sec.
Timings for 512K FFT length (8 cores, 2 workers): 0.49, 0.48 ms. Throughput: 4106.95 iter/sec.
=== Until here ... now 8/1 is better
Timings for 560K FFT length (8 cores, 1 worker): 0.39 ms. Throughput: 2564.46 iter/sec.
Timings for 560K FFT length (8 cores, 2 workers): 0.47, 0.50 ms. Throughput: 4116.87 iter/sec.
Timings for 576K FFT length (8 cores, 1 worker): 0.50 ms. Throughput: 2005.67 iter/sec.
Timings for 576K FFT length (8 cores, 2 workers): 0.52, 0.52 ms. Throughput: 3817.62 iter/sec.
Timings for 588K FFT length (8 cores, 1 worker): 0.52 ms. Throughput: 1937.06 iter/sec.
Timings for 588K FFT length (8 cores, 2 workers): 0.54, 0.54 ms. Throughput: 3704.54 iter/sec.
Timings for 600K FFT length (8 cores, 1 worker): 0.44 ms. Throughput: 2267.61 iter/sec.
Timings for 600K FFT length (8 cores, 2 workers): 0.49, 0.49 ms. Throughput: 4079.12 iter/sec.
Last fiddled with by petrw1 on 20220418 at 18:48
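For anyone who wants to check the 8/1 versus 8/2 comparison mechanically, here is a minimal Python sketch that parses "Timings for ..." lines and reports two ratios per FFT length: total throughput with 2 workers relative to 1 worker, and per-iteration time of the slowest of the 2 workers relative to the single worker. The function names and the choice of ratios are my own assumptions, not anything Prime95 provides. A throughput ratio well under 2 combined with a latency ratio well over 1 is what the "8/1 better" lengths look like.

```python
import re
from collections import defaultdict

# A few lines copied from the benchmark output above.
SAMPLE = """\
Timings for 280K FFT length (8 cores, 1 worker): 0.16 ms. Throughput: 6246.39 iter/sec.
Timings for 280K FFT length (8 cores, 2 workers): 0.23, 0.23 ms. Throughput: 8765.58 iter/sec.
Timings for 300K FFT length (8 cores, 1 worker): 0.36 ms. Throughput: 2795.59 iter/sec.
Timings for 300K FFT length (8 cores, 2 workers): 0.34, 0.34 ms. Throughput: 5864.26 iter/sec.
"""

LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores?, (\d+) workers?\): "
    r"([\d., ]+) ms\.\s+Throughput: ([\d.]+) iter/sec\."
)


def parse_timings(text):
    """Return {fft_k: {workers: (times_ms, throughput)}} for every matching line."""
    results = defaultdict(dict)
    for m in LINE_RE.finditer(text):
        fft_k, workers = int(m.group(1)), int(m.group(3))
        times = [float(t) for t in m.group(4).split(",")]
        results[fft_k][workers] = (times, float(m.group(5)))
    return dict(results)


def compare(results):
    """Return (fft_k, throughput ratio 2w/1w, latency ratio 2w/1w) per FFT length."""
    rows = []
    for fft_k in sorted(results):
        by_workers = results[fft_k]
        if 1 in by_workers and 2 in by_workers:
            times1, thr1 = by_workers[1]
            times2, thr2 = by_workers[2]
            rows.append((fft_k, thr2 / thr1, max(times2) / times1[0]))
    return rows


for fft_k, thr_ratio, lat_ratio in compare(parse_timings(SAMPLE)):
    print(f"{fft_k}K: throughput x{thr_ratio:.2f}, per-iteration time x{lat_ratio:.2f}")
```

Pasting the full benchmark text into `SAMPLE` should give one row per FFT length that has both a 1-worker and a 2-worker line.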
20220418, 22:09  #865  
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
5^{2}×211 Posts 
Quote:
My 5.0M P-1 uses the 280K FFT (a FAST one) for stage 1, which takes 17 minutes, and the 300K FFT (a SLOW one) for stage 2, which takes 28 minutes. If I could get it to use 320K (a FAST one) for stage 2, it would actually run in about half the time.

20220419, 14:17  #866  
P90 years forever!
Aug 2002
Yeehaw, FL
2·5^{2}·163 Posts 
Quote:
Pminus1=FFT2=320K,1,2,70177,1,3000000000,3000000000,117

BTW, what is your CPU?

P.S. Stage 2 won't get twice as fast, as polynomial multiplication time dominates the stage 2 cost. It should help in the polynomial preparation time. Please let us know your results.

Last fiddled with by Prime95 on 20220419 at 14:20
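To put rough numbers on why stage 2 should not be expected to halve, here is a small Amdahl-style sketch in Python. It assumes only some fraction of the stage 2 time scales with the per-iteration FFT time, and that the rest (the polynomial multiplication) is unaffected by the FFT change. That fraction is unknown, so the values tried below are purely hypothetical. The 28 minutes and the 0.36 ms / 0.17 ms one-worker timings for 300K / 320K are taken from the posts earlier in the thread.

```python
def stage2_estimate(minutes_now, fft_ms_now, fft_ms_new, fft_fraction):
    """Estimate stage 2 time after an FFT length change (Amdahl-style model).

    fft_fraction is the ASSUMED share of stage 2 time that scales with the
    per-iteration FFT time; the remainder is treated as unchanged.
    """
    if not 0.0 <= fft_fraction <= 1.0:
        raise ValueError("fft_fraction must be between 0 and 1")
    speedup = fft_ms_now / fft_ms_new
    return minutes_now * ((1.0 - fft_fraction) + fft_fraction / speedup)


# Hypothetical fractions; 1.0 is the best case where all of stage 2 scales.
for fraction in (0.25, 0.5, 1.0):
    estimate = stage2_estimate(28, 0.36, 0.17, fraction)
    print(f"fraction {fraction:.2f}: about {estimate:.1f} minutes")
```

Only the best case, where all of stage 2 scales with the FFT time, gets close to half; any smaller fraction lands between that and the original 28 minutes.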

20220419, 16:38  #867  
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
5^{2}·211 Posts 
Quote:
With your suggestion it used 320K for both stage 1 and stage 2. Both stages now take about 18 minutes. Without it, the FFT lengths are 280K / 300K and the times are 17 / 29 minutes, so this is better overall.

On a related topic: when I run the benchmark, it seems to have no problem multithreading even on very small FFTs. But P-1 does NOT multithread with very small FFTs ... something like under 40K.
Code:
[Work thread Apr 19 10:26] Benchmarking multiple workers to measure the impact of memory bandwidth
[Work thread Apr 19 10:26] Timing 16K FFT, 8 cores, 1 worker. Average times: 0.03 ms. Total throughput: 36090.04 iter/sec.
[Work thread Apr 19 10:27] Timing 16K FFT, 8 cores, 2 workers. Average times: 0.03, 0.03 ms. Total throughput: 70720.87 iter/sec.
[Work thread Apr 19 10:27] Timing 16K FFT, 8 cores, 4 workers. Average times: 0.03, 0.03, 0.03, 0.03 ms. Total throughput: 141805.81 iter/sec.
[Work thread Apr 19 10:27] Timing 16K FFT, 8 cores, 8 workers. Average times: 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03 ms. Total throughput: 282437.98 iter/sec.
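One way to read the 16K numbers above is to divide the total throughput by the number of workers. The Python sketch below does that, assuming the benchmark splits the 8 cores evenly among the workers (so 8, 4, 2, and 1 cores per worker); that even split is my assumption and is not stated in the output. If the per-worker rate barely changes as each worker gets fewer cores, that would be consistent with the extra threads contributing little at this FFT length.

```python
# (workers, total throughput in iter/sec) for the 16K FFT on 8 cores,
# copied from the benchmark output above.
RESULTS_16K = [(1, 36090.04), (2, 70720.87), (4, 141805.81), (8, 282437.98)]


def per_worker_stats(results, total_cores=8):
    """Return (workers, cores per worker, iter/sec per worker, ms per iteration).

    Cores per worker assumes an even split of total_cores among the workers.
    """
    rows = []
    for workers, throughput in results:
        per_worker = throughput / workers
        rows.append((workers, total_cores // workers, per_worker, 1000.0 / per_worker))
    return rows


for workers, cores, rate, ms in per_worker_stats(RESULTS_16K):
    print(f"{workers} worker(s) x {cores} core(s): {rate:9.1f} iter/sec each, {ms:.4f} ms/iter")
```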

20220419, 22:36  #868  
P90 years forever!
Aug 2002
Yeehaw, FL
1111111010110_{2} Posts 
Quote:
I am very curious why the default implementation of 300K FFT is so poor on your machine.
Quote:


20220420, 18:31  #869  
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
5^{2}·211 Posts 
Quote:
Prime95 64bit version 30.8, RdtscTiming=1
Timing FFTs using 8 cores.
Best time for 12K FFT length: 0.019 ms., avg: 0.020 ms.
Best time for 15K FFT length: 0.026 ms., avg: 0.026 ms.
Best time for 16K FFT length: 0.027 ms., avg: 0.027 ms.
Best time for 18K FFT length: 0.031 ms., avg: 0.031 ms.
Best time for 20K FFT length: 0.035 ms., avg: 0.035 ms.
Best time for 24K FFT length: 0.041 ms., avg: 0.042 ms.
Best time for 25K FFT length: 0.047 ms., avg: 0.050 ms.
Best time for 30K FFT length: 0.055 ms., avg: 0.058 ms.
Best time for 32K FFT length: 0.054 ms., avg: 0.056 ms.
Best time for 35K FFT length: 0.068 ms., avg: 0.069 ms.
Best time for 36K FFT length: 0.067 ms., avg: 0.070 ms.
Best time for 40K FFT length: 0.075 ms., avg: 0.081 ms.
Best time for 42K FFT length: 0.079 ms., avg: 0.082 ms.
Best time for 48K FFT length: 0.091 ms., avg: 0.123 ms. <=
Best time for 56K FFT length: 0.097 ms., avg: 0.104 ms.
Best time for 60K FFT length: 0.107 ms., avg: 0.119 ms.
Best time for 64K FFT length: 0.116 ms., avg: 0.128 ms.
Best time for 72K FFT length: 0.111 ms., avg: 0.119 ms.
Best time for 80K FFT length: 0.119 ms., avg: 0.128 ms.
Best time for 84K FFT length: 0.113 ms., avg: 0.124 ms.
Best time for 96K FFT length: 0.109 ms., avg: 0.120 ms.
Best time for 120K FFT length: 0.112 ms., avg: 0.123 ms.
Best time for 128K FFT length: 0.125 ms., avg: 0.141 ms. <=
Best time for 144K FFT length: 0.113 ms., avg: 0.121 ms.
Best time for 192K FFT length: 0.117 ms., avg: 0.125 ms.
Best time for 200K FFT length: 0.290 ms., avg: 0.301 ms. <=
Best time for 240K FFT length: 0.303 ms., avg: 0.313 ms. <=
Best time for 280K FFT length: 0.150 ms., avg: 0.196 ms.
Best time for 288K FFT length: 0.147 ms., avg: 0.159 ms.
Best time for 300K FFT length: 0.366 ms., avg: 0.380 ms. <=
Best time for 320K FFT length: 0.169 ms., avg: 0.178 ms.
Best time for 336K FFT length: 0.368 ms., avg: 0.381 ms. <=
Best time for 360K FFT length: 0.360 ms., avg: 0.373 ms.
Best time for 384K FFT length: 0.382 ms., avg: 0.401 ms.
Best time for 392K FFT length: 0.343 ms., avg: 0.357 ms.
Best time for 400K FFT length: 0.298 ms., avg: 0.309 ms.
Best time for 420K FFT length: 0.396 ms., avg: 0.408 ms.
Best time for 432K FFT length: 0.416 ms., avg: 0.437 ms.
Best time for 448K FFT length: 0.437 ms., avg: 0.453 ms.
Best time for 480K FFT length: 0.458 ms., avg: 0.479 ms.
Best time for 504K FFT length: 0.593 ms., avg: 0.612 ms. <=
Best time for 512K FFT length: 0.437 ms., avg: 0.457 ms.
Best time for 560K FFT length: 0.361 ms., avg: 0.380 ms.
Best time for 576K FFT length: 0.438 ms., avg: 0.466 ms.
Best time for 588K FFT length: 0.496 ms., avg: 0.521 ms.
Best time for 600K FFT length: 0.496 ms., avg: 0.513 ms.
redo to 6000K
Best time for 600K FFT length: 0.388 ms., avg: 0.440 ms. <= reasonably less this time
Best time for 640K FFT length: 0.546 ms., avg: 0.566 ms.
Best time for 672K FFT length: 0.665 ms., avg: 0.694 ms.
Best time for 720K FFT length: 0.564 ms., avg: 0.581 ms.
Best time for 768K FFT length: 0.311 ms., avg: 0.358 ms. <=
Best time for 800K FFT length: 0.340 ms., avg: 0.423 ms.
Best time for 840K FFT length: 0.542 ms., avg: 0.600 ms.
Best time for 864K FFT length: 0.364 ms., avg: 0.441 ms. <=
Best time for 896K FFT length: 0.758 ms., avg: 0.817 ms. <=
Best time for 960K FFT length: 0.376 ms., avg: 0.455 ms.
Best time for 1000K FFT length: 0.438 ms., avg: 0.521 ms.
Best time for 1008K FFT length: 0.402 ms., avg: 0.521 ms.
Best time for 1024K FFT length: 0.415 ms., avg: 0.500 ms.
Best time for 1152K FFT length: 0.429 ms., avg: 0.555 ms.
Best time for 1200K FFT length: 0.515 ms., avg: 0.630 ms.
Best time for 1280K FFT length: 0.496 ms., avg: 0.526 ms.
Best time for 1344K FFT length: 0.525 ms., avg: 0.612 ms.
Best time for 1440K FFT length: 0.524 ms., avg: 0.635 ms.
Best time for 1536K FFT length: 0.543 ms., avg: 0.659 ms.
Best time for 1600K FFT length: 0.660 ms., avg: 0.771 ms.
Best time for 1728K FFT length: 0.634 ms., avg: 0.750 ms.
Best time for 1800K FFT length: 0.739 ms., avg: 0.876 ms.
Best time for 1920K FFT length: 0.699 ms., avg: 0.784 ms. <=
Best time for 1960K FFT length: 0.803 ms., avg: 0.988 ms.
Best time for 2048K FFT length: 0.811 ms., avg: 0.923 ms.
Best time for 2100K FFT length: 0.861 ms., avg: 0.964 ms.
Best time for 2160K FFT length: 0.910 ms., avg: 0.969 ms.
Best time for 2304K FFT length: 0.874 ms., avg: 1.090 ms.
Best time for 2400K FFT length: 1.038 ms., avg: 1.204 ms.
Best time for 2520K FFT length: 1.052 ms., avg: 1.213 ms.
Best time for 2560K FFT length: 1.115 ms., avg: 1.210 ms.
Best time for 2592K FFT length: 1.132 ms., avg: 1.242 ms.
Best time for 2688K FFT length: 1.222 ms., avg: 1.422 ms.
Best time for 2880K FFT length: 1.288 ms., avg: 1.369 ms.
Best time for 2940K FFT length: 1.377 ms., avg: 1.543 ms.
Best time for 3000K FFT length: 1.405 ms., avg: 1.493 ms.
Best time for 3072K FFT length: 1.357 ms., avg: 1.479 ms.
Best time for 3136K FFT length: 1.548 ms., avg: 1.708 ms. <=
Best time for 3200K FFT length: 1.574 ms., avg: 1.705 ms.
Best time for 3360K FFT length: 1.613 ms., avg: 1.716 ms.
Best time for 3456K FFT length: 1.811 ms., avg: 1.965 ms.
Best time for 3600K FFT length: 1.826 ms., avg: 1.998 ms.
Best time for 3840K FFT length: 1.999 ms., avg: 2.110 ms.
Best time for 3920K FFT length: 2.072 ms., avg: 2.169 ms.
Best time for 4032K FFT length: 2.056 ms., avg: 2.126 ms.
Best time for 4200K FFT length: 2.196 ms., avg: 2.316 ms.
Best time for 4320K FFT length: 2.297 ms., avg: 2.395 ms.
Best time for 4480K FFT length: 2.427 ms., avg: 2.528 ms.
Best time for 4608K FFT length: 2.508 ms., avg: 2.651 ms.
Best time for 4704K FFT length: 2.549 ms., avg: 2.640 ms.
Best time for 4800K FFT length: 2.891 ms., avg: 3.024 ms. <=
Best time for 5040K FFT length: 2.695 ms., avg: 2.817 ms.
Best time for 5120K FFT length: 2.826 ms., avg: 2.949 ms.
Best time for 5184K FFT length: 2.950 ms., avg: 3.072 ms.
Best time for 5376K FFT length: 2.979 ms., avg: 3.124 ms.
Best time for 5760K FFT length: 3.557 ms., avg: 3.805 ms.
Last fiddled with by petrw1 on 20220420 at 18:36 Reason: part 2
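The "<=" marks above were added by hand. As a rough cross-check, the Python sketch below parses "Best time for ..." lines and flags any FFT length whose best time is more than 10% slower than some larger FFT length. Both the 10% margin and the function names are arbitrary choices of mine, so it will not necessarily reproduce every hand-placed mark.

```python
import re

# A slice of the timing output above.
SAMPLE = """\
Best time for 192K FFT length: 0.117 ms., avg: 0.125 ms.
Best time for 200K FFT length: 0.290 ms., avg: 0.301 ms.
Best time for 240K FFT length: 0.303 ms., avg: 0.313 ms.
Best time for 280K FFT length: 0.150 ms., avg: 0.196 ms.
Best time for 288K FFT length: 0.147 ms., avg: 0.159 ms.
Best time for 300K FFT length: 0.366 ms., avg: 0.380 ms.
Best time for 320K FFT length: 0.169 ms., avg: 0.178 ms.
"""

BEST_RE = re.compile(r"Best time for (\d+)K FFT length: ([\d.]+) ms\., avg: ([\d.]+) ms\.")


def parse_best_times(text):
    """Return a list of (fft_k, best_ms, avg_ms) tuples."""
    return [(int(k), float(best), float(avg)) for k, best, avg in BEST_RE.findall(text)]


def slower_than_larger(entries, margin=1.10):
    """Return (fft_k, best_ms, faster_k, faster_ms) for lengths beaten by a larger FFT."""
    flagged = []
    best_larger = None  # (best_ms, fft_k) of the fastest larger length seen so far
    # Walk from the largest FFT down so "larger lengths" are already known.
    for fft_k, best, _avg in sorted(entries, reverse=True):
        if best_larger is not None and best > margin * best_larger[0]:
            flagged.append((fft_k, best, best_larger[1], best_larger[0]))
        if best_larger is None or best < best_larger[0]:
            best_larger = (best, fft_k)
    flagged.reverse()
    return flagged


for fft_k, best, faster_k, faster_ms in slower_than_larger(parse_best_times(SAMPLE)):
    print(f"{fft_k}K at {best:.3f} ms is slower than {faster_k}K at {faster_ms:.3f} ms")
```

On this slice it flags 200K, 240K, and 300K, which are the same lengths marked by hand in that range.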
