2020-11-15, 17:12  #1
Nov 2020
2_{16} Posts 
Interpretation of results
Hi,
I ran a benchmark test and am having trouble interpreting it because I don't know what the different values mean. This is one row of my output: Quote:
What do the milliseconds mean? 

2020-11-15, 18:17  #2
6809 > 6502
Aug 2003
101×103 Posts
2·4,787 Posts 
The bigger the FFT size, the more work the processor needs to do per iteration.
The milliseconds are the time per iteration. The smaller this value, the faster the testing.
2020-11-15, 18:53  #3
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
47×109 Posts 
Quote:
1000/18.79 = 53.22 iter/sec
1000/19.54 = 51.18
1000/18.55 = 53.91
1000/18.68 = 53.53
1000/18.50 = 54.05
1000/18.08 = 55.31
Sum of six = 321.20 iter/sec (0.01/sec difference is probably due to 2-digit roundoff)

Work required for an iteration is roughly exponent * log(exponent) * log(log(exponent)), and fft length is a nearly linear function of exponent, while the processor's rate of work is fairly constant. See for example the last two attachments of https://www.mersenneforum.org/showpo...19&postcount=5, right columns; constant within +20% over 2M-64M fft length. (Numerous processor types have been exhaustively benchmarked and posted in that thread.)

Large multiprecision multiplication is so for some rather fundamental reasons; see Donald Knuth, Seminumerical Algorithms or https://www.mersenneforum.org/showpo...21&postcount=7

If it still doesn't make sense that iteration time depends on fft length or exponent, time yourself squaring a one-digit decimal number, a 4-digit number, and a 10-digit number.

What is most efficient on a given system depends on system and processor details and fft length. The optimal number of workers can change versus exponent or fft length. Dual-Xeon systems do MUCH better with 2 workers or more than with one; single-worker throughput on the Knights Landing I'm benchmarking now is positively dreadful (less than 10% of maximum in some fft lengths). Hyperthreading usually is not an advantage in fft-based multiplication, but in some cases provides an advantage. Benchmarking them is the right thing to do.

Welcome to the forum. And the learning curve.

Last fiddled with by kriesel on 2020-11-15 at 19:54
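For anyone who wants to redo the arithmetic above on their own benchmark rows, here is a minimal Python sketch. It converts per-worker iteration times (ms/iter) into iterations per second and sums them, and it also evaluates the rough work estimate exponent * log(exponent) * log(log(exponent)). The function names `throughput` and `relative_work` are made up for illustration and are not part of Prime95; the six timings are the ones quoted in the post.

```python
import math

def throughput(ms_per_iter):
    """Return (per-worker iter/sec list, total iter/sec) from ms/iter timings."""
    rates = [1000.0 / ms for ms in ms_per_iter]
    return rates, sum(rates)

def relative_work(exponent):
    """Rough work per iteration: p * log(p) * log(log(p)), in arbitrary units."""
    return exponent * math.log(exponent) * math.log(math.log(exponent))

# Timings from the benchmark row discussed above (six workers).
timings_ms = [18.79, 19.54, 18.55, 18.68, 18.50, 18.08]
rates, total = throughput(timings_ms)

for ms, rate in zip(timings_ms, rates):
    print(f"{ms:6.2f} ms/iter -> {rate:6.2f} iter/sec")
print(f"total: {total:.2f} iter/sec")

# Hypothetical exponents, only to show how the estimate scales.
ratio = relative_work(200_000_000) / relative_work(100_000_000)
print(f"doubling the exponent costs about {ratio:.2f}x the work per iteration")
```

The ratio comes out a little above 2, which is the "slightly worse than linear" growth the formula describes; it is only a scaling estimate and says nothing about absolute timings on a particular machine.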

2020-11-15, 18:56  #4
Sep 2017
USA
5·47 Posts 
The "best" setting based on your benchmarks is (6 cores, 1 worker) because it has the most throughput measured by iterations per second.
Note 1: Running multiple workers usually won't help because the fast Fourier transform (FFT) data for candidate exponents is too big to fit into the processor's cache. RAM is needed to hold the data, so RAM speed often becomes the limiting factor instead of processor speed. Adding more workers only makes the RAM bottleneck worse.
Note 2: Hyperthreading basically means the operating system can schedule two tasks on each core, because there is usually idle time within any single task. This does not help Prime95, since its task already fully utilizes the core.
2020-11-15, 19:17  #5
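To put a rough number on Note 1, here is a small Python sketch comparing an FFT's data size with a cache size. It assumes 8 bytes per FFT element (one double-precision value), which gives only a lower bound since weights and other tables are ignored. The FFT lengths and the 16 MiB cache are hypothetical values, not taken from the benchmark in this thread, and the function names are made up for illustration.

```python
def fft_working_set_mib(fft_length, bytes_per_element=8):
    """Lower-bound data size in MiB, assuming one 8-byte double per element."""
    return fft_length * bytes_per_element / 2**20

def fits_in_cache(fft_length, cache_mib, workers=1):
    """True if all workers' FFT data would fit in the cache at the same time."""
    return workers * fft_working_set_mib(fft_length) <= cache_mib

# Hypothetical example: a 5632K FFT on a CPU with 16 MiB of shared cache.
fft_length = 5632 * 1024
print(f"{fft_working_set_mib(fft_length):.1f} MiB of FFT data per worker")
print("fits in 16 MiB cache:", fits_in_cache(fft_length, cache_mib=16))
```

Under these assumptions a single worker already needs more than the whole cache, so every additional worker adds more traffic to RAM.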
Nov 2020
2 Posts 
Thanks for your explanation.

2020-11-17, 23:09  #6
Feb 2008
Bray, Ireland
151 Posts 
That's some cool homework you got.
