mersenneforum.org  

Old 2017-06-21, 04:28   #1
storm5510
Random Account
 
Aug 2009

3577₈ Posts
CPU Worker Windows

With this new i7-7700 CPU, Prime95 is only allowing one worker window spread across four threads. The other four threads are not being used. When I installed Prime95, I only included the application file itself. No local.txt or prime.txt. I didn't want to cause problems by using ones from a totally different system. Prime95 created new ones.

I browsed undoc.txt and added the three lines below into local.txt:

Code:
NumCPUs=4
CpuNumHyperthreads=1
CpuSpeed=3978
With this addition, I now have one worker and three helpers. Four other threads remain unused.

Code:
Setting affinity to run worker on CPU core #0.
Setting affinity to run helper thread 1 on CPU core #1
Setting affinity to run helper thread 2 on CPU core #2
Setting affinity to run helper thread 3 on CPU core #3
The only other addition I made was to add "CumulativeTiming=1" to prime.txt.

I'm not sure what else to do. What I was hoping for was four workers and four helpers. I must have really fudged something! Thoughts and ideas, please...

Thanks!



Note: Windows SmartScreen blocks Prime95 v29.2, Build 2, from running on Windows 10 Pro x64, at least for me. This is a clean install, so it may not have the updates it needs yet.
Old 2017-06-21, 07:01   #2
thyw
 
Feb 2016
! North_America

79 Posts

I think you should set CpuNumHyperthreads to 2, since the i7-7700 is a hyperthreaded CPU.
(Also, P95 should have figured out these settings automatically on first startup. By including this information you're overriding P95's CPU detection.
"If the program did not correctly figure out your CPU information, you can override the info in local.txt:")

Edit: in versions before v29 (!), it looks like, to create 4 worker windows (each with 1 main and 1 helper thread, logically on separate cores per worker),
you would want to write:
Code:
WorkerThreads=4
ThreadsPerTest=2 //this is global; you could include this in each "worker block" for different values
[Worker #1]
Affinity=0
[Worker #2]
Affinity=2
[Worker #3]
Affinity=4
[Worker #4]
Affinity=6
PS: I'm not sure about these:
Wouldn't many (4) P-1 (really memory intensive) workers be strongly bottlenecked by memory?
Does HT benefit P-1?
In the new undoc.txt (v29), Affinity has main and aux (helper?) threads split across multiple CPUs. How should it be used?
Also, is 8 GB of RAM enough for 4 P-1 jobs? (You can include MaxHighMemWorkers=n, but that means waiting.)

Last fiddled with by thyw on 2017-06-21 at 07:29
Old 2017-06-21, 11:50   #3
GP2
 
Sep 2003

13·199 Posts

Quote:
Originally Posted by thyw
Code:
WorkerThreads=4
ThreadsPerTest=2 //this is global, you could include this in each "worker block" for
Version 29 uses "CoresPerTest" rather than "ThreadsPerTest" as previous versions did. It gets automatically converted when you run it, however.
Old 2017-06-21, 14:17   #4
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

3²×823 Posts

You'll find that using the hyperthreads will make your CPU run hotter and *reduce* your throughput. The benchmark you posted shows this.

In v29, the worker windows menu choice lets you choose the number of workers and how many CPU cores each worker will use. There is also a checkbox if you want LL tests to use hyperthreading on those cores (not recommended).

The menus should provide access to prime95's important features. You only need to dive into undoc.txt if you want to do strange stuff.
Old 2017-06-21, 17:17   #5
storm5510
Random Account
 
Aug 2009

19×101 Posts

With a little experimentation, I now have it running two workers on eight threads. I was not sure how far I could go with this...

Code:
CoresPerTest=4
WorkerThreads=2
I don't know if this was the correct way, but it 'appears' to function properly. Core temps are running in the mid-70s. The PSU is pulling 200W from the service. If it goes higher, my APC unit starts to whistle and the light changes to yellow.

Below are two snips from the Task Manager.
Attached Thumbnails: cpu_snip.JPG, memory_snip.JPG

Last fiddled with by storm5510 on 2017-06-21 at 17:19
Old 2017-06-21, 18:36   #6
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

5563₈ Posts

Four threads will fully utilize the CPU though. You're making more heat for no benefit.
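
A rough way to see why: assuming, as described above for v29, that CoresPerTest counts physical cores, the number of cores a layout asks for is simply WorkerThreads × CoresPerTest. The sketch below is not part of Prime95. The function name is made up for illustration, and the default of 4 reflects the i7-7700's four physical cores.

```python
def cores_requested(worker_threads, cores_per_test, physical_cores=4):
    """Return (cores requested, whether that exceeds the physical cores).

    Assumption: CoresPerTest counts physical cores, as described for v29
    earlier in the thread. physical_cores=4 matches the i7-7700.
    """
    requested = worker_threads * cores_per_test
    return requested, requested > physical_cores


# Layout from post #5: WorkerThreads=2, CoresPerTest=4
print(cores_requested(2, 4))  # (8, True): asks for 8 cores on a 4-core CPU

# Two hypothetical layouts that fit within four physical cores
print(cores_requested(2, 2))  # (4, False)
print(cores_requested(1, 4))  # (4, False)
```

Under that assumption, anything above four cores in total can only be satisfied by doubling up on hyperthreads.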
Old 2017-06-22, 00:33   #7
storm5510
Random Account
 
Aug 2009

19·101 Posts

Quote:
Originally Posted by Mark Rose
Four threads will fully utilize the CPU though. You're making more heat for no benefit.
I need to start looking at this by physical cores and not logical. So I changed the settings:

Code:
CoresPerTest=2
WorkerThreads=2
In the new snip below, the utilization is 100% at the top right. Below the graphs it is roughly half. Four threads are idling, but no cores are. Temps are in the mid-60s. The throughput from Prime95 is virtually unchanged.
Attached Thumbnails: cpu_snip.JPG
Old 2017-06-22, 01:49   #8
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by storm5510
I need to start looking at this by physical cores and not logical.
Good way to think about Prime95.

Quote:
Originally Posted by storm5510
Below the graphs it is roughly half. Four threads are idling, but not cores.
OS task managers generally don't account for the effect of hyperthreads on actual silicon usage.
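
To illustrate with a deliberately simplified model (this is not how Task Manager actually computes its figures, and the function names are made up): if busy threads are placed one per physical core before any core gets a second one, then four busy threads on a 4-core, 8-thread CPU keep every core occupied, while an average over logical CPUs still reads about half.

```python
def logical_utilization(busy_threads, physical_cores=4, threads_per_core=2):
    """Percent of logical CPUs that are busy (a plain average over logical CPUs)."""
    return 100.0 * busy_threads / (physical_cores * threads_per_core)


def physical_cores_busy(busy_threads, physical_cores=4):
    """Cores with at least one busy thread.

    Assumption: threads are spread one per core before any core gets a second.
    """
    return min(busy_threads, physical_cores)


print(logical_utilization(4))   # 50.0
print(physical_cores_busy(4))   # 4
```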
Old 2017-06-22, 01:56   #9
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

3·977 Posts

Quote:
Originally Posted by storm5510
I need to start looking at this by physical cores and not logical. So I changed the settings:

Code:
CoresPerTest=2
WorkerThreads=2
In the new snip below, the utilization is 100% at the top right. Below the graphs it is roughly half. Four threads are idling, but no cores are. Temps are in the mid-60s. The throughput from Prime95 is virtually unchanged.
See? Much better.

For what it's worth, your benchmarks show you'll get the most performance running a single worker. The difference is small but it will add up to an extra LL assignment per year if you leave Prime95 running all the time.

Last fiddled with by Mark Rose on 2017-06-22 at 01:56
Old 2017-06-22, 02:48   #10
storm5510
Random Account
 
Aug 2009

19×101 Posts

Quote:
Originally Posted by Mark Rose
See? Much better.

For what it's worth, your benchmarks show you'll get the most performance running a single worker. The difference is small but it will add up to an extra LL assignment per year if you leave Prime95 running all the time.
With this setup, Prime95 can do a DC in 36 hours. However, this is in the 40M range. I don't know what the server is passing out at this point.

I'm looking at the Exponent Status Distribution page on mersenne.org in another tab. I see a lot of available double-checks from 46M to 76M. P-1s are heavy from 82M through 146M.

I am going to make some changes here!

Update: I have changed my Prime95 configuration to do DCs. The server assigned three tests in the 45M range. Prime95 can complete each in 36 hours. The ms/iter is 2.9, give or take a tiny bit. This is one worker and four cores per test. This seems to be the optimal spot.
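
Those figures are consistent with each other. A Lucas-Lehmer test of 2^p − 1 takes p − 2 squaring iterations, so the run time is roughly the exponent times the time per iteration. A minimal sketch follows; the function name is made up, and 45,000,000 is a stand-in for "the 45M range".

```python
def ll_hours(exponent, ms_per_iter):
    """Estimated hours for a Lucas-Lehmer test at a steady iteration time."""
    iterations = exponent - 2          # an LL test of 2^p - 1 needs p - 2 iterations
    seconds = iterations * ms_per_iter / 1000.0
    return seconds / 3600.0


print(round(ll_hours(45_000_000, 2.9), 2))  # 36.25
```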

Last fiddled with by storm5510 on 2017-06-22 at 03:12
Old 2017-06-23, 06:34   #11
Madpoo
Serpentine Vermin Jar
 
Jul 2014

CDD₁₆ Posts

Quote:
Originally Posted by storm5510
With this setup, Prime95 can do a DC in 36 hours. However, this is in the 40M range. I don't know what the server is passing out at this point.

I'm looking at the Exponent Status Distribution page on mersenne.org in another tab. I see a lot of available double-checks from 46M to 76M. P-1s are heavy from 82M through 146M.

I am going to make some changes here!

Update: I have changed my Prime95 configuration to do DCs. The server assigned three tests in the 45M range. Prime95 can complete each in 36 hours. The ms/iter is 2.9, give or take a tiny bit. This is one worker and four cores per test. This seems to be the optimal spot.
The double checks will be a great way to verify everything is working okay. The low end of double checks is currently working exponents in the 40-41M range.

You can look at the 4 (5 if we count zero) categories of assignments on this page and the rules on how long you have to finish, who qualifies to get what category of work, and a way to opt-in to getting "priority" work (smallest available exponents as long as you promise to finish them promptly).
https://www.mersenne.org/thresholds/

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.