mersenneforum.org  

Old 2014-09-27, 04:20   #1
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2³×3×199 Posts
Oh Brother, What betid to mine Haswell 4770?

Here are the specs from the Computer Properties screen.

Code:
Software Version Windows64,Prime95,v28.5,build 2 
Model Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz 
Features 4 core, hyperthreaded, Prefetch,SSE,SSE2,SSE4,AVX,AVX2,FMA, 
Speed 3.435 GHz (21.359 GHz P4 effective equivalent) 
L1/L2 Cache 32 / 256 KB 
Computer Memory 4008 MB   configured usage 800 MB day / 800 MB night
And here is a summary of the work and timings of the 4 workers over the last week or so. I noted the times when one or more workers' iteration times changed a lot.
Code:
Date/Time          #1 TF and DC   #2 TF        #3 DC         #4 LL
18/09/2014  8:19   TF 490 sec     TF Unknown   36.2M 17 ms   67.8M 33 ms
19/09/2014  0:12   TF 480 sec     TF Unknown   33.2M 16 ms   67.8M 33 ms
19/09/2014 17:54   TF 475 sec     TF Unknown   33.2M 16 ms   67.8M 52 ms
21/09/2014 20:21   35.7M 26 ms    TF 470 sec   33.2M 24 ms   67.8M 80 ms
22/09/2014 17:55   35.7M 24 ms    TF 500 sec   33.2M 23 ms   67.8M 48 ms
23/09/2014 17:56   35.7M 25 ms    TF 490 sec   33.2M 23 ms   67.8M 50 ms
25/09/2014 17:57   35.7M 24 ms    TF 495 sec   33.2M 30 ms   67.8M 48 ms
The questions that come to mind in no specific order of importance:
1. Almost every time all the workers stop before they send new end dates....well almost. Not on the 24th. Why are they stopping?

2. What would cause such a drastic increase in iteration times in Workers #3 and #4 when Worker #1 changed from TF to DC (Sept 21 17:55)? I thought Haswell (as with Ivy Bridge, Sandy Bridge, and all i-series) was much better at channel capacity and worker independence.

3. When worker #4 changed from 33 ms to 52 ms there were NO changes in work on the other 3 workers. I might just chalk that one up to external forces on the PC, though it increased to roughly the iteration time where it now sits consistently.

4. Granted, benchmarks are "perfect" situations... that being said, my times are WAY WAY above the benchmark I ran only a few weeks ago: about 10 ms for the 35M DC and 20 ms for the 68M LL.

5. Could slower RAM make SUCH a big difference? The TF times vary very little, which suggests to me that RAM is NOT the issue...I may be wrong.

6. In a few weeks worker #2 will also be doing LL ..... are they all going to get slower still?

Or simply give me some hints of where to start looking....
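
To put question 4 in numbers, here is a rough sketch of the arithmetic. It assumes the "about 10 ms" and "about 20 ms" benchmark figures quoted above and compares them with the 25/09/2014 row of the table (workers #1 and #4); the dictionary names are made up for illustration.

```python
# Approximate benchmark timings quoted in the post (ms per iteration).
benchmark_ms = {"35M DC": 10.0, "68M LL": 20.0}

# Observed timings from the 25/09/2014 row: worker #1 (35.7M DC) and worker #4 (67.8M LL).
observed_ms = {"35M DC": 24.0, "68M LL": 48.0}

def slowdown(observed, benchmark):
    """Return observed/benchmark ratio for every test type in the benchmark."""
    return {name: observed[name] / benchmark[name] for name in benchmark}

ratios = slowdown(observed_ms, benchmark_ms)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the benchmark time")
```

Both workers come out at roughly the same multiple of their benchmark time, which would be consistent with a shared cause rather than something specific to one exponent size.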

Last fiddled with by petrw1 on 2014-09-27 at 04:22
Old 2014-09-27, 04:37   #2
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

16726₈ Posts

Have you turned off hyperthreading in the BIOS? I suspect two LL tests are getting assigned to the same physical core.
Old 2014-09-27, 04:42   #3
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

1001010101000₂ Posts

So setting 1 core per worker is not enough for Haswell?

Not sure I can change the BIOS. It is a "borg".
Are there any other ways around it?

Thanks
Old 2014-09-27, 04:51   #4
sdbardwick
 
 
Aug 2002
North San Diego County

13×53 Posts

With HT and Windows, I always end up playing around with AffinityScramble2 to make sure threads don't share physical cores.
For example, I had to set my 2600K running 4 workers to
Code:
AffinityScramble2=02461357
Old 2014-09-27, 04:51   #5
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

1110111010110₂ Posts

Quote:
Originally Posted by petrw1
So setting 1 core per worker is not enough for Haswell?

It should be, but Prime95's hyperthread detection does not always work. Can you run Task Manager and see if two workers are running on one core?
Old 2014-09-27, 16:10   #6
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

37×79 Posts

Quote:
Originally Posted by sdbardwick
With HT and Windows, I always end up playing around with AffinityScramble2 to make sure threads don't share physical cores.
For example, I had to set my 2600K running 4 workers to
Code:
AffinityScramble2=02461357

You may wish to use 13570246 instead. The first CPU core in x86 usually handles more interrupts, so having it free to handle those is an advantage.
Old 2014-09-29, 22:48   #7
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2³·3·199 Posts

Quote:
Originally Posted by Mark Rose
You may wish to use 13570246 instead. The first CPU core in x86 usually handles more interrupts, so having it free to handle those is an advantage.

So this didn't make a difference... I suspect I still have 2 tests running on the same physical core. I still need to verify this.

Could it be that their cores (physical and HT) are numbered differently?
For example (completely made up guess) ... maybe Physical Core 0's HT partner is 7 ( 1 is 6, etc) ...
How could I find out?

Or could it even be that Haswell in some way randomizes how it numbers them based on the workload?
Old 2014-09-29, 23:23   #8
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

2·3·19·67 Posts

Quote:
Originally Posted by petrw1
Could it be that their cores (physical and HT) are numbered differently?
For example (completely made up guess) ... maybe Physical Core 0's HT partner is 7 ( 1 is 6, etc) ...
How could I find out?

Set DebugAffinityScramble=1 in prime.txt. At startup, Prime95 will output its calculations trying to determine logical/physical CPUs.

Prime95 does this by running some code it thinks should take 100K clock cycles. It then puts a logical CPU in a busy loop and times this 100K code on the other 7 logical CPUs. The theory is that 6 logical CPUs will time at 100K and one will time at 200K. Then the busy loop logical CPU and the 200K logical CPU are on one physical core.
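
The pairing logic described above can be sketched roughly as follows. This is only an illustration of the idea, not Prime95's actual code: the function names, the 1.5x threshold, and the sample clock counts (a made-up 4-logical-CPU machine) are all assumptions.

```python
def find_sibling(busy_cpu, baseline, clocks, factor=1.5):
    """Return the one logical CPU whose timing is clearly inflated while
    busy_cpu spins, or None if zero or several CPUs look slow.
    The 1.5x threshold is an assumed cutoff between ~100K and ~200K clocks."""
    slow = [cpu for cpu, c in clocks.items()
            if cpu != busy_cpu and c > factor * baseline]
    return slow[0] if len(slow) == 1 else None

def pair_up(measurements):
    """measurements maps busy_cpu -> (baseline_clocks, {other_cpu: clocks}).
    Returns a list of (busy_cpu, sibling) pairs sharing a physical core."""
    pairs, paired = [], set()
    for busy in sorted(measurements):
        if busy in paired:
            continue
        baseline, clocks = measurements[busy]
        sibling = find_sibling(busy, baseline, clocks)
        if sibling is not None and sibling not in paired:
            pairs.append((busy, sibling))
            paired.update((busy, sibling))
    return pairs

# Hypothetical timings: CPU 1 slows down while CPU 0 spins, CPU 3 while CPU 2 spins.
measurements = {
    0: (100_000, {1: 200_000, 2: 100_000, 3: 100_000}),
    2: (100_000, {1: 100_000, 3: 200_000}),
}
detected = pair_up(measurements)
print(detected)
```

If background load inflates more than one timing, a scheme like this has no single obvious sibling to pick, which may be one reason the detection does not always work.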
Old 2014-09-30, 04:36   #9
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

12A8₁₆ Posts

Quote:
Originally Posted by Prime95
Set DebugAffinityScramble=1 in prime.txt. At startup, prime95 will output its calculations trying to determine logical/physical CPUs.

Prime95 does this by running some code it thinks should take 100K clock cycles. It then puts a logical CPU in a busy loop and times this 100K code on the other 7 logical CPUs. The theory is that 6 logical CPUs will time at 100K and one will time at 200K. Then the busy loop logical CPU and the 200K logical CPU are on one physical core.

OK, will do ... I need to get someone else to do this .... someone not as geeky.

Will it simply output this to results.txt or do I need to look at the actual window that runs it?

AND....

Just so I get it right: once I know the pairs, which is the proper way to record AffinityScramble2=?
A). In Physical/Logical pairs
B). All the Physical then all the Logical

E.g., if Physical 0 is paired with Logical 4, 1 with 5, 2 with 6, and 3 with 7, do I code
AffinityScramble2=04152637 (This is my guess)
OR
AffinityScramble2=01234567
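
To make the two layouts concrete, here is a small sketch that builds both strings from a list of sibling pairs. It does not answer which layout Prime95 expects; the helper names are made up, and single-digit CPU numbers (fewer than 10 logical CPUs) are assumed.

```python
def pair_order(pairs):
    """Option A: write each physical core's two logical CPUs side by side."""
    return "".join(f"{a}{b}" for a, b in pairs)

def grouped_order(pairs):
    """Option B: first member of every pair, then second member of every pair."""
    firsts = "".join(str(a) for a, _ in pairs)
    seconds = "".join(str(b) for _, b in pairs)
    return firsts + seconds

# The made-up pairing from the question: 0 with 4, 1 with 5, 2 with 6, 3 with 7.
pairs = [(0, 4), (1, 5), (2, 6), (3, 7)]
option_a = pair_order(pairs)
option_b = grouped_order(pairs)
print("AffinityScramble2=" + option_a)  # AffinityScramble2=04152637
print("AffinityScramble2=" + option_b)  # AffinityScramble2=01234567
```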

Last fiddled with by petrw1 on 2014-09-30 at 04:52
Old 2014-09-30, 17:29   #10
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2³·3·199 Posts

Code:
[Main thread Sep 30 11:23] Test clocks on logical CPU #1: 214592
[Main thread Sep 30 11:23] Logical CPU 2 clocks: 407000
[Main thread Sep 30 11:23] Logical CPU 3 clocks: 214576
[Main thread Sep 30 11:23] Logical CPU 4 clocks: 214720
[Main thread Sep 30 11:23] Logical CPU 5 clocks: 214576
[Main thread Sep 30 11:23] Logical CPU 6 clocks: 214712
[Main thread Sep 30 11:23] Logical CPU 7 clocks: 214608
[Main thread Sep 30 11:23] Logical CPU 8 clocks: 214856
[Main thread Sep 30 11:23] Test clocks on logical CPU #3: 214576
[Main thread Sep 30 11:23] Logical CPU 4 clocks: 201806
[Main thread Sep 30 11:23] Logical CPU 5 clocks: 113962
[Main thread Sep 30 11:23] Logical CPU 6 clocks: 114196
[Main thread Sep 30 11:23] Logical CPU 7 clocks: 113964
[Main thread Sep 30 11:23] Logical CPU 8 clocks: 114040
[Main thread Sep 30 11:23] Test clocks on logical CPU #5: 114028
[Main thread Sep 30 11:23] Logical CPU 6 clocks: 177253
[Main thread Sep 30 11:23] Logical CPU 7 clocks: 93538
[Main thread Sep 30 11:23] Logical CPU 8 clocks: 93586
[Main thread Sep 30 11:23] Test clocks on logical CPU #7: 93583
[Main thread Sep 30 11:23] Logical CPU 8 clocks: 177235
[Main thread Sep 30 11:23] Logical CPUs 1,2 form one physical CPU.
[Main thread Sep 30 11:23] Logical CPUs 3,4 form one physical CPU.
[Main thread Sep 30 11:23] Logical CPUs 5,6 form one physical CPU.
[Main thread Sep 30 11:23] Logical CPUs 7,8 form one physical CPU.
[Main thread Sep 30 11:23] Starting workers.
So this tells me I want
AffinityScramble2=02461357 (or 13570246)


Correct???
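
For what it's worth, here is a rough sketch of how the two candidate strings follow from the "form one physical CPU" lines in the log. It assumes the log numbers logical CPUs from 1 while the AffinityScramble2 digits used in this thread start at 0; that offset is my reading, not something the log states. The regular expression and helper name are made up.

```python
import re

LOG = """\
[Main thread Sep 30 11:23] Logical CPUs 1,2 form one physical CPU.
[Main thread Sep 30 11:23] Logical CPUs 3,4 form one physical CPU.
[Main thread Sep 30 11:23] Logical CPUs 5,6 form one physical CPU.
[Main thread Sep 30 11:23] Logical CPUs 7,8 form one physical CPU.
"""

PAIR_RE = re.compile(r"Logical CPUs (\d+),(\d+) form one physical CPU")

def parse_pairs(log_text):
    """Extract sibling pairs and shift them from 1-based to 0-based numbering
    (the 0-based convention is an assumption)."""
    return [(int(a) - 1, int(b) - 1) for a, b in PAIR_RE.findall(log_text)]

pairs = parse_pairs(LOG)
firsts = "".join(str(a) for a, _ in pairs)    # one logical CPU from each core
seconds = "".join(str(b) for _, b in pairs)   # the other logical CPU from each core

candidate_1 = firsts + seconds
candidate_2 = seconds + firsts
print(candidate_1)  # 02461357
print(candidate_2)  # 13570246
```

Either string keeps the first four entries on four different physical cores, which is the property the earlier replies were after.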

Turns out the program wasn't completely stopped/started yesterday so I don't believe the above changes actually took effect...stay tuned...
Old 2014-10-01, 01:06   #11
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2³×3×199 Posts
So here is where I am at....

My worker doing the 67M LL is still getting iteration times of 50 ms. It was at 33 ms when only 1 other worker was doing DC and the rest were doing TF.
So I suspect I still don't have it right....

1. DebugAffinityScramble determined the following CPU Pairings.
Code:
Logical CPUs 1,2 form one physical CPU.
Logical CPUs 3,4 form one physical CPU.
Logical CPUs 5,6 form one physical CPU.
Logical CPUs 7,8 form one physical CPU.
Do I correctly assume that once it runs it will use that knowledge to assign the correct CPUs to each worker so that each gets a separate physical core?
Or is it strictly informational and I use that knowledge as I see fit to set AffinityScramble2?
What if I also have AffinityScramble2 set? Which setting takes precedence?

2. I tried to set AffinityScramble2 but I think I screwed up.
But is that discussion even relevant if the DebugAffinityScramble forced the correct worker/CPU settings?

3. Turns out the AffinityScramble2 I had the person enter likely did NOT take effect because Prime95 was not exited and restarted to pick up the new settings. It was only a stop all/start all workers.
Am I correct here that it did not take effect?

4. Furthermore I suspect it was placed in the wrong place in local.txt. I incorrectly said it could go "anywhere" in that file. It was placed at the very end within the [Worker #4] section.
Can I assume it would have been ignored even if Prime95 was completely exited/restarted?

5. I had it set as 13570246. Is this a correct setting based on what DebugAffinityScramble determined?
Or is it not? By putting the HT cores all first will that cause Prime95 to assign the work to the HT cores instead of the Physical cores?

BOTTOM LINE:
Should I simply use the output from DebugAffinityScramble to set AffinityScramble2 correctly?
What is correct? 02461357? 13570246? something else?

OR should I leave in DebugAffinityScramble and remove AffinityScramble2?
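
On question 4, one way to check where the line actually ended up is to scan local.txt and report which section header it sits under. This sketch only reports the location; whether Prime95 honors or ignores the key inside a [Worker #n] section is still the open question. The sample file contents, including SomeOtherSetting, are hypothetical.

```python
def section_of(ini_text, key):
    """Return the section header the key appears under, '' if it appears
    before any [section] header, or None if the key is not present."""
    section = ""
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
        elif "=" in line and line.split("=", 1)[0].strip() == key:
            return section
    return None

# Hypothetical local.txt with the key appended at the very end of the file.
as_entered = """\
SomeOtherSetting=1
[Worker #4]
AffinityScramble2=13570246
"""

# The same hypothetical file with the key moved above every section header.
moved_up = """\
SomeOtherSetting=1
AffinityScramble2=13570246
[Worker #4]
"""

where_entered = section_of(as_entered, "AffinityScramble2")
where_moved = section_of(moved_up, "AffinityScramble2")
print(repr(where_entered))  # 'Worker #4'
print(repr(where_moved))    # ''
```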

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.