2022-03-04, 15:15   #22
kriesel
Xeon Phi, added DIMMs, and Windows combo incompatible with performance


On a Xeon Phi 7250 system running Windows 10 Pro 21H2 19044.1526,
ordinary launch of prime95 (double-click the shortcut):
w1 is running a PRP test of M577215631, w2 a PRP test of M332277943 (32256K and 18M FFT lengths respectively)
with 16 GiB MCDRAM only, w1 ~45 ms/iter, w2 ~14 ms/iter (but it will sometimes flip to a much slower mode with lots of system interrupts; clearing that typically requires an OS shutdown / restart, although occasionally it clears up on its own)
with 192 GiB of DIMMs, w1 120 ms/iter, w2 47 ms/iter. This is of order 1/3 to 1/4 the performance of MCDRAM-only. IIRC Ernst reported similar results for Mlucas stage 1 on F33 in Linux when running on 6 DIMMs. But it gets worse from here, with fewer DIMMs. (And it may on Linux also.)
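
As a quick check on the "1/3 to 1/4" estimate, the ratio can be worked out from the timings quoted above. This is only a sketch: the variable and function names are made up for illustration, and it assumes throughput is simply inversely proportional to ms/iter.

```python
# Timings in ms/iter copied from the post; all names here are illustrative only.
mcdram_only = {"w1": 45.0, "w2": 14.0}   # 16 GiB MCDRAM, no DIMMs installed
dimms_192 = {"w1": 120.0, "w2": 47.0}    # 192 GiB of DIMMs installed


def relative_performance(baseline_ms, test_ms):
    """Throughput of the test case as a fraction of the baseline.

    Assumes throughput is 1 / (ms per iteration), so the fraction
    is baseline_ms / test_ms.
    """
    return baseline_ms / test_ms


for worker in ("w1", "w2"):
    frac = relative_performance(mcdram_only[worker], dimms_192[worker])
    print(f"{worker}: {frac:.2f} of MCDRAM-only performance")
```

With these numbers both workers land somewhere between 0.3 and 0.4 of the MCDRAM-only rate.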

at cmd line, start /node 0 prime95 (in working directory)
with 192 GiB of DIMMs in slots A-F, w1 120 ms/iter, w2 48 ms/iter
(same as ordinary launch)

at cmd line, start /node 1 prime95
with 192 GiB of DIMMs in slots A-F
results in the following error message instead of a launch:
"The system cannot execute the specified program."

Apparently Windows does not allow controlling which RAM the application runs on, MCDRAM or DIMM, except by physically removing the added DIMMs to force use of MCDRAM. Task Manager is confused by the presence of the DIMMs, indicating too many cores and wrong amounts of multiple levels of cache. Prime95 does not appear to share that confusion, but it turns out to still be badly impacted.
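
For anyone wanting to repeat the start /node attempts from a script rather than by hand, something like the following might do. It is a sketch only: the helper names are hypothetical, it assumes prime95 is found in the working directory passed in, and it does nothing beyond wrapping the same start /node command used above.

```python
import subprocess
import sys


def build_start_command(node, exe="prime95"):
    """Build the argument list for `start /node <node> <exe>`.

    `start` is a cmd.exe builtin, so it has to go through `cmd /c`.
    The empty string is the optional window title; passing it explicitly
    keeps a quoted executable path from being mistaken for the title.
    """
    return ["cmd", "/c", "start", "", "/node", str(node), exe]


def launch_on_node(node, workdir, exe="prime95"):
    """Run the start command from workdir (Windows only)."""
    if sys.platform != "win32":
        raise OSError("start /node is only available on Windows")
    return subprocess.run(build_start_command(node, exe), cwd=workdir)


if __name__ == "__main__":
    # Only print the command lines here; call launch_on_node() to run them.
    for node in (0, 1):
        print(subprocess.list2cmdline(build_start_command(node)))
```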

return to the ordinary prime95 launch method:
with 96 GiB of DIMMs in slots A-C, w1 906 ms/iter, w2 353 ms/iter, with Task Manager indicating only 1.8% CPU usage instead of the usual ~30%; a retry gave w1 907 ms/iter, w2 370 ms/iter, 22% total CPU usage with a lot of system overhead; a 3rd try gave ~30% CPU usage, mostly system overhead, w1 907 ms/iter, w2 370 ms/iter. The system did not respond to remote access attempts, so I used awkward access & the local low-res console
with 64 GiB of DIMMs in slots A-B, w1 1300 ms/iter, w2 615 ms/iter
with 64 GiB of DIMMs in slots A, D, w1 346 ms/iter, w2 136 ms/iter
with 32 GiB of DIMM in slot A, w1 2844 ms/iter, w2 1135 ms/iter
reverted to 16 GiB MCDRAM only, w1 46.5 ms/iter, w2 12.2 ms/iter
Note: late in the test sequence Task Manager was found paused; I don't know how many of the above that affected. It seems to slip into that state on its own sometimes.
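
To put the ms/iter figures in perspective, they can be converted to projected run times. The sketch below assumes a PRP test of M(p) takes about p iterations and that the rate stays constant for the whole run, which is a simplification. The names are hypothetical; the timings are copied from the list above (first try for the 96 GiB case, ordinary launch for 192 GiB).

```python
# Exponents of the two PRP tests from the post.
EXPONENTS = {"w1": 577_215_631, "w2": 332_277_943}

# ms/iter per memory configuration, copied from the post.
TIMINGS_MS = {
    "MCDRAM only (final)": {"w1": 46.5, "w2": 12.2},
    "192 GiB of DIMMs": {"w1": 120.0, "w2": 47.0},
    "96 GiB, slots A-C": {"w1": 906.0, "w2": 353.0},
    "64 GiB, slots A-B": {"w1": 1300.0, "w2": 615.0},
    "64 GiB, slots A, D": {"w1": 346.0, "w2": 136.0},
    "32 GiB, slot A": {"w1": 2844.0, "w2": 1135.0},
}

SECONDS_PER_DAY = 86_400


def projected_days(exponent, ms_per_iter):
    """Days for a full PRP test, assuming ~exponent iterations at a constant rate."""
    return exponent * ms_per_iter / 1000.0 / SECONDS_PER_DAY


baseline = TIMINGS_MS["MCDRAM only (final)"]
for config, timing in TIMINGS_MS.items():
    for worker, ms in timing.items():
        days = projected_days(EXPONENTS[worker], ms)
        slowdown = ms / baseline[worker]
        print(f"{config:20s} {worker}: {days:8.0f} days, {slowdown:6.1f}x MCDRAM-only")
```

On those assumptions w1 goes from roughly 310 days on MCDRAM only to roughly 52 years with the single 32 GiB DIMM.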

Note, limited experimentation with a single 64 GiB DIMM on the 7210 showed similar issues.
