#1
Jul 2003
So Cal
2,083 Posts
I've been playing with msieve linear algebra on Knights Landing CPUs. Specifically, each compute node has one Intel(R) Xeon Phi(TM) CPU 7250 @ 1.40GHz. This processor has 68 cores arranged in 34 tiles (2 cores per tile), and each core runs 4 hardware threads, for a total of 272 threads per node.
I compiled msieve with MPI support using icc with the -xMIC-AVX512 option. This worked just fine. I also tried disabling the ASM instructions and using just the C code to see if the compiler would vectorize using AVX-512, but the resulting binary was slightly slower. Trying out different parameters, I get by far the best performance with one MPI process per tile and 8 threads per process. So with one compute node, the best layout is a 2x17 MPI grid with 8 threads per process. Here is a table of estimated runtimes on a 42.1M matrix:
Code:
cores   nodes   time (hrs)
   68       1          444
  136       2          233
  272       4          131
  544       8           83
 1088      16           46
 2176      32           33
Would explicit use of AVX-512 speed up the matmul?
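For anyone who wants to see how the scaling falls off, here is a small sketch that turns the table above into speedup and parallel efficiency relative to the single-node run. The numbers are copied from the table; the function name `scaling` is just something made up for this illustration.

```python
# Estimated runtimes from the table above: (cores, nodes, hours).
runs = [
    (68, 1, 444),
    (136, 2, 233),
    (272, 4, 131),
    (544, 8, 83),
    (1088, 16, 46),
    (2176, 32, 33),
]

def scaling(runs):
    """Return (nodes, speedup, efficiency) relative to the first entry."""
    _, base_nodes, base_hours = runs[0]
    rows = []
    for _, nodes, hours in runs:
        speedup = base_hours / hours
        # Efficiency is the speedup divided by the increase in node count.
        efficiency = speedup / (nodes / base_nodes)
        rows.append((nodes, speedup, efficiency))
    return rows

for nodes, speedup, eff in scaling(runs):
    print(f"{nodes:>3} nodes  speedup {speedup:5.2f}  efficiency {eff:6.1%}")
```

Efficiency here is measured against the one-node estimate, so it says nothing about how good the one-node run itself is.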
#2
Tribal Bullet
Oct 2004
3³·131 Posts
Probably; the scatter-gather instructions could be useful. Using 512-bit vectors explicitly in block Lanczos may or may not be faster, because the vector-vector operations would need hugely more memory for precomputations.
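To put a rough number on the precomputation cost, here is a sketch of one common way to multiply a w-bit row by a w x w bit matrix over GF(2): split the row into bytes and look each byte up in a 256-entry table of precomputed XOR combinations. This is an assumption about the general technique, not a copy of msieve's code, and all function names are made up for the illustration. Under this scheme the tables take (w/8) x 256 x (w/8) bytes, so widening from 64 to 512 bits grows them by a factor of 64.

```python
import random

def build_tables(mat, w):
    """mat is a list of w row-ints of w bits each.
    tables[j][b] is the XOR of rows 8*j+k for every bit k set in byte b."""
    tables = []
    for j in range(w // 8):
        table = [0] * 256
        for b in range(1, 256):
            low = b & -b                      # lowest set bit of b
            k = low.bit_length() - 1
            table[b] = table[b ^ low] ^ mat[8 * j + k]
        tables.append(table)
    return tables

def mul_row(x, tables):
    """Row vector x times the matrix, consuming one byte of x per lookup."""
    acc = 0
    for j, table in enumerate(tables):
        acc ^= table[(x >> (8 * j)) & 0xFF]
    return acc

def mul_row_naive(x, mat):
    """Reference version: XOR together the rows selected by the bits of x."""
    acc = 0
    for i, row in enumerate(mat):
        if (x >> i) & 1:
            acc ^= row
    return acc

def table_bytes(w):
    """Bytes of lookup tables for one w x w matrix under this scheme."""
    return (w // 8) * 256 * (w // 8)

w = 64
mat = [random.getrandbits(w) for _ in range(w)]
x = random.getrandbits(w)
assert mul_row(x, build_tables(mat, w)) == mul_row_naive(x, mat)

print(table_bytes(64), table_bytes(512))
```

With these assumptions the 64-bit tables come to 16 KiB and the 512-bit tables to 1 MiB per matrix, which is the kind of growth the post is warning about.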
#3
Jul 2003
So Cal
2083₁₀ Posts
Turns out KNL doesn't like a nearly symmetric grid. In the table above, I had run 544 cores as a 16x17 grid, but an 8x34 grid runs nearly 10% faster (76 hours instead of 83). Therefore I have also removed the 2176 core run, which used a 32x34 grid.
Code:
cores   nodes   time (hrs)
   68       1          444
  136       2          233
  272       4          131
  544       8           76
 1088      16           46
 2176      32           ??
BTW, the last half of the 2,1285- linear algebra was run using the KNL nodes, so it works correctly.
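Since the grid shape matters this much, it can help to list every shape available for a given node count before benchmarking. The sketch below assumes one MPI process per tile (34 per node), as in the first post; `grid_shapes` and `TILES_PER_NODE` are names invented for this example.

```python
def grid_shapes(nprocs):
    """All (rows, cols) with rows <= cols and rows * cols == nprocs."""
    shapes = []
    r = 1
    while r * r <= nprocs:
        if nprocs % r == 0:
            shapes.append((r, nprocs // r))
        r += 1
    return shapes

TILES_PER_NODE = 34  # one MPI process per tile, as described in the first post

for nodes in (1, 8):
    nprocs = nodes * TILES_PER_NODE
    print(nodes, "nodes,", nprocs, "processes:", grid_shapes(nprocs))
```

For 8 nodes this lists both the 16x17 and the 8x34 layouts compared above, along with the more lopsided ones that were not reported here.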
#4
Tribal Bullet
Oct 2004
3³·131 Posts
I saw, that was awesome. The maximum grid size is just a definition in the code, but it also controls the size of a binary file, so once you change the definition your build will be binary-incompatible with previous savefiles.
(Just change MAX_MPI_GRID_DIM in common.h)
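To illustrate why a compile-time limit can break old savefiles, here is a toy sketch. The header layout is entirely hypothetical (it is not msieve's real file format), and the limits 40 and 64 are example values only; the point is just that a fixed-size record sized by the limit changes length when the limit changes.

```python
import struct

def header_format(max_grid_dim):
    # Hypothetical layout (NOT msieve's real format): two uint32 grid
    # dimensions, then one uint32 slot per possible grid row and column.
    return "<II" + "I" * (2 * max_grid_dim)

def write_header(rows, cols, max_grid_dim):
    slots = [0] * (2 * max_grid_dim)
    return struct.pack(header_format(max_grid_dim), rows, cols, *slots)

def read_header(blob, max_grid_dim):
    fields = struct.unpack(header_format(max_grid_dim), blob)
    return fields[0], fields[1]

old = write_header(8, 34, max_grid_dim=40)   # written by the "old" build
print(len(old), read_header(old, max_grid_dim=40))

try:
    read_header(old, max_grid_dim=64)        # "new" build with a larger limit
except struct.error as exc:
    print("incompatible savefile:", exc)
```

In this toy version the mismatch is caught as a size error; a real reader that does not check sizes could instead silently misread the fields that follow.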