Running Prime95 on a 16-core CPU: one 16-threaded worker or 16 single-threaded workers?


Since yesterday I've been trying my hand at GIMPS, using the spare cycles of my new 16-core workstation. I'm running the official build of Prime95 v29.8b7 on Linux.

When configuring Prime95 for the first time, I was presented with a choice of how many work threads I wanted to create and how many cores to allocate to each work thread (I suppose that a "work thread" is a misnomer?). Originally I chose to run 16 single-threaded workers, which subsequently received 16 different double-check assignments and started computing away at 42-45 ms/iter.

Later, I decided to experiment a bit and reconfigured Prime95 to run a single work thread using all 16 available cores. This yielded a computation speed of 1.5 ms/iter, i.e. a more than 16x speedup per assignment (meaning that the 16 original assignments, run one after another, would finish sooner in total than if I ran them all in parallel).
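To spell out the arithmetic behind that claim, here is a rough sketch in Python. It uses only the timings quoted above and assumes, for simplicity, that all assignments need about the same number of iterations and the same time per iteration, which may not hold exactly for exponents of different sizes. The helper name `aggregate_throughput` is mine, not anything from Prime95.

```python
def aggregate_throughput(workers: int, ms_per_iter: float) -> float:
    """Total iterations per second across all workers running in parallel."""
    return workers * 1000.0 / ms_per_iter


# 16 single-threaded workers, each at 42-45 ms/iter (observed range).
many_fast = aggregate_throughput(16, 42.0)
many_slow = aggregate_throughput(16, 45.0)

# 1 worker using all 16 cores at 1.5 ms/iter.
single = aggregate_throughput(1, 1.5)

# Per-assignment speedup of the single 16-core worker.
speedup_low = 42.0 / 1.5
speedup_high = 45.0 / 1.5

# Ratio of total throughput: values above 1 mean the single worker wins overall.
ratio_low = single / many_fast
ratio_high = single / many_slow

print(f"16 workers: {many_slow:.0f}-{many_fast:.0f} iter/s in total")
print(f"1 worker:   {single:.0f} iter/s")
print(f"per-assignment speedup: {speedup_low:.0f}x-{speedup_high:.0f}x")
print(f"total throughput ratio: {ratio_low:.2f}x-{ratio_high:.2f}x")
```

Under those assumptions the single worker is roughly 28x to 30x faster per assignment, and about 1.75x to 1.9x faster in total throughput, which is why the result surprised me.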

Hence three questions:
  • am I right to conclude that running a single multi-threaded worker is in my case better than 16 single-threaded workers?
  • is this expected behavior?
  • are there any other threading recommendations for similar multi-core machines?

