2021-09-23, 19:59   #445
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2^2×7×211 Posts

Assuming the ~p^2.1 scaling also applies to GCD operations, and you're doing ~106M P-1, there's a factor of ~4.2 unexplained difference in GCD speed in your favor. Maybe faster cores are giving faster GCDs, and correspondingly faster stages too.
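
For anyone wanting to redo that estimate, here is a rough sketch of the scaling arithmetic. It assumes run time goes as p^2.1, as stated above; the function name `scaled_time` is just for illustration and is not from any GIMPS program. The two exponents are the ~106M one above and the ~27.4M one mentioned in the edit to this post.

```python
def scaled_time(t_ref, p_ref, p, power=2.1):
    """Estimate time at exponent p from a reference time t_ref at p_ref,
    ASSUMING time scales as p**power (2.1 is the rough figure from the post)."""
    return t_ref * (p / p_ref) ** power

# Ratio of expected GCD time at ~106M versus ~27.4M under that assumption.
ratio = scaled_time(1.0, 27.4e6, 106e6)
print(f"expected time ratio 106M / 27.4M: {ratio:.1f}x")
```

Dividing an observed timing ratio by this expected ratio gives whatever "unexplained" factor is left over for hardware differences.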

The timing I gave for the large exponent was using ~10 GB in stage 2, prime95 V30.6b4.


edit: chalsall's small exponent, ~27.4M, more than explains the rest of the speed ratio. 5.05 sec x 2 / 2 hr 29 min = 0.11% potential speedup for him. Except, the i3-9100 is 4-core with no hyperthreading. Gpuowl's parallelism came about because Mihai took pity on my multi-Radeon VII / slow-CPU-for-GCD P-1 factory, which spent ~5 minutes of a 40-minute wavefront P-1 factoring in single-CPU-core GCD with the GPU idle and waiting. The system didn't have enough max RAM to support dual-instance P-1 on its GPUs to mitigate it. 40/35 = ~14% P-1 speedup via speculative parallelism. As always, it's George's call what is worth George's time and what is not.
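
The two speedup figures above can be rechecked with a few lines. This is only the arithmetic from the post, assuming the GCD time is hidden completely by running it in parallel; the variable names are mine.

```python
# CPU case: two GCDs of 5.05 s each in a 2 hr 29 min P-1 run.
gcd_seconds = 2 * 5.05
run_seconds = 2 * 3600 + 29 * 60
cpu_saving = gcd_seconds / run_seconds      # fraction of run time spent in GCD

# GPU case: ~5 of 40 minutes spent in GCD with the GPU idle,
# so fully overlapping the GCD leaves ~35 minutes per factoring.
gpu_speedup = 40 / 35 - 1                   # throughput gain

print(f"CPU case: {cpu_saving:.2%} potential saving")
print(f"GPU case: {gpu_speedup:.1%} speedup")
```

Note the CPU figure is a fraction of run time saved while the GPU figure is a throughput gain; at a fraction of a percent the distinction doesn't matter.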

Last fiddled with by kriesel on 2021-09-23 at 20:30