When GMP-ECM is faster for Stage 2 with a higher B2, there is only advantage from using it. Save time, get more work: what's not to like?

If it were a curve-for-curve comparison, sure, but it's not that straightforward. avx-ecm stage 2 may be slower and use a smaller B2, but it is doing 8 curves at once in that time.
Expanding on the example from before, with B1=7M on 2^1277-1:
avx-ecm stage 1: 78 sec for 8 curves, 9.75 sec/curve
avx-ecm stage 2: 55 sec for 8 curves, 6.87 sec/curve
gmp-ecm stage 1: 60 sec for 1 curve
gmp-ecm stage 2: 32 sec for 1 curve
Now, each avx-ecm stage 2 curve is 2.8 times less likely to find a factor, but if you had 64 seconds to spend doing something, would you rather run 2 gmp-ecm stage 2 curves or 8 avx-ecm stage 2 curves? Even with each avx-ecm curve being 2.8 times less likely to find a factor, you'd be better off with avx-ecm: 8 curves at 1/2.8 of the odds are worth about 2.9 gmp-ecm curves, against 2.
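To make the arithmetic concrete, here is a rough sketch in Python using the stage 2 timings above. It treats the 2.8x figure as a flat per-curve discount, which is a simplification, and the function name and the "gmp-ecm-equivalent curves" measure are made up for illustration only.

```python
def equivalent_curves_per_second(batch_seconds, curves_per_batch, prob_divisor=1.0):
    """Stage 2 throughput in gmp-ecm-equivalent curves per second.

    prob_divisor is how many times less likely one curve is to find a
    factor than a gmp-ecm stage 2 curve (assumed constant here).
    """
    return (curves_per_batch / prob_divisor) / batch_seconds

# Timings quoted above for B1=7M.
avx_rate = equivalent_curves_per_second(55.0, 8, prob_divisor=2.8)
gmp_rate = equivalent_curves_per_second(32.0, 1)

# One avx-ecm batch of 8 curves, expressed in gmp-ecm-equivalent curves.
avx_batch_equivalent = 8 / 2.8

print(f"avx-ecm: {avx_rate:.4f} equivalent curves/sec")
print(f"gmp-ecm: {gmp_rate:.4f} equivalent curves/sec")
print(f"ratio:   {avx_rate / gmp_rate:.2f}")
```

Under these assumptions one 55 second avx-ecm batch is worth a bit under 3 gmp-ecm stage 2 curves, while gmp-ecm itself finishes 2 curves in 64 seconds.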
The difficult tradeoff would be on larger numbers, when the avx-ecm stage 2 is likely faster than GMP-ECM but uses a smaller B2. That leads to some need for study.

The same math as above applies. I think you are better off running pure avx-ecm until its stage 2 time per curve, multiplied by the probability multiplier, is greater than gmp-ecm's stage 2 runtime.
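As a sketch of that rule of thumb in Python (the function names are hypothetical, and the probability multiplier is assumed to be known for the B2 values in use, which in practice it has to be estimated):

```python
def avx_stage2_wins(avx_batch_seconds, gmp_curve_seconds, prob_multiplier, lanes=8):
    """True while avx-ecm stage 2 is the better use of time.

    Compares avx-ecm's per-curve stage 2 time, scaled by how many times
    less likely each curve is to find a factor, against the time of one
    gmp-ecm stage 2 curve.
    """
    effective_cost = (avx_batch_seconds / lanes) * prob_multiplier
    return effective_cost < gmp_curve_seconds

def break_even_batch_seconds(gmp_curve_seconds, prob_multiplier, lanes=8):
    """avx-ecm stage 2 batch time at which the two approaches are equal."""
    return gmp_curve_seconds * lanes / prob_multiplier

# Example with the B1=7M timings: 55 sec per batch of 8, 32 sec per gmp-ecm curve.
use_avx = avx_stage2_wins(55.0, 32.0, 2.8)
threshold = break_even_batch_seconds(32.0, 2.8)
print(use_avx, round(threshold, 1))
```

With those example numbers the effective cost is 6.875 * 2.8 = 19.25 sec against 32 sec, so avx-ecm stays ahead; it would only lose once its stage 2 batch took longer than roughly 91 seconds at the same multiplier.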
It gets interesting with generic inputs, where the throughput advantage of avx-ecm is not as great. There, I think it makes sense to run avx-ecm stage 1 followed by 8 gmp-ecm stage 2's. That way you get all of the probability advantage of gmp-ecm stage 2 but with avx-ecm's higher stage 1 throughput. I am putting together some data on this case.