OK, so it restarted successfully, about 15% of the way through the 2nd pass of 16 (the passes are numbered 0-15, so pass 1 means the 2nd pass).

Do you have an estimate of how long Prime95 needs to do all 16 passes to the same bit depth on the same hardware? On hardware like yours the code uses the floating-point-based versions of the factoring modules (as indicated by the 'USE_FLOAT = 1' part of the informational diagnostics), and I expect some SSE2 optimizations of those should yield something on the order of a 2x speedup. But if the discrepancy vs. Prime95 is significantly greater than 2x on this hardware, then clearly something more would be required to approach parity.
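
For anyone wanting to do the comparison themselves, here is a rough sketch of the arithmetic in Python. It assumes all 16 passes take about the same time, which may not hold exactly. The function names and the sample numbers are made up for illustration and are not taken from the actual run.

```python
def estimate_total_hours(elapsed_hours, passes_done, frac_of_current, n_passes=16):
    """Extrapolate the total runtime from progress so far.

    Assumption: every pass takes roughly the same time.
    passes_done counts fully completed passes, so being partway
    through pass 1 (numbering from 0) means passes_done = 1.
    """
    if not 0 <= passes_done < n_passes:
        raise ValueError("passes_done must be in [0, n_passes)")
    if not 0.0 <= frac_of_current <= 1.0:
        raise ValueError("frac_of_current must be in [0, 1]")
    completed = (passes_done + frac_of_current) / n_passes
    if completed == 0:
        raise ValueError("no progress yet, nothing to extrapolate from")
    return elapsed_hours / completed


def gap_after_speedup(this_code_hours, prime95_hours, expected_speedup=2.0):
    """Ratio of runtimes left over after a hoped-for speedup.

    A result near 1.0 means rough parity with Prime95; anything
    well above 1.0 means the speedup alone would not be enough.
    """
    return (this_code_hours / expected_speedup) / prime95_hours


if __name__ == "__main__":
    # Placeholder timings, NOT measurements from this thread.
    total = estimate_total_hours(elapsed_hours=23.0, passes_done=1, frac_of_current=0.15)
    print("estimated hours for all 16 passes:", round(total, 1))
    print("gap vs. Prime95 after a 2x speedup:", round(gap_after_speedup(total, 100.0), 2))
```

With real timings plugged in, a gap that stays well above 1 would mean the SSE2 work alone does not reach parity.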

Also, it would be great if someone could run at least pass 1 of the same test on an AMD64 machine, so we can gauge the speed difference between the all-integer code used on that platform (which has excellent integer multiply capability) and the generic floating-point code running on 32-bit x86.
