. I've noticed that I pretty frequently see messages like “error greater than 0.4xxx, trying again with the previous save file” followed by “it seems like the problem went away, moving on.” Could that have something to do with the FFT size, given that my current assignments are around 83-84 million?

No, that probably has nothing to do with the FFT size. I had the same error message, but got rid of it completely by lowering the core clock by 150 MHz and the memory clock by 500 MHz.
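To illustrate why an error that goes away on retry points at the hardware rather than the FFT size, here is a rough Python sketch. It is not the actual code of any testing program: the function names, the sample values, and the way the two runs are passed in are made up for illustration, and only the 0.4 threshold comes from the message quoted above. The idea is that after each FFT multiplication every value should land very close to an integer. If redoing the same iteration from the same save file gives a clean result, the first failure was a glitch. If it fails the same way every time, the FFT size is the more likely suspect.

```python
def roundoff_error(values):
    """Largest distance from any value to its nearest integer (between 0.0 and 0.5)."""
    return max(abs(v - round(v)) for v in values)


def classify(first_run, retry_run, threshold=0.4):
    """Compare one iteration's output with a retry from the same saved state.

    first_run / retry_run are hypothetical lists of floating-point results
    from the same iteration, computed twice from identical input.
    """
    if roundoff_error(first_run) <= threshold:
        return "ok"
    if roundoff_error(retry_run) <= threshold:
        # Same input, different outcome: the math did not change, the hardware did.
        return "transient: likely hardware (clocks, heat, memory)"
    # Same input, same bad outcome: the error is deterministic.
    return "reproducible: likely FFT size too small"


# Made-up numbers: the first attempt has one value far from an integer,
# the retry from the save file is clean.
print(classify([1.45, 2.0], [1.001, 2.0]))
```

The “problem went away, moving on” message corresponds to the transient case here, which is consistent with lower clocks making it disappear.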
