I've since found the time to test your VBA code on my workplace PC.
While I'm getting only 550e6/s (indicating that my PC is slower than yours), I can confirm that the two programs are comparable in speed.
Moreover, since both programs yield identical results, we can be more confident that our independent algorithms and implementations are both working correctly.