Some more timings on Linux with no X:
Athlon 1050MHz
13.014 ms/iteration llr362
13.028 ms/iteration llr35
...new LLR slightly faster (0.1%)
Athlon XP1600+
9.432 ms/iteration llr362
9.401 ms/iteration llr35
...old LLR faster (0.3%)
Athlon XP2000+
8.278 ms/iteration llr362
8.250 ms/iteration llr35
...old LLR faster (0.3%)
So it seems that on Athlon XPs (unlike the older plain Athlons) it is slightly better to run LLR35, though the difference is only about 0.3%.
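
For anyone who wants to check the percentages, here is a small sketch that recomputes them from the timings above. The dictionary layout and the `compare` helper are just my own illustration, not part of LLR; the gain is measured relative to the slower of the two versions.

```python
# Timings in ms/iteration, copied from the figures above.
# The data layout and helper name are illustrative assumptions.
timings = {
    "Athlon 1050MHz": {"llr362": 13.014, "llr35": 13.028},
    "Athlon XP1600+": {"llr362": 9.432, "llr35": 9.401},
    "Athlon XP2000+": {"llr362": 8.278, "llr35": 8.250},
}


def compare(t):
    """Return (faster_version, percent gain relative to the slower version)."""
    faster = min(t, key=t.get)  # lower ms/iteration means faster
    slower = max(t, key=t.get)
    gain = (t[slower] - t[faster]) / t[slower] * 100
    return faster, gain


for cpu, t in timings.items():
    faster, gain = compare(t)
    print(f"{cpu}: {faster} faster by {gain:.1f}%")
```

With differences this small (0.1% to 0.3%), it would probably be worth repeating each run a few times to see whether the gap is larger than the run-to-run noise.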