Old 2013-03-19, 11:28   #3
alpertron
Aug 2002
Buenos Aires, Argentina

25518 Posts

I found that the main problem with most ARM processors is that they do not include a division instruction, so divisions are emulated in software. This means that a division costs about as much as 100 multiplications (roughly 360 clocks vs. 3 clocks for 32-bit operands).

I'm starting to replace division and remainder calculations with multiplications where possible (for example, by using Montgomery multiplication). Until now I have only used Montgomery multiplication for multiword operands, but it is clear that the algorithm will be needed even for 32-bit operands.
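To illustrate the idea for single-word operands, here is a minimal sketch of 32-bit Montgomery multiplication in C. It is not the code from my program: the names `mont32`, `mont32_init`, `mont32_mul` and `mont32_mulmod` are made up for this example, and it assumes an odd modulus n and operands already reduced below n. The setup step still uses `%` twice, but only once per modulus; the multiplication itself needs no division.

```c
#include <stdint.h>

/* Hypothetical helper type: precomputed constants for an odd modulus n. */
typedef struct {
    uint32_t n;     /* odd modulus */
    uint32_t ninv;  /* n^(-1) mod 2^32 */
    uint32_t r2;    /* (2^32)^2 mod n, used to enter Montgomery form */
} mont32;

/* One-time setup per modulus. This is the only place that divides. */
static mont32 mont32_init(uint32_t n)
{
    mont32 m;
    uint32_t inv = n;                 /* correct to 3 bits because n*n = 1 (mod 8) */
    for (int i = 0; i < 4; i++)
        inv *= 2u - n * inv;          /* Newton step: doubles the correct bits */
    uint64_t r = ((uint64_t)1 << 32) % n;
    m.n = n;
    m.ninv = inv;
    m.r2 = (uint32_t)((r * r) % n);
    return m;
}

/* Returns a*b/2^32 mod n for a, b < n, using multiplications only. */
static uint32_t mont32_mul(const mont32 *m, uint32_t a, uint32_t b)
{
    uint64_t t = (uint64_t)a * b;
    uint32_t q = (uint32_t)t * m->ninv;          /* q*n has the same low word as t */
    uint32_t t_hi = (uint32_t)(t >> 32);
    uint32_t qn_hi = (uint32_t)(((uint64_t)q * m->n) >> 32);
    /* (t - q*n) / 2^32 lies in (-n, n); add n back if it is negative. */
    return t_hi >= qn_hi ? t_hi - qn_hi : t_hi - qn_hi + m->n;
}

/* Plain a*b mod n for a, b < n: convert a to Montgomery form, then multiply. */
static uint32_t mont32_mulmod(const mont32 *m, uint32_t a, uint32_t b)
{
    uint32_t a_mont = mont32_mul(m, a, m->r2);   /* a * 2^32 mod n */
    return mont32_mul(m, a_mont, b);             /* (a * 2^32) * b / 2^32 mod n */
}
```

In a real inner loop (modular exponentiation, for example) you would keep the values in Montgomery form across many multiplications and convert back only at the end, so the cost per step is three 32x32 multiplications instead of a software division.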