![]() |
![]() |
#45 | |
"Andrew Booker"
Mar 2013
1258 Posts |
![]() Quote:
Anyway, the point is that the 63-bit mulredc works as long as the product does not exceed 127 bits. That means you can afford one unmodded addition before multiplying (so mulredc(a,b+c) is OK, but mulredc(a+b,c+d) is not). |
|
![]() |
![]() |
![]() |
#46 | |
"Dana Jacobsen"
Feb 2011
Bangkok, TH
22×227 Posts |
![]() Quote:
Technically Ben wrote the 63-bit asm, then after I fruitlessly fiddled a bit and mentioned I'd like a clean 64-bit version, he wrote the 64-bit version. I put the switch inside the function, and noted that ideally for performance the caller would choose the correct one. Probably due to good branch prediction there didn't seem to be much penalty for doing it the easy way (switch inside the function). One could of course just use the 64-bit version but that does give a measurable performance difference and doing the switch gives me almost the best of both. Re the sequence x*x+a vs. x*(x+a) it gives me about a 5% speedup in my code and works great up to 63-bit. It does not work right for 64-bit inputs (where x*x+a does work since this is with a 64-bit mulredc). Last fiddled with by danaj on 2017-10-12 at 23:36 |
|
![]() |
![]() |
![]() |
#47 | |
"Ben"
Feb 2007
3,371 Posts |
![]() Quote:
|
|
![]() |
![]() |
![]() |
#48 | ||
"Andrew Booker"
Mar 2013
5·17 Posts |
![]() Quote:
Quote:
|
||
![]() |
![]() |
![]() |
#49 |
"Andrew Booker"
Mar 2013
10101012 Posts |
![]()
With 64-bit inputs, all additions have to be addmods, since the computation of a+b can overflow. There is one exception to that: when x is at most n-1 you can compute mulredc(x,x+1) without risk of overflow. The problem is that 1 is not Montgomery's 1, but if you don't care about using the same polynomial every time (which is fine if we're thinking of it as a randomized algorithm and are happy to allow occasional failures) then you could get this speedup for 64-bit inputs as well.
|
![]() |
![]() |
![]() |
#50 |
"Forget I exist"
Jul 2009
Dumbassville
26×131 Posts |
![]()
and so can unrolling loops ( at least using pure asm instead of compiler optimization. in most asm, loops are if statements enclosed in a header, that can be jumped back to to create a piece of code that runs multiple times sometimes if you know how many runs you are doing of it unrolling it a bit makes sense and can speed it up as jmp/jne/jle/etc. commands can be expensive. so depending on how fast the code inside is you may be able to do it a few more times without using a jump and speed it up.
|
![]() |
![]() |
![]() |
#51 | |
"Ben"
Feb 2007
1101001010112 Posts |
![]() Quote:
Last fiddled with by bsquared on 2017-10-13 at 14:38 |
|
![]() |
![]() |
![]() |
#52 |
"Andrew Booker"
Mar 2013
5·17 Posts |
![]()
Good point! In fact for n > 2^63, Montgomery's 1 is just 2^64-n. So x+one never overflows, and mulredc(x,x+one) should work as expected even with 64-bit inputs.
|
![]() |
![]() |
![]() |
#53 | ||
"Dana Jacobsen"
Feb 2011
Bangkok, TH
16148 Posts |
![]() Quote:
Quote:
|
||
![]() |
![]() |
![]() |
#54 | |
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
160658 Posts |
![]() Quote:
|
|
![]() |
![]() |
![]() |
#55 |
"Forget I exist"
Jul 2009
Dumbassville
100000110000002 Posts |
![]()
I know such things exist ( at least by name), but I also was replying to a specific statement, that seemed to be a misunderstanding by arbooker. The statement suggest, they think that just because a program has more code to it, it should take longer to execute.
Last fiddled with by science_man_88 on 2017-10-13 at 22:41 |
![]() |
![]() |
![]() |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Factoring Mersenne numbers | paulunderwood | Miscellaneous Math | 18 | 2017-08-27 14:56 |
Factoring Big Numbers In c# | ShridharRasal | Factoring | 10 | 2008-03-20 17:17 |
Question about factoring code | Peter Nelson | Software | 9 | 2005-03-30 18:28 |
Factoring Fermat Numbers | Axel Fox | Software | 14 | 2003-07-04 18:57 |
Questions about SSE2 code and Factoring | Joe O | Software | 2 | 2002-09-13 23:39 |