#1 |
Jun 2003
7×167 Posts |
Is there a way to handle the carry flag in C or C++?
It would be nice to do something like this:
Code:
uint64_t *aptr, *bptr;
...
for (i=start;i<end;i++)
    aptr[i]+c=bptr[i];
The only way I can think of actually doing this in C would be to handle the carries explicitly, essentially reimplementing at the high level what the hardware is perfectly capable of doing far more efficiently by itself.
#2 | |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
2³×3²×5×17 Posts |
No. Unless you use inline asm.
Quote:
BTW: You could use inline asm. |
#3 |
Jun 2003
491₁₆ Posts |
#4 |
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
5,857 Posts |
#5 |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
2³×3²×5×17 Posts |
Create your own language based upon C but supporting access to the hardware CPU carry flag. Call it C+c, or something else more catchy.
BTW: You need to be careful about making sure the CPU you compile your new language to uses the carry in the same way as you expect. Different CPUs generate the carry differently (like ARM vs x86 for instance). |
#6 | |
Jun 2003
2221₈ Posts |
Quote:
But that's as maybe. The point is that I'm not familiar enough with the architecture of a 21st-century Core i5 to feel confident of doing it right, and still less of doing it even halfway optimally. One thing I could do would be to write the program without the carry functionality. Just ignore it. Then disassemble the resulting code and replace the add instructions with add-with-carry instructions. Dunno if that would work, but it's a plan.
#7 |
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
1011011100001₂ Posts |
I imagine there are people on the forum willing to help with a good macro.
#8 |
Tribal Bullet
Oct 2004
3³·131 Posts |
Accessing the carry flag is easy using inline asm, but it's less easy to come up with the asm that does exactly what you want. For example, you have a loop:
for (i = 0; i < n; i++) {
    asm(add_with_carry_here)
}
It does what you want, but using the carry flag will only work inside the loop, and if it's not implemented exactly right then the arithmetic for handling 'i' will silently mess up the carry flag. Even if it's implemented correctly there could be a lot of register shuffling around the magic use-the-carry instruction. Likewise, since the carry is implicit on x86 you can use it, but actually saving it will take one or two more instructions that may negate the benefit of dropping to this level.
Of course, many processors don't have a carry flag at all (MIPS, Alpha), some have one carry flag but lots of other flag registers (PowerPC), and ARM has a carry flag that works the opposite way sometimes.
For an example of inline asm that has an entire multiple precision add loop in it, where the arrays are of fixed size, see the mp_add and mp_sub functions here.
Last fiddled with by jasonp on 2013-03-27 at 17:10
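As a rough illustration of the fixed-size idea above (not the linked mp_add code), here is a minimal sketch that keeps the whole add/adc chain inside a single asm statement, so no compiler-generated loop arithmetic can run between the carry-dependent instructions. It assumes GCC or Clang extended asm on x86-64; the name `add4` and the 4-limb size are hypothetical, and other targets take a plain C fallback.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a += b for 4-limb numbers, least significant limb
   first. Returns the carry out of the top limb (0 or 1). */
static uint64_t add4(uint64_t a[4], const uint64_t b[4])
{
    assert(a != 0 && b != 0);
#if defined(__GNUC__) && defined(__x86_64__)
    uint64_t a0 = a[0], a1 = a[1], a2 = a[2], a3 = a[3];
    uint64_t cy = 0;
    /* One add followed by adc's; the final adc folds the flag into cy.
       "+&r" keeps the in/out registers distinct from every input. */
    __asm__ ("addq %[b0], %[a0]\n\t"
             "adcq %[b1], %[a1]\n\t"
             "adcq %[b2], %[a2]\n\t"
             "adcq %[b3], %[a3]\n\t"
             "adcq $0, %[cy]"
             : [a0] "+&r" (a0), [a1] "+&r" (a1), [a2] "+&r" (a2),
               [a3] "+&r" (a3), [cy] "+&r" (cy)
             : [b0] "rm" (b[0]), [b1] "rm" (b[1]),
               [b2] "rm" (b[2]), [b3] "rm" (b[3])
             : "cc");
    a[0] = a0; a[1] = a1; a[2] = a2; a[3] = a3;
    return cy;
#else
    /* Portable fallback (assumption: used on any non-x86-64 target). */
    uint64_t cy = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t s = a[i] + b[i];
        uint64_t c = s < b[i];     /* wrapped in the first addition */
        a[i] = s + cy;
        cy = c | (a[i] < s);       /* or wrapped when adding the carry */
    }
    return cy;
#endif
}
```

Calling `add4(a, b)` on two 4-element `uint64_t` arrays updates `a` in place; the returned value is what a following limb would take as its carry-in.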
#9 |
∂²ω=0
Sep 2002
República de California
3×5³×31 Posts |
I'd say it boils down to a binary decision:
0. It's performance-critical: Use inline ASM. [Hey, if I can learn it after age 40, there may be hope for us yet.]
1. It's not performance-critical: Emulate using unsigned integer compare, e.g.
uint *a,*b, cy = 0;
loop over i:
    a[i] += b[i] + cy;
    cy = a[i] < cy;
/loop
This often proves faster - especially when mixed with other code - even on CPUs supporting a carry flag, due to the easier job it makes in terms of scheduling.
Last fiddled with by ewmayer on 2013-03-27 at 19:00
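One way the compare-based emulation in option 1 might be written in C is sketched below. It is an assumption-laden variant rather than the code as posted: the addition is split into two steps so that a wraparound in either step is detected by its own comparison. The name `add_n_compare` and the choice of 64-bit limbs are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a[0..n-1] += b[0..n-1], least significant limb
   first. Returns the carry out of the last limb (0 or 1). */
static uint64_t add_n_compare(uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t cy = 0;
    assert(n == 0 || (a != NULL && b != NULL));
    for (size_t i = 0; i < n; i++) {
        uint64_t s  = a[i] + b[i];   /* unsigned add, may wrap */
        uint64_t c1 = s < b[i];      /* wrapped iff the sum is below an addend */
        uint64_t t  = s + cy;        /* now add the incoming carry */
        uint64_t c2 = t < cy;        /* wraps only if s was all ones and cy == 1 */
        a[i] = t;
        cy = c1 | c2;                /* the two wraps cannot both happen */
    }
    return cy;
}
```

The second comparison costs one extra instruction or two per limb, but the loop stays branch-free, which fits the scheduling argument made above.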
#10 | |
Jun 2003
7×167 Posts |
Quote:
Even if you interpret the < as a signed inequality, it won't work. For example, if a[i] = b[i] = 10101010₂, adding b[i] to a[i] will generate a carry, but the sum will be 0101010c₂, a positive number.
A bitwise test for a carry is rather difficult to do. Basically uint a + uint b + carry-in will generate a carry-out if and only if at least one of the following is true:
1. The result is zero
2. The most significant zero bit of a XOR b corresponds to 1 in a (or equivalently in b)
I can't see a way to test for this without multiple tests and branches.
Here's how it probably should be done:
short_uint *a,*b;
long_uint cy = 0;
loop over i:
    cy += a[i];
    cy += b[i];
    a[i] = cy;
    Shift right cy, (size of short_uint);
/loop
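The wide-accumulator loop above translates to C almost line for line. A minimal sketch, assuming `uint32_t` stands in for short_uint and `uint64_t` for long_uint; the function name `add_n_wide` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a[0..n-1] += b[0..n-1] using 32-bit limbs and a
   64-bit accumulator. Returns the carry out of the last limb (0 or 1). */
static uint32_t add_n_wide(uint32_t *a, const uint32_t *b, size_t n)
{
    uint64_t cy = 0;                 /* low 32 bits: limb, bit 32: carry */
    assert(n == 0 || (a != NULL && b != NULL));
    for (size_t i = 0; i < n; i++) {
        cy += a[i];
        cy += b[i];                  /* at most 2*(2^32 - 1) + 1, fits easily */
        a[i] = (uint32_t)cy;         /* keep the low limb */
        cy >>= 32;                   /* what is left is the carry, 0 or 1 */
    }
    return (uint32_t)cy;
}
```

The price is that each limb is half the width of the widest native integer, so the loop needs twice as many iterations as a full-width add-with-carry chain would.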
#11 |
Jun 2003
2×3³×7×13 Posts |
I think "cy = a[i] < b[i]" is the correct version
EDIT:- Even this version has a corner case that it can't handle. Hmmm...
Last fiddled with by axn on 2013-03-28 at 04:19