mersenneforum.org C and the scarry flag

2013-03-27, 09:45   #1
Mr. P-1

Jun 2003

7×167 Posts

C and the scarry flag

Is there a way to handle the carry flag in C or C++? It would be nice to do something like this:

Code:
`uint_64* aptr, bprt; ... for (i=start;i

2013-03-27, 10:58   #2
retina
Undefined

"The unspeakable one"
Jun 2006
My evil lair

1011011110111₂ Posts

Quote:
 Originally Posted by Mr. P-1 Is there a way to handle the carry flag in C or C++?
No. Unless you use inline asm.
Quote:
 Originally Posted by Mr. P-1 The only way I can think of actually doing this in c would be to handle the carries explicitly, essentially reimplementing at the high level what the hardware is perfectly capable of doing far more efficiently by itself.
Yes. Unless you use inline asm.

BTW: You could use inline asm.

2013-03-27, 13:45   #3
Mr. P-1

Jun 2003

2221₈ Posts

Quote:
 Originally Posted by retina No. Unless you use inline asm. Yes. Unless you use inline asm. BTW: You could use inline asm.
I was hoping to avoid having to use inline asm.

2013-03-27, 13:48   #4
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

1011001110100₂ Posts

Quote:
 Originally Posted by Mr. P-1 I was hoping to avoid having to use inline asm.
If you just want to avoid having the inline asm everywhere you could use an inline asm macro. If it is that you don't want to be compiling asm then no can do.

2013-03-27, 13:56   #5
retina
Undefined

"The unspeakable one"
Jun 2006
My evil lair

5,879 Posts

Quote:
 Originally Posted by Mr. P-1 I was hoping to avoid having to use inline asm.
Create your own language based upon C but supporting access to the hardware CPU carry flag. Call it C+c, or something else more catchy.

BTW: You need to be careful about making sure the CPU you compile your new language to uses the carry in the same way as you expect. Different CPUs generate the carry differently (like ARM vs x86 for instance).

2013-03-27, 14:49   #6
Mr. P-1

Jun 2003

7×167 Posts

Quote:
 Originally Posted by henryzz If you just want to avoid having the inline asm everywhere you could use an inline asm macro. If it is that you don't want to be compiling asm then no can do.
I haven't written any serious asm in a long, long time, and it wasn't Intel. It was for a Motorola M68010, which was a beautiful processor from the programmer's point of view. It had a flat address space and a largely orthogonal instruction set, at a time when the comparable Intel chip, probably an 80286 or thereabouts, was a total mess in terms of its segmented address space and which register you could use for what. I've always thought it a crying shame that Intel and not Motorola won the processor war.

But that's as maybe. The point is that I'm really not familiar enough with the architecture of a 21st century Core i5 to feel confident of doing it right, and still less of doing it even halfway optimally.

One thing I could do would be to write the program without the carry functionality. Just ignore it. Then disassemble the resulting code and replace the add instructions with add-with-carry instructions.

Dunno if that would work, but it's a plan.

2013-03-27, 15:17   #7
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT/BST)

13164₈ Posts

I imagine there are people on the forum willing to help with a good macro.

2013-03-27, 17:04   #8
jasonp
Tribal Bullet

Oct 2004

3·1,163 Posts

Accessing the carry flag is easy using inline asm, but it's less easy to come up with the asm that does exactly what you want. For example, you have a loop:

Code:
`for (i = 0; i < n; i++) { asm(add_with_carry_here) }`

It does what you want, but using the carry flag will only work inside the loop, and if it's not implemented exactly right then the arithmetic for handling 'i' will silently mess up the carry flag. Even if it's implemented correctly there could be a lot of register shuffling around the magic use-the-carry instruction. Likewise, since the carry is implicit on x86 you can use it, but actually saving it will take one or two more instructions that may negate the benefit of dropping to this level.

Of course, many processors don't have a carry flag at all (MIPS, Alpha), some have one carry flag but lots of other flag registers (PowerPC), and ARM has a carry flag that works the opposite way sometimes.

For an example of inline asm that has an entire multiple precision add loop in it, where the arrays are of fixed size, see the mp_add and mp_sub functions here.

Last fiddled with by jasonp on 2013-03-27 at 17:10

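For readers who want the compiler to track the carry without hand-written asm, here is a minimal sketch of the add loop described above. It assumes GCC or Clang, since `__builtin_add_overflow` is a compiler extension rather than standard C. The function name `mp_add_builtin`, the 64-bit limb size, and the least-significant-limb-first layout are illustrative assumptions, not anything from the thread.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: a[0..n-1] += b[0..n-1], least significant limb first.
   Returns the carry out of the top limb (0 or 1).
   Assumes GCC or Clang; __builtin_add_overflow is not standard C. */
unsigned mp_add_builtin(uint64_t *a, const uint64_t *b, size_t n)
{
    unsigned cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s;
        /* Each builtin returns nonzero when the true sum does not fit in s. */
        unsigned c1 = __builtin_add_overflow(a[i], b[i], &s);
        unsigned c2 = __builtin_add_overflow(s, (uint64_t)cy, &s);
        a[i] = s;
        cy = c1 | c2;   /* at most one of the two additions can wrap */
    }
    return cy;
}
```

Whether the compiler turns this into an add-with-carry chain depends on the compiler version and optimization level, so it is worth checking the generated code before relying on it for performance.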
2013-03-27, 18:57   #9
ewmayer
∂²ω=0

Sep 2002
República de California

2²·3³·7·13 Posts

I'd say it boils down to a binary decision:

0. It's performance-critical: Use inline ASM. [Hey, if I can learn it after age 40, there may be hope for us yet.]

1. It's not performance-critical: Emulate using unsigned integer compare, e.g.

uint *a,*b, cy = 0;
loop over i:
a[i] += b[i] + cy;
cy = a[i] < cy;
/loop

This often proves faster - especially when mixed with other code - even on CPUs supporting a carry flag, due to the easier job it makes in terms of scheduling.

Last fiddled with by ewmayer on 2013-03-27 at 19:00

2013-03-28, 03:39   #10
Mr. P-1

Jun 2003

7×167 Posts

Quote:
 Originally Posted by ewmayer I'd say it boils down to a binary decision: 0. It's performance-critical: Use inline ASM. [Hey, if I can learn it after age 40, there may be hope for us yet.] 1. It's not performance-critical: Emulate using unsigned integer compare, e.g. uint *a,*b, cy = 0; loop over i: a[i] += b[i] + cy; cy = a[i] < cy; /loop
That won't work. a[i] can only be < cy if a[i] is 0 and cy is 1

Even if you interpret the < as a signed inequality, it won't work. For example, if a[i] = b[i] = 10101010₂, adding b[i] to a[i] will generate a carry, but the sum will be 0101010c₂ (c being the carry-in bit), a positive number.

A bitwise test for a carry is rather difficult to do. Basically uint a + uint b + carry-in will generate a carry-out if and only if at least one of the following is true:

1. The result is zero

2. The most significant zero bit of a XOR b corresponds to 1 in a (or equivalently in b)

I can't see a way to test for this without multiple tests and branches.
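As an aside on the bitwise test: a branch-free check is possible if the carry is read off the top bit, where the carry-out is the majority of the two operand bits and the carry into that position. The sketch below assumes 64-bit operands and a carry-in of 0 or 1; the name `carry_out` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if a + b + cin overflows 64 bits, else 0.
   cin must be 0 or 1. No branches are needed. */
unsigned carry_out(uint64_t a, uint64_t b, unsigned cin)
{
    uint64_t sum = a + b + cin;   /* wraps modulo 2^64 */
    /* Top bit of (a & b): both operands have the top bit set, so it carries.
       Top bit of ((a | b) & ~sum): exactly one operand has it set and the
       sum bit came out 0, so a carry must have come in from below. */
    return (unsigned)(((a & b) | ((a | b) & ~sum)) >> 63);
}
```

For example, `carry_out(UINT64_MAX, 0, 1)` reports a carry even though the wrapped sum is zero, while `carry_out(0, 0, 0)` does not.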

Here's how it probably should be done:

short_uint *a,*b;
long_uint cy = 0;

loop over i:
cy += a[i];
cy += b[i];
a[i] = cy;
Shift right cy, (size of short_uint);
/loop
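The pseudocode above translates fairly directly into C. This is a minimal sketch that assumes `uint32_t` for the short type and `uint64_t` for the long one, with limbs stored least significant first; the function name `mp_add_wide` is made up for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: a[0..n-1] += b[0..n-1] using a double-width accumulator.
   Returns the carry out of the top limb (0 or 1). */
uint32_t mp_add_wide(uint32_t *a, const uint32_t *b, size_t n)
{
    uint64_t cy = 0;             /* never exceeds 2^33 - 1, so it cannot overflow */
    for (size_t i = 0; i < n; i++) {
        cy += a[i];
        cy += b[i];
        a[i] = (uint32_t)cy;     /* low 32 bits become the result limb */
        cy >>= 32;               /* what remains (0 or 1) carries into the next limb */
    }
    return (uint32_t)cy;
}
```

The trade-off is that each limb is only half the width of the widest integer type, so twice as many iterations are needed as with full-width limbs.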


2013-03-28, 04:07   #11
axn

Jun 2003

4777₁₀ Posts

Quote:
 Originally Posted by Mr. P-1 That won't work. a[i] can only be < cy if a[i] is 0 and cy is 1
I think "cy = a[i] < b[i]" is the correct version

EDIT:- Even this version has a corner case that it can't handle. Hmmm...

Last fiddled with by axn on 2013-03-28 at 04:19
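One way to sidestep the corner case with full-width limbs is to do the two additions separately and test each for wrap-around. The corner case is presumably the one where adding the carry-in itself wraps, for instance when a[i] is all ones and cy is 1. The following is a sketch only: `mp_add_cmp`, the 64-bit limbs, and the least-significant-first order are assumptions for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: a[0..n-1] += b[0..n-1], least significant limb first.
   Returns the carry out of the top limb (0 or 1). */
uint64_t mp_add_cmp(uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t  = a[i] + cy;   /* wraps only if a[i] is all ones and cy is 1 */
        uint64_t c1 = t < cy;      /* 1 if the first addition wrapped */
        uint64_t s  = t + b[i];
        uint64_t c2 = s < b[i];    /* 1 if the second addition wrapped */
        a[i] = s;
        cy = c1 | c2;              /* if c1 is 1 then t is 0, so c2 must be 0 */
    }
    return cy;
}
```

Each comparison is applied to a single two-operand addition, where a wrapped unsigned sum is always smaller than either operand, so no case is missed.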
