To the best of my knowledge, this is the only safe way (without -fwrapv) to check whether the product of two signed integers overflowed:

  bool product_does_not_overflow(signed x, signed y)
  {
    unsigned tmp = x * unsigned(y);

    return signed(tmp) > 0 && tmp / x == unsigned(y);
  }

(I believe C and C++ are the same in this regard, but I could be wrong. If there is a better way to write this test, I would love to know about it.)

g++ 4.6 produces this assembly dump on x86-64:

  _Z25product_does_not_overflowii:
          movl    %esi, %edx
          xorl    %eax, %eax
          imull   %edi, %edx
          testl   %edx, %edx
          jle     .L2
          movl    %edx, %eax
          xorl    %edx, %edx
          divl    %edi
          cmpl    %eax, %esi
          sete    %al
  .L2:
          rep ret

but, if I understand the semantics of IMUL correctly, it could do this instead:

  _Z25product_does_not_overflowii:
          xorl    %eax, %eax
          imull   %edi, %esi
          setno   %al
          ret

which is a pretty substantial micro-win, particularly in getting rid of a divide.

On Tue, 12 Apr 2011, zackw at panix dot com wrote:

> To the best of my knowledge, this is the only safe way (without -fwrapv) to
> check whether the product of two signed integers overflowed:
>
> bool product_does_not_overflow(signed x, signed y)
> {
>   unsigned tmp = x * unsigned(y);
>
>   return signed(tmp) > 0 && tmp / x == unsigned(y);
> }

Two signed integers given that they are known to be positive, anyway. This may return unexpected results if either or both arguments are negative or zero.

I sort of think GCC should have built-in functions exposing C and C++ interfaces for: each basic arithmetic operation, defined to wrap; each basic arithmetic operation, defined to saturate; each basic arithmetic operation, defined to have undefined overflow; each basic arithmetic operation, with a separate overflow flag being set; each basic arithmetic operation, defined to trap on overflow. All of these for both signed and unsigned, and for any desired number of bits (up to the maximum number supported for arithmetic, so generally 1-64 bits on 32-bit configurations and 1-128 bits on 64-bit configurations); except for the defined-to-trap case, all would still have undefined behavior on division by 0.

You could then have optimizations mapping generic C idioms to such built-in operations where the target supports efficient code for the operations. But this rather relies on the no-undefined-overflow work being finished first, so that some of the required operations actually exist inside GCC before they can easily be exposed to the user.

> which is a pretty substantial micro-win, particularly in getting rid of a
> divide.

(If the function gets called with one constant operand, you can make it inline and use __builtin_constant_p to replace a divide with a range check on the other operand. That's only useful for some cases of overflow checks, of course.)
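The __builtin_constant_p trick mentioned above can be sketched roughly as follows (the helper name and the widening fallback are mine, not from the thread). When y is a literal, INT_MAX / y folds to a constant, so the check becomes a single comparison with no run-time division:

```c
#include <limits.h>

/* Sketch: overflow check for x * y with both operands known strictly
   positive.  When y is a compile-time constant, the division INT_MAX / y
   is folded away; otherwise fall back to widening arithmetic (assumes
   long long is wider than int). */
static inline int pos_product_ok(int x, int y)
{
    if (__builtin_constant_p(y))
        return x <= INT_MAX / y;          /* folds to a range check on x */
    return (long long)x * y <= INT_MAX;   /* generic fallback */
}
```

Both branches compute the same predicate for positive operands, so the result is correct whether or not the compiler takes the constant branch.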

(In reply to comment #1)

> Two signed integers given that they are known to be positive, anyway.
> This may return unexpected results if either or both arguments are
> negative or zero.
...
> (If the function gets called with one constant operand, you can make it
> inline and use __builtin_constant_p to replace a divide with a range check
> on the other operand. That's only useful for some cases of overflow
> checks, of course.)

In the code that this is cut down from, both arguments are known to be strictly positive, but neither is constant. (They're only signed for historical reasons, I think, but it would be a huge amount of work to change that.)

> I sort of think GCC should have built-in functions exposing C and C++
> interfaces for: each basic arithmetic operation, defined to wrap; each
> basic arithmetic operation, defined to saturate; each basic arithmetic
> operation, defined to have undefined overflow; each basic arithmetic
> operation, with a separate overflow flag being set; each basic arithmetic
> operation, defined to trap on overflow. All of these for both signed and
> unsigned and for any desired number of bits (up to the maximum number
> supported for arithmetic, so generally 1-64 bits on 32-bit configurations
> and 1-128 bits on 64-bit configurations); except for the defined-to-trap
> case, all would still have undefined behavior on division by 0. You could
> then have optimizations mapping generic C idioms to such built-in
> operations where the target supports efficient code for the operations.
> But this rather relies on the no-undefined-overflow work being finished
> first so that some of the required operations actually exist inside GCC,
> before they can easily be exposed to the user.

So you see this as more of a tree-level than an RTL-level missed optimization, then? Your plan sounds fine to me, although personally I might look for a less ambitious but more likely to get done soon approach.

On Tue, 12 Apr 2011, zackw at panix dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580
>
> --- Comment #2 from Zack Weinberg <zackw at panix dot com> 2011-04-12 20:40:41 UTC ---
> (In reply to comment #1)
> >
> > Two signed integers given that they are known to be positive, anyway.
> > This may return unexpected results if either or both arguments are
> > negative or zero.
> ...
> > (If the function gets called with one constant operand, you can make it
> > inline and use __builtin_constant_p to replace a divide with a range check
> > on the other operand. That's only useful for some cases of overflow
> > checks, of course.)
>
> In the code that this is cut down from, both arguments are known to be strictly
> positive, but neither is constant. (They're only signed for historical
> reasons, I think, but it would be a huge amount of work to change that.)

My point in noting the need for the integers to be positive was really that unless the compiler knows they are positive, the transformation you're asking for appears to be incorrect - the semantics of your function are that a product with either term 0 counts as overflowing, but using a processor overflow flag would report it as not overflowing.

> So you see this as more of a tree-level than an RTL-level missed optimization,
> then? Your plan sounds fine to me, although I might look for a less ambitious
> but more likely to get done soon approach, personally.

I think it's both levels, but probably tree-level for spotting the patterns, unless it seems likely that expansion or RTL optimizations will produce more instances of them. And tree-level is where we have VRP information, if you want the compiler to tell that way that the arguments must be strictly positive.

On Tue, Apr 12, 2011 at 1:52 PM, joseph at codesourcery dot com <gcc-bugzilla@gcc.gnu.org> wrote:

>> In the code that this is cut down from, both arguments are known to be strictly
>> positive, but neither is constant. (They're only signed for historical
>> reasons, I think, but it would be a huge amount of work to change that.)
>
> My point in noting the need for the integers to be positive was really
> that unless the compiler knows they are positive, the transformation
> you're asking for appears to be incorrect - the semantics of your function
> are that a product with either term 0 counts as overflowing, but using a
> processor overflow flag would report it as not overflowing.

Well, if the compiler didn't know that, it could still use the overflow flag plus an extra test for either input operand being zero, couldn't it? The C idiom has to test for a zero result, because e.g. 0x4000_0000U * 16 wraps to zero.

(The original code does in fact check for x or y <= 0 in a place where VRP would notice; I should have said that instead of "known to be strictly positive", sorry for any confusion.)

Addendum: what would *you* describe as the correct C idiom for ensuring that the product of two signed integers was positive and did not overflow the range of a same-sized signed integer, assuming nothing about either multiplicand?

On Tue, 12 Apr 2011, zackw at panix dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580
>
> --- Comment #4 from Zack Weinberg <zackw at panix dot com> 2011-04-12 21:03:01 UTC ---
> On Tue, Apr 12, 2011 at 1:52 PM, joseph at codesourcery dot com
> <gcc-bugzilla@gcc.gnu.org> wrote:
> >> In the code that this is cut down from, both arguments are known to be strictly
> >> positive, but neither is constant. (They're only signed for historical
> >> reasons, I think, but it would be a huge amount of work to change that.)
> >
> > My point in noting the need for the integers to be positive was really
> > that unless the compiler knows they are positive, the transformation
> > you're asking for appears to be incorrect - the semantics of your function
> > are that a product with either term 0 counts as overflowing, but using a
> > processor overflow flag would report it as not overflowing.
>
> Well, if the compiler didn't know that, it could still use the
> overflow flag plus an extra test for either input operand being zero,
> couldn't it? The C idiom has to test for a zero result, because e.g.
> 0x4000_0000U * 16 wraps to zero.

Yes (a check for them being <= 0, that is; I think that function will report any case with negative operands as overflowing as well).

On Tue, 12 Apr 2011, zackw at panix dot com wrote:

> Addendum: what would *you* describe as the correct C idiom for
> ensuring that the product of two signed integers was positive and did
> not overflow the range of a same-sized signed integer, assuming
> nothing about either multiplicand?

I think that's bordering on complicated enough that any two people are likely to produce different enough code that it may not be worth detecting. The cases where generic C becomes complicated like that are the ones providing the best case for built-in operations, where you can just say "multiply two (signed) values, check whether the result fits in 31-bit unsigned, and set an overflow flag accordingly".

I too would see it as two pieces: pattern matching on trees to produce a canonical builtin call, and RTL expansion of that for optimal target code generation (where I don't know whether we can do better than using UNSPECs).

Note that usually you also want to use the result of the multiplication (if it didn't overflow), and using just a single multiplication might be even more complicated (if we need to go the UNSPEC way). For the latter reason I think the builtins should be something like __builtin_smul_ovfl_p (multiplication-result, op0, op1): pass in the multiplication result and keep the multiplication itself in the IL, to allow regular optimizations to work on it. If the multiplication is only used as the builtin call argument, expansion can get rid of it.

The set of builtins with defined behavior is still useful, if only to allow mixing of wrapv/trapv/... operations in C source. But it isn't exactly the form I'd like the operations to take in the IL.

Pattern-matching the multiplication overflow code can be tricky because of the various ways the handling of zero operands can be implemented. Implementing this entirely on the RTL side seems very tricky to me.

On Wed, 13 Apr 2011, rguenth at gcc dot gnu.org wrote:

> For the latter reasons I think the builtins should be sth like
> __builtin_smul_ovfl_p (multiplication-result, op0, op1), thus pass
> in the multiplication result and keep the multiplication itself
> in the IL to also allow for regular optimizations to work on them.
> If the multiplication is just used as the builtin call argument
> expansion can get rid of it.

My inclination is that we probably want to separate the API for users from the built-in functions or other operations used internally - possibly providing a header gccarith.h with a supported API, and saying the built-in functions are subject to change and should not be used directly in source code.

Maybe (inspired by the style used in C1X stdatomic.h) one could have operations like

  arith_mul_signed (a, b, 32, ARITH_SAT)

that multiplies a and b (integers to infinite precision, producing an infinite-precision result), saturates the result to 32-bit signed, and returns an implementation-defined type (signed, two's complement, at least 32 bits), or

  arith_mul_signed_check (a, b, 32, ARITH_WRAP, &overflow_p)

that stores an overflow flag in the provided _Bool *. Or the overflow handling (wrap, saturate, undefined, unspecified-value, trap) could be part of the macro/function name (putting it as a separate operand is inspired by the memory_order parameters in stdatomic.h).

The _check versions with explicit overflow flags could be used with any overflow handling other than "undefined behavior". (So if you don't care what the value is when the overflow flag is set - if you'll just call some error handler - then you'd use "unspecified-value" as the overflow handling.)
On divide by zero, both division and modulo operations would have undefined behavior except in the "trap" case - but modulo -1 would always be defined, unlike C's INT_MIN % -1. (I'm thinking of these operations as all being defined on their own by producing an infinite-precision result first, but there might also be operations such as divmod producing multiple results.) It may also make sense to expose different variations of division/modulo (rounding to 0, -infinity or +infinity; a true modulo operation, rounding to -infinity, is one of those things C doesn't make easy; GCC has these as TRUNC_DIV_EXPR, FLOOR_DIV_EXPR, CEIL_DIV_EXPR and the corresponding *_MOD_EXPR codes).

Starting with addition and then increasing the set of operations gradually probably makes sense.

Along with all the arithmetic operations you get range-check operations: take a value (in any integer type) and check whether it is in the signed/unsigned range with a given number of bits, wrapping/saturating and setting overflow flags as needed. These are just like the above but with one fewer operand. The explicit number of bits (required to be an integer constant expression) is deliberate, because it is sometimes useful e.g. to saturate to 16 bits, but integer promotions in C make it rather fragile to rely on 16-bit operands remaining 16-bit rather than getting implicitly promoted to 32 bits. A consequence is that operations with strange numbers of bits can be written without involving bit-field types (since the result is allowed to have more than the given number of bits). I don't expect those to be very efficient in general, but note that ARM, for example, has SSAT and USAT instructions to saturate a value to a given number of bits (up to 32) and set a flag if it was out of range.

There are C and x86 assembly code fragments showing how to do signed and unsigned saturating arithmetic here:

  http://locklessinc.com/articles/sat_arithmetic/

That said, if there were new intrinsics that allowed direct access to the overflow and carry flags from arithmetic instructions, they would be quite useful.

*** Bug 49467 has been marked as a duplicate of this bug. ***

I don't much like the idea of using builtins for operations as fundamental as integer arithmetic. How about this as a straw-man suggestion: adding new qualifiers for "fat" integers-with-flags, somewhat in the spirit of the embedded-C fractional/saturating types? So you might have:

  int x, y;

  void foo (void)
  {
    _Flagged int res;

    res = (_Flagged int) x + y;

    if (_Carry (res))
      printf ("sum carried\n");

    if (_Overflow (res))
      printf ("sum overflowed\n");
  }

This avoids problems with global state, and allows for the programming style which (I vaguely get the impression that) people seem to want - performing a "normal" integer operation, then querying for carry/overflow flags afterwards. These types wouldn't be allowed to cross ABI boundaries (although of course you could use the magic builtins _Carry/_Overflow to extract values to pass to functions), and _Overflow would only be allowed for signed types. You could also have _Borrow for subtraction, maybe (whose meaning would be the inverse of _Carry).

Signalling variants could look like, e.g.:

  void bar (void)
  {
    int res;

    res = (_Signalling _Flagged int) x + y;
  }

Things would get awkward if you tried to use these new constructs in more-complicated expressions, I suppose. Anyway, just an idea.

Coming to think of it, if _Sat were allowed on plain integers too, a _Flagged _Sat int could also be queried for saturation using a similar mechanism, like:

  int foo (_Sat int x, _Sat int y)
  {
    return _Saturated ((_Sat _Flagged) x + y);
  }

I'm probably getting ahead of myself :-).

On Wed, 5 Oct 2011, jules at gcc dot gnu.org wrote:

> I don't much like the idea of using builtins for operations as fundamental as
> integer arithmetic. How about this as a straw-man suggestion: adding new
> qualifiers for "fat" integers-with-flags, somewhat in the spirit of the
> embedded-C fractional/saturating types? So you might have:

The trouble is that the nature of an operation is more a property of the operation than of the type - and the proliferation of types for what should be operations on normal types is much of the problem with what the embedded-C TR does.

You could have pragmas to say that a cumulative flag for a particular scope goes in a particular variable, and that normal operations have particular overflow semantics in that scope, maybe.

(In reply to comment #7)

> […] built-in operations where you can just say "multiply two
> (signed) values, check whether the result fits in 31-bit unsigned and set
> an overflow flag accordingly".

That would be easier to read, easier to maintain, and less difficult to detect. Sounds like an overall win; I'm very much in favor of builtins like these.

(In reply to comment #9)

> arith_mul_signed_check (a, b, 32, ARITH_WRAP, &overflow_p)

I'd rather make the overflow indicator the return value, and transfer the result to a location indicated via a pointer, i.e.

  overflow_p = arith_mul_signed_check (a, b, 32, ARITH_WRAP, &a_times_b)

One reason is that one usually wants to check for overflow BEFORE using the result, i.e. the common use case would likely use the above as the condition of some branching construct. A second reason is that this would allow for a flavor which does not modify a_times_b in case of an overflow, which might be useful in some scenarios, particularly if the result is stored in the same location as one of the operands, thus overwriting the operand. So an unchecked a *= b could be transformed into the checked construct

  overflow_p = arith_mul_signed_check (a, b, 32, ARITH_CAREFUL, &a)

(In reply to comment #14)

> The trouble is that the nature of an operation is more a property of the
> operation than of the type

I agree. The main point of all of this is optimization, and in terms of optimization one would want to examine a flag immediately after the operation that set it: act upon the flag, then discard it. If the information were part of the type, one would require extra storage to save those flags, which would change the size of these types, which in turn would severely impact many other things.

(In reply to comment #15)

> I agree. The main point of all of this is optimization. And in terms of
> optimization, one would want to examine some flag immediately after an
> operation setting that flag. One would act upon the flag, and then discard it.

I somewhat disagree. A program must be correct; it should be secure; and it can be efficient. I'm interested in "correct" and "secure". If a program silently overflows, it's surely not correct, and if an adversary can do something with the errant result, it's not secure either. What's the point in doing something wrong but doing it quickly?

Jeff

(In reply to comment #16)

> I somewhat disagree. A program must be correct; it should be secure;
> and it can be efficient. I'm interested in "correct" and "secure".
> If a program silently overflows, it's surely not correct.

I'm not talking about silently ignoring overflows - quite the contrary. Always doing the correct thing leads to arbitrary-size integers. Checking all (signed) arithmetic leads to -ftrapv. Checking some arithmetic might perhaps be achieved with the signalling types from comment #12, although the semantics for mixed types might be problematic, and the non-signalling versions will only improve things if one actually checks the additional information after the operation, which might easily be forgotten.

Checking individual operations could also (and in my opinion better) be achieved with builtins, and in that case a warning could be issued if the return value indicating the overflow is ignored. Builtins might even allow using specific overflow semantics in code otherwise compiled with -ftrapv, increasing the usability of that flag.

I find it a little disappointing that what should have been a straightforward additional optimization has gotten totally derailed into bikeshedding over an enormous class of builtins that seem unlikely ever to gain any traction. I just want the code in the original bug report to be optimized. That would be enough.

Bug 49467 asked about builtins, and got duped here, so it's small wonder that people who, like me, want a builtin-colored bikeshed end up here...

(In reply to Zack Weinberg from comment #5)

> Addendum: what would *you* describe as the correct C idiom for
> ensuring that the product of two signed integers was positive and did
> not overflow the range of a same-sized signed integer, assuming
> nothing about either multiplicand?

The most natural way to do it depends on whether you have access to a wider type (and may rely on modular casts to signed integer types, as guaranteed by gcc). For instance, if you want a non-negative result that fits in an int:

  return x * (unsigned long long)(y) <= __INT_MAX__;

or, if you only care about signed overflow and not the sign of the result:

  long long l = x * (long long)(y);
  return l == (int) l;

The value of the overflow flags after a multiplication doesn't seem to be modeled in i386.md currently (apart from "clobbered"), so it won't be used, but the code generated is not too horrible.

(In reply to myself from comment #15)

> (In reply to comment #7)
> > […] built-in operations where you can just say "multiply two
> > (signed) values, check whether the result fits in 31-bit unsigned and set
> > an overflow flag accordingly".
>
> Would be easier to read, easier to maintain, and less difficult to detect.
> Sounds like an overall win. I'm very much in favor of builtins like these.
>
> I'd rather make the overflow indicator the return value, and transfer the
> result to a location indicated via a pointer.

Cross-reference: bug #59708 is going in this direction, motivated by a precedent set by clang.

(In reply to Zack Weinberg from comment #18)

> I just want the code in the original bug report to be optimized.

I suggested duping bug #49467 there instead, in which case this issue here could return to the originally requested goal of detecting idioms without relying on builtins.