This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Detect most integer overflows.
- From: Geert Bosch <bosch at adacore dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Hannes Frederic Sowa <hannes at stressinduktion dot org>, OndÅej BÃlka <neleai at seznam dot cz>, "gcc at gnu dot org" <gcc at gnu dot org>
- Date: Fri, 8 Nov 2013 20:31:38 -0500
- Subject: Re: [RFC] Detect most integer overflows.
- Authentication-results: sourceware.org; auth=none
- References: <20131026192912 dot GA25428 at domone dot podge> <20131026235014 dot GF18009 at order dot stressinduktion dot org> <CAFiYyc0+wTbE1FwwLscquWvoEtM6JQw4p5qhnhBmGtVCMkx9fQ at mail dot gmail dot com>
On Oct 29, 2013, at 05:41, Richard Biener <richard.guenther@gmail.com> wrote:
> For reference those
> (http://clang.llvm.org/docs/LanguageExtensions.html) look like
>
> if (__builtin_umul_overflow(x, y, &result))
> return kErrorCodeHackers;
>
> which should be reasonably easy to support in GCC (if you factor out
> generating best code and just aim at compatibility). Code-generation
> will be somewhat pessimized by providing the multiplication result
> via memory, but that's an implementation detail.
I've done the overflow checking in Gigi (Ada front end). Benchmarking
real world large Ada programs (where every integer operation is checked,
including array index computations etc.), I found the performance cost
*very* small (less than 1% on typical code, never more than 2%). There
is a bit more cost in code size, but that is mostly due to the fact that
we want to generate error messages with correct line number information
without relying on backtraces.
The rest of the run time checks in Ada (especially index checks and range
checks) were far more costly (more on the order of 10-15%, but very
variable depending on code style).
A few things helped to make the cost small: the biggest one is that
typically on of the operands is known to be negative or positive.
Gigi will use Ada type information, and Natural or Positive integer
variables are very common. So, if you compute X + C with C positive,
you can write the conditional expression as:
(if X < Integer'Last - C then X + C else raise Constraint_Error)
On my x86-64 this generates something like:
__ada_add:
00 cmpl $0x7fffffff,%edi
06 je 0x0000000c
08 leal 0x01(%rdi),%eax
0b ret
0c leaq 0x0000000d(%rip),%rdi
13 pushq %rax
14 movl $0x00000003,%esi
19 xorl %eax,%eax
1b callq ___gnat_rcheck_CE_Overflow_Check
While this may look like a lot, these operations are expanded
inline, and only the first three are on the normal execution
path. As the exception raise is a No_Return subprogram, it will
be moved to the end of the file. The jumps will both statically
and dynamically be treated as not-taken, and have very little cost.
Additionally, the comparison is visible for the optimizers, in
effect giving more value range information which can be used
for optimizing away further checks. The drawback of using any
"special" new operations is that we loose that aspect.
For the less common case in which neither operand has a known
sign, widening to 64-bits is the straightforward solution. For Ada,
we have a mode where we do this kind of widening for entire
expressions, so we only have to check on the final assignment.
The semantics here are that you'd get the mathematically correct
result, even if there was an intermediate overflow. The drawback
of this approach is that an overflow check may not fail, but
that suppressing the checks removes the widening and causes
wrong answers.
-Geert