This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Detect most integer overflows.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Hannes Frederic Sowa <hannes at stressinduktion dot org>, "gcc at gnu dot org" <gcc at gnu dot org>
- Date: Wed, 30 Oct 2013 09:34:13 +0100
- Subject: Re: [RFC] Detect most integer overflows.
- Authentication-results: sourceware.org; auth=none
- References: <20131026192912 dot GA25428 at domone dot podge> <20131026235014 dot GF18009 at order dot stressinduktion dot org> <CAFiYyc0+wTbE1FwwLscquWvoEtM6JQw4p5qhnhBmGtVCMkx9fQ at mail dot gmail dot com>
On Tue, Oct 29, 2013 at 10:41:56AM +0100, Richard Biener wrote:
> On Sun, Oct 27, 2013 at 1:50 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
> > On Sat, Oct 26, 2013 at 09:29:12PM +0200, Ondřej Bílka wrote:
> >> Hi, as I brainstormed how to prevent possible overflows in memory allocation I
> >> came up with a heretical idea:
> >>
> >> For gcc -D_FORTIFY_SOURCE=2 we expand all multiplication with size_t
> >> type by one that checks for integer overflow and aborts on it. This
> >> would prevent most overflow at cost of breaking some legitimate
> >> applications that use multiplication in clever way.
> >>
> >> A less heretic way that is applicable for C++ would be write a class
> >> size_t overflow that would do arithmetic in saturating way and issue
> >> warnings when there is a size_t multiplication.
> >
> > I am afraid of the false-positive aborts which could result in DoS against
> > applications. I like the checked arithmetic builtins LLVM introduced in
> > 3.4 (not yet released) where one can test for overflow manually and handle
> > the overflows appropriately. They also generate better code (e.g. they
> > use the overflow flag and get inlined on x86 compared to the ftrapv insn).
> >
> > So I would vote for fast checked arithmetic builtins first.
>
> For reference those
> (http://clang.llvm.org/docs/LanguageExtensions.html) look like
>
> if (__builtin_umul_overflow(x, y, &result))
> return kErrorCodeHackers;
>
> which should be reasonably easy to support in GCC (if you factor out
> generating best code and just aim at compatibility). Code-generation
> will be somewhat pessimized by providing the multiplication result
> via memory, but that's an implementation detail.
>
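To make the quoted call concrete, here is a minimal sketch of how such a builtin might be wrapped. The helper name checked_umul is hypothetical; __builtin_umul_overflow is the Clang builtin quoted above, which takes unsigned int operands and returns true when the product does not fit.

```cpp
#include <climits>

// Hypothetical helper: multiply two unsigned ints and report overflow
// through the return value instead of wrapping silently.
// Returns true on success and stores the product in *out.
bool checked_umul(unsigned x, unsigned y, unsigned *out) {
    // The builtin returns true when the product does not fit in *out.
    return !__builtin_umul_overflow(x, y, out);
}
```

A caller would test the return value and take its own error path on failure, as in the quoted snippet.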
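The reason for adding builtins is performance. Without that requirement
one can write a simple template to check overflows generically, like

  #include <limits>
  #include <stdexcept>

  template <class C> class overflow {
  public:
    C val;
    overflow (C v = 0) : val (v) {}
    overflow <C> operator + (const overflow <C> &y) const {
      /* Compare against the limit before adding, so that the check
         itself cannot overflow.  */
      if (val > 0 && y.val > 0
          && val > std::numeric_limits <C>::max () - y.val)
        throw std::overflow_error ("overflow in addition");
      /* ... */
      overflow <C> ret;
      ret.val = val + y.val;
      return ret;
    }
    /* ... */
  };

and use it as

  overflow <int> x = 3, y = 4;
  x = x * x - 3 * y + 9;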
> LLVM covers addition, subtraction and multiply on signed and unsigned
> int, long and long long types. Not sure why they offer anything for
> unsigned - possibly for size_t arithmetic and security concerns with
> malloc? For practicability and to be less error-prone I'd have done
> the builtins in a type-generic way (like tgmath) as using the
> wrong typed builtin can lead both to undetected overflow, unwanted
> truncation of arguments and possibly memory overflow of 'result'
> (if you ignore warnings about incompatible pointer types).
>
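As a sketch of what a type-generic interface looks like in practice: GCC 5 and later, and newer Clang releases, provide __builtin_add_overflow, __builtin_sub_overflow and __builtin_mul_overflow, which deduce the operand and result types from their arguments. The helper name alloc_array below is hypothetical and only illustrates the malloc-size use case.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocate room for n elements of elem_size bytes,
// returning a null pointer if n * elem_size does not fit in size_t.
void *alloc_array(std::size_t n, std::size_t elem_size) {
    std::size_t bytes;
    // Type-generic builtin: returns true if the product overflowed.
    if (__builtin_mul_overflow(n, elem_size, &bytes))
        return nullptr;
    return std::malloc(bytes);
}
```

Because the result type is taken from the pointer argument, there is no separately named builtin to mismatch with the operand types.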
Unsigned size_t calculations were the point of the original mail.
For size calculations and the like you do not need most of the complexity
of the general case. Most of the time you just deal with expressions
consisting of additions and multiplications of positive terms.
After the calculation you bound the result by some constant (explicitly,
or implicitly, e.g. by calling malloc, which fails for large sizes).
If you add a type that sets the result to SIZE_MAX on overflow, the
result will stay SIZE_MAX and trigger the bound check.
No extra handling is needed there. A problem starts when you need to use
subtraction, but in a lot of cases this can be avoided by calculating
differences first, so that you get a positive number.
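A minimal sketch of such a saturating type, under stated assumptions: the names sat_size and SAT_MAX are made up for illustration, only addition and multiplication are covered, and a saturated operand is treated as sticky so that the result stays at the maximum even when multiplied by zero.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical saturating wrapper around size_t: once a result overflows
// it is pinned at the maximum value and stays there.
struct sat_size {
    std::size_t val;
    sat_size(std::size_t v = 0) : val(v) {}
};

const std::size_t SAT_MAX = std::numeric_limits<std::size_t>::max();

inline sat_size operator+(sat_size a, sat_size b) {
    // a + b overflows exactly when b exceeds the remaining headroom.
    if (b.val > SAT_MAX - a.val)
        return sat_size(SAT_MAX);
    return sat_size(a.val + b.val);
}

inline sat_size operator*(sat_size a, sat_size b) {
    // Design choice for this sketch: a saturated operand stays saturated.
    if (a.val == SAT_MAX || b.val == SAT_MAX)
        return sat_size(SAT_MAX);
    // a * b overflows exactly when b exceeds SAT_MAX / a (for a != 0).
    if (a.val != 0 && b.val > SAT_MAX / a.val)
        return sat_size(SAT_MAX);
    return sat_size(a.val * b.val);
}
```

A size computed as sat_size(n) * elem_size + header can then be passed to malloc as its .val member; if any step overflowed, the request is for the maximum size_t value and the allocation fails instead of returning a too-small buffer.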
> For a "quick" GCC implementation of the builtins you could expand
> them to a open-coded sequence during gimplification. But due to
> the issues pointed out above I'm not sure it is the best interface
> to support (though now the names are taken).
>
> Richard.
>
> > Greetings,
> >
> > Hannes
> >
--
manager in the cable duct