This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [Patch, fortran, RFC] PR28105 Overflow check for ALLOCATE statement

From: Janne Blomqvist <blomqvist dot janne at gmail dot com>
To: Tobias Burnus <burnus at net-b dot de>
Cc: Fortran List <fortran at gcc dot gnu dot org>, GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Wed, 24 Nov 2010 14:15:33 +0200
Subject: Re: [Patch, fortran, RFC] PR28105 Overflow check for ALLOCATE statement
References: <AANLkTimWJMty0jEWVc8d3wZvmfw=aXxXY3tFCNa7tEaz@mail.gmail.com> <4CEBE2D8.5000306@net-b.de>

On Tue, Nov 23, 2010 at 17:50, Tobias Burnus <burnus@net-b.de> wrote:
> On 11/22/2010 11:37 PM, Janne Blomqvist wrote:
>>
>> the attached patch implements overflow checking for the ALLOCATE
>> statement. It also fixes the minor regression introduced a few days
>> ago when the old "< Â0" overflow check was removed as a side-effect of
>> fixing the unsignedness of size_t.
>
> Frankly, I do not like the way the overflow check is done - namely by
> requiring an overflow:
>
> Â Â Â ÂD.1506 = MAX_EXPR <my_size, 0>;
> Â Â Â ÂD.1507 = D.1506 != 0 ? 2147483647 / D.1506 <= 0 ? 1 : 0 : 0;
> Â Â Â ÂD.1508 = NON_LVALUE_EXPR <D.1506>;
> Â Â Â ÂD.1509 = ((<unnamed-unsigned:32>) D.1508 > -1 ? 1 : 0) + D.1507;
> Â Â Â Âoverflow.1 = D.1509;
> Â Â Â Âif (overflow.1 != 0)
> Â Â Â Â Â Â_gfortran_runtime_error (&"Integer overflow when calculating the
> amount of memory to allocate"[1]{lb: 1 sz: 1});
>
> Namely, doing the "MAXSIZE/my_size <= 0" check where my_size is the signed
> requested size of the array. Explicitly asking for an overflow seems to be
> wrong.

I suspect you are confused; I wondered the same thing when I was
developing the patch, and as far as I was able to determine, the issue
is that the tree dump is slightly different (although the result is
the same) than the actual code that the frontend generates. So, what
gfc_array_init_size does is roughly

// overflow = 0 is set in gfc_array_allocate() before calling
gfc_array_init_size().
stride = 1;
for (int i = 0; i < rank; i++)
{
    size = gfc_conv_array_extent_dim (...);  /* max(ubound + 1 - lbound, 0) */
    overflow += size != 0 ? (MAX/size < stride ? 1: 0): 0;
    stride *= size;
}

So, somehow for the D.1507 statement, the overflow check for the first
dimension which is then "MAX/size < 1" has been turned into "MAX/size
<= 0" which is of course equivalent but is not what the code generated
by the patch actually does! But for higher dimensions you'll see that
the tree dump corresponds to the pseudocode above.

> Additionally, I have no idea what the supposed result is; MAXSIZE (=
> "TYPE_MAX_VALUE (sizetype)") is unsigned and does not fit into a signed
> variable, however, the calculation is supposed to be done signed as both the
> D.1506 and the D.1507 is signed.

Ah, you're right. This should be fixed. Come to think of it, one might
as well instead convert size and stride to size_t right away and do
the overflow check and multiplication in that type, since in the end
it has to be converted to size_t anyway in order to call malloc()
(also, I think I read in the AMD64 software optimization manual that
unsigned divides are slightly faster than signed). In
gfc_conv_array_extent_dim() the calculation must still be done with
signed arithmetic since there the result might be negative, but that
function makes sure to return only a positive value, so size and
stride are guaranteed to be always positive in gfc_array_init_size().

> Additionally, you completely ignore one important ingredient: The size of
> the actual object which is to be allocated; using derived types those can be
> large.

No, AFAICS this is included in the patch. After the loop above, the
pseudocode continues as

tmp = TYPE_SIZE_UNIT (gfc_get_element_type (type));
element_size = fold_convert (sizetype, tmp);
stride = fold_convert (sizetype, stride);
overflow += size != 0 ? (MAX/element_size < stride ? 1: 0): 0;

I believe this check corresponds to the line beginning with D.1509 in
your tree dump above.

> For the conditions, one should consider to use __builtin_expect to give the
> compiler some additional optimization potential.

Good point, I'll have to investigate how to do this.

>> Wrt whether these checks should be done always, or only when
>> -fcheck=mem is enabled, IMHO some benchmarks should be run and if no
>> difference can be seen they can be left enabled. So, does anybody know
>> of suitable benchmarks that are influenced by ALLOCATE performance?
>> Polyhedron, maybe not so much?
>
> I don't. The issue is presumably largest for calls to small functions called
> in loops, which get later inlined. The less conditions and other trickery is
> done, the more likely the compiler can move the malloc out of the loop. I
> think it is trivial to construct such a program but it is hard to know how
> representative such a program would be for the real world. The probability
> has to be weighted against using a too large value for ALLOCATE without
> -check=all/-check=mem. I am inclined to make it conditional to -check=mem;
> but on the other hand, I live in a 64bit world where the chance of an
> overflow is rather low (for somewhat reasonable programs).

I'll see if I can come up with some benchmarks.

Thanks for the review.

-- 
Janne Blomqvist

Follow-Ups:
- Re: [Patch, fortran, RFC] PR28105 Overflow check for ALLOCATE statement
  - From: Janne Blomqvist

References:
- [Patch, fortran, RFC] PR28105 Overflow check for ALLOCATE statement
  - From: Janne Blomqvist
- Re: [Patch, fortran, RFC] PR28105 Overflow check for ALLOCATE statement
  - From: Tobias Burnus

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]