This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Allow non-overflow ops in vect_is_simple_reduction_1
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Tom de Vries <Tom_deVries at mentor dot com>
- Cc: "gcc-patches at gnu dot org" <gcc-patches at gnu dot org>
- Date: Fri, 31 Jul 2015 12:22:38 +0200
- Subject: Re: [PATCH] Allow non-overflow ops in vect_is_simple_reduction_1
- Authentication-results: sourceware.org; auth=none
- References: <55B24E1F dot 1070705 at mentor dot com> <CAFiYyc1-F5cx760mQFr6D1gByhz2n0cpfLMdNGkMfU-i=DN0LA at mail dot gmail dot com> <55B770A1 dot 6010308 at mentor dot com> <CAFiYyc3ahG0gzkfXxdjig-V2SJijV_UPaQJd0hORv5i93qSi4A at mail dot gmail dot com> <55B8B758 dot 8010902 at mentor dot com> <CAFiYyc3hHBAFQ5bzh1ztAfp32wef9mM7FWpJEQ35_g3PUGuveg at mail dot gmail dot com> <55B90AD9 dot 9040707 at mentor dot com>
On Wed, Jul 29, 2015 at 7:18 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 29/07/15 14:00, Richard Biener wrote:
>>
>> On Wed, Jul 29, 2015 at 1:22 PM, Tom de Vries <Tom_deVries@mentor.com>
>> wrote:
>>>
>>> On 29/07/15 10:09, Richard Biener wrote:
>>>>
>>>>
>>>> On Tue, Jul 28, 2015 at 2:08 PM, Tom de Vries <Tom_deVries@mentor.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> On 28/07/15 09:59, Richard Biener wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 24, 2015 at 4:39 PM, Tom de Vries <Tom_deVries@mentor.com>
>>>>>>
>>>>>>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> this patch allows parallelization and vectorization of reduction
>>>>>>> operators
>>>>>>> that are guaranteed to not overflow (such as min and max operators),
>>>>>>> independent of the overflow behaviour of the type.
>>>>>>>
>>>>>>> Bootstrapped and reg-tested on x86_64.
>>>>>>>
>>>>>>> OK for trunk?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hmm, I don't like that no_overflow_tree_code function. We have a much
>>>>>> more
>>>>>> clear understanding which codes may overflow or trap. Thus please add
>>>>>> a operation specific variant of TYPE_OVERFLOW_{TRAPS,WRAPS,UNDEFINED}
>>>>>> like
>>>>>>
>>>>>
>>>>> Done.
>>>>>
>>>>>> bool
>>>>>> operation_overflow_traps (tree type, enum tree_code code)
>>>>>> {
>>>>>> if (!ANY_INTEGRAL_TYPE_P (type)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I've changed this test into a gcc_checking_assert.
>>>>>
>>>>>
>>>>>> || !TYPE_OVERFLOW_TRAPS (type))
>>>>>> return false;
>>>>>> switch (code)
>>>>>> {
>>>>>> case PLUS_EXPR:
>>>>>> case MINUS_EXPR:
>>>>>> case MULT_EXPR:
>>>>>> case LSHIFT_EXPR:
>>>>>> /* Can overflow in various ways */
>>>>>> case TRUNC_DIV_EXPR:
>>>>>> case EXACT_DIV_EXPR:
>>>>>> case FLOOR_DIV_EXPR:
>>>>>> case CEIL_DIV_EXPR:
>>>>>> /* For INT_MIN / -1 */
>>>>>> case NEGATE_EXPR:
>>>>>> case ABS_EXPR:
>>>>>> /* For -INT_MIN */
>>>>>> return true;
>>>>>> default:
>>>>>> return false;
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> and similar variants for _wraps and _undefined. I think we decided at
>>>>>> some point
>>>>>> the compiler should not take advantage of the fact that lshift or
>>>>>> *_div have undefined
>>>>>> behavior on signed integer overflow, similar we only take advantage of
>>>>>> integral-type
>>>>>> overflow behavior, not vector or complex. So we could reduce the
>>>>>> number of cases
>>>>>> the functions return true if we document that it returns true only for
>>>>>> the cases where
>>>>>> the compiler needs to / may assume wrapping behavior does not take
>>>>>> place.
>>>>>> As for _traps for example we only have optabs and libfuncs for
>>>>>> plus,minus,mult,negate
>>>>>> and abs.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I've tried to capture all of this in the three new functions:
>>>>> - operation_overflows_and_traps
>>>>> - operation_no_overflow_or_wraps
>>>>> - operation_overflows_and_undefined (unused atm)
>>>>>
>>>>> I've also added the graphite bit.
>>>>>
>>>>> OK for trunk, if bootstrap and reg-test succeeds?
>>>>
>>>>
>>>>
>>>> +/* Returns true if CODE operating on operands of type TYPE can
>>>> overflow,
>>>> and
>>>> + fwrapv generates trapping insns for CODE. */
>>>>
>>>> ftrapv
>>>>
>>>
>>> Done.
>>>
>>>> +bool
>>>> +operation_overflows_and_traps (tree type, enum tree_code code)
>>>> +{
>>>>
>>>> operation_overflow_traps
>>>>
>>>> is better wording. Meaning that when the operation overflows then it
>>>> traps.
>>>>
>>>
>>> AFAIU, the purpose of the function is to enable optimizations when it
>>> returns false, that is:
>>> - if the operation doesn't overflow, or
>>> - if the operation overflows, but doesn't trap.
>>>
>>> The name operation_overflow_traps does not make clear what it returns
>>> when
>>> the operation doesn't overflow. If the name doesn't make it clear, you
>>> need
>>> to be conservative, that is, return true. Which defies the purpose of the
>>> function.
>>>
>>> I've changed the name to operation_no_trapping_overflow (and inverted
>>> logic
>>> in the function).
>>>
>>> But perhaps you want operation_overflow_traps with a conservative return
>>> for
>>> non-overflow operations, and use it like this:
>>> ...
>>> else if (INTEGRAL_TYPE_P (type) && check_reduction)
>>> {
>>> if (operation_overflows (type, code)
>>> && operation_overflow_traps (type, code))
>>> {
>>> /* Changing the order of operations changes the semantics. */
>>> ...
>>> ?
>>
>>
>> I think operation_no_trapping_overflow has the same wording issue as
>> operation_overflow_traps but I'm not a native speaker
>
>
> Hmm, I'm also not a native speaker. I think I understand what you mean, if
> operation_no_trapping_overflow is read with stress on 'trapping', we have
> the same ambiguity issue.
>
> [ Possibility: a more verbose variant, but I hope no ambiguity:
> operation_can_overflow_and_trap ? ]
I'd say we choose a small name and simply refer to documentation.
>
>> so I'll take your
>>
>> word that operation_no_trapping_overflow is non-ambiguous iff the
>> operation cannot overflow.
>>
>> And no, I didn't mean to use it in combination with operation_overflows.
>>
>>>> + /* We don't take advantage of integral type overflow behaviour for
>>>> complex and
>>>> + vector types. */
>>>>
>>>> We don't generate instructions that trap on overflow for complex or
>>>> vector
>>>> types
>>>>
>>>
>>> Done.
>>>
>>>> + if (!INTEGRAL_TYPE_P (type))
>>>> + return true;
>>>>
>>>> + switch (code)
>>>> + {
>>>> + case PLUS_EXPR:
>>>> + case MINUS_EXPR:
>>>> + case MULT_EXPR:
>>>> + case LSHIFT_EXPR:
>>>> + /* Can overflow in various ways. */
>>>>
>>>> we don't have a trapping optab for lshift
>>>>
>>>> + case TRUNC_DIV_EXPR:
>>>> + case EXACT_DIV_EXPR:
>>>> + case FLOOR_DIV_EXPR:
>>>> + case CEIL_DIV_EXPR:
>>>>
>>>> nor division. See optabs.c:optab_for_tree_code. I suggest to only
>>>> return
>>>> true
>>>> for those.
>>>>
>>>
>>> Before the logic inversion, we return false for these (And also for
>>> operators that do not overflow).
>>>
>>>> +/* Returns true if CODE operating on operands of type TYPE cannot
>>>> overflow, or
>>>> + wraps on overflow. */
>>>> +
>>>> +bool
>>>> +operation_no_overflow_or_wraps (tree type, enum tree_code code)
>>>> +{
>>>> + gcc_checking_assert (ANY_INTEGRAL_TYPE_P (type));
>>>>
>>>> operation_overflow_wraps ()
>>>>
>>>> is still my preferred name.
>>>>
>>>
>>> The name operation_overflow_wraps doesn't make clear what it returns if
>>> the
>>> operation doesn't overflow. And I didn't manage to come up with a better
>>> name sofar.
>>>
>>> Btw, I wonder about something like vector max operation. The current
>>> implementation of operation_no_overflow_or_wraps returns false. Could we
>>> do:
>>> ...
>>> /* We don't take advantage of integral type overflow behaviour for
>>> complex and vector types. */
>>> if (!INTEGRAL_TYPE_P (type))
>>> return !operation_overflows (type, code);
>>> ...
>>> ?
>>
>>
>> Yes, we can use operation_overflows and the existing overflow macros as
>> well:
>>
>> if (!INTEGRAL_TYPE_P (type)
>> || TYPE_OVERFLOW_WRAPS (type)
>> || !operation_overflows (type, code))
>>
>
> For an unsigned vector add, this would evaluate to true.
Yes, so?
> You remarked "similar we only take advantage of integral-type overflow
> behavior, not vector or complex". Was that remark then limited to exploiting
> undefined behaviour?
Yes, but it's also not documented anywhere.
>> and get rid of operation_overflow_{wraps,undefined}
>
>
> That's not what I meant, but ok.
>
>> unless we want to take
>>
>> advantage of the cases the compiler doesn't take advantage of the overflow
>> behavior.
>
>
> I thought the purpose of the functions was to specify in one location and
> clearly documented the situations that the compiler is allowed to take
> advantage of the overflow behavior.
Yeah, but we're iterating over names here and that's not good progress ;)
We can always followup with that.
>> I think keeping the traps variant separate makes sense because
>> of the clear facts on what trapping optabs we implement.
>>
>>>> +bool
>>>> +operation_overflow_and_undefined (tree type, enum tree_code code)
>>>> +{
>>>> + gcc_checking_assert (ANY_INTEGRAL_TYPE_P (type));
>>>> +
>>>>
>>>> operation_overflow_undefined ()
>>>>
>>>
>>> The name operation_overflow_undefined doesn't make clear what it returns
>>> if
>>> the operation doesn't overflow. I've changed it into
>>> operation_undefined_overflow.
>>>
>>>> If you like to keep an explicit operation_can_overflow then there is the
>>>> opportunity to split out the switch statement from
>>>> operation_overflow_wraps
>>>> and operation_overflow_undefined.
>>>>
>>
>> Why 'operation_overflows' ... it's operation_can_overflow, it clearly
>> doesn't
>> always overflow...
>
>
> Done.
>
>> Also it doesn't need its type argument (the assert
>> doesn't make much sense, for example fixed-point types can overflow
>> as well, likewise real types).
>
>
> Done.
>
>>
>> I'm fine with operation_no_trapping_overflow as in the patch, likewise
>> with operation_overflows apart from its name.
>>
>> operation_no_overflow_or_wraps can be replaced by
>> ANY_INTEGRAL_TYPE_P ()
>> && (TYPE_OVERFLOW_WRAPS () || !operation_can_overflows (code))
>>
>> conservatively operation_undefined_overflow could be treated the same
>> and I'd prefer to do it that way for now.
>>
>
> Dropped wrap/undefined functions.
>
> OK like this?
Ok.
Thanks,
Richard.
> Thanks,
> - Tom