This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH GCC]Improve no-overflow check in SCEV using value range info.


On Wed, Jul 20, 2016 at 11:01 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 19, 2016 at 6:15 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Jul 19, 2016 at 1:10 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Mon, Jul 18, 2016 at 6:27 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>> Hi,
>>>> Scalar evolution needs to prove no-overflow for source variable when handling type conversion.  This is important because otherwise we would fail to recognize result of the conversion as SCEV, resulting in missing loop optimizations.  Take case added by this patch as an example, the loop can't be distributed as memset call because address of memory reference is not recognized.  At the moment, we rely on type overflow semantics and loop niter info for no-overflow checking, unfortunately that's not enough.  This patch introduces new method checking no-overflow using value range information.  As commented in the patch, value range can only be used when source operand variable evaluates on every loop iteration, rather than guarded by some conditions.
>>>>
>>>> This together with patch improving loop niter analysis (https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00736.html) can help various loop passes like vectorization.
>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>
>>> @@ -3187,7 +3187,8 @@ idx_infer_loop_bounds (tree base, tree *idx, void *dta)
>>>    /* If access is not executed on every iteration, we must ensure that overlow
>>>       may not make the access valid later.  */
>>>    if (!dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (data->stmt))
>>> -      && scev_probably_wraps_p (initial_condition_in_loop_num (ev, loop->num),
>>> +      && scev_probably_wraps_p (NULL,
>>>
>>> use NULL_TREE for the null pointer constant of tree.
>>>
>>> +  /* Check if VAR evaluates in every loop iteration.  */
>>> +  gimple *def;
>>> +  if ((def = SSA_NAME_DEF_STMT (var)) != NULL
>>>
>>> def is never NULL but it might be a GIMPLE_NOP which has a NULL gimple_bb.
>>> Better check for ! SSA_DEFAULT_DEF_P (var)
>>>
>>> +  if (TREE_CODE (step) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (var)))
>>> +    return false;
>>>
>>> this looks like a cheaper test so please do that first.
>>>
>>> +  step_wi = step;
>>> +  type = TREE_TYPE (var);
>>> +  if (tree_int_cst_sign_bit (step))
>>> +    {
>>> +      diff = lower_bound_in_type (type, type);
>>> +      diff = minv - diff;
>>> +      step_wi = - step_wi;
>>> +    }
>>> +  else
>>> +    {
>>> +      diff = upper_bound_in_type (type, type);
>>> +      diff = diff - maxv;
>>> +    }
>>>
>>> this lacks a comment - it's not obvious to me what the gymnastics
>>> with lower/upper_bound_in_type are supposed to achieve.
>>
>> Thanks for reviewing, I will prepare another version of patch.
>>>
>>> As VRP uses niter analysis itself I wonder how this fires back-to-back between
>> I am not sure if I mis-understood the question.  If the VRP
>> information comes from loop niter, I think it will not change loop
>> niter or VRP2 in back because that's the best information we got in
>> the first place in niter.  If the VRP information comes from other
>> places (guard conditions?)  SCEV and loop niter after vrp1 might be
>> improved and thus VRP2.  There should be no problems in either case,
>> as long as GCC breaks the recursive chain among niter/scev/vrp
>> correctly.
>
> Ok.
>
>>> VRP1 and VRP2?  If the def of var dominates the latch isn't it enough to do
>>> a + 1 to check whether VRP bumped the range up to INT_MAX/MIN?  That is,
>>> why do we need to add step if not for the TYPE_OVERFLOW_UNDEFINED case
>>> of VRP handling the ranges optimistically?
>> Again, please correct me if I mis-understood.  Considering a variable
>> whose type is unsigned int and scev is {0, 4}_loop, the value range
>> could be computed as [0, 0xfffffffc], thus MAX + 1 is smaller than
>> type_MAX, but the scev could be overflow.
>
> Yes.  I was wondering about the case where VRP bumps the range to +INF
> because it gave up during iteration or because overflow behavior is undefined.
> Do I understand correctly that the code is mostly to improve the not
> undefined-overflow case?
Hi Richard,

I think we resolved these on IRC, here are words for the record.
The motivation case is for unsigned type loop counter, while the patch
should work for signed type in theory.  Considering a loop has signed
char counter i and it's used in array_ref[i + 10], since front-end
converts signed char addition into unsigned operation, we may need the
range information to prove (unsigned char)i + 10 doesn't overflow,
thus address of array reference is a scev.  I am not sure if the
signed case can be handled by current code, or there are other
fallouts preventing this patch from working.

>
> Also I was wondering if the range DEF dominates the latch then why
> do we necessarily need to add step to verify overflow?  Can't we do better
> if we for example see that the DEF is the loop header PHI?
I don't think we can, and there is nothing special for loop header PHI
in this case, right?

Attachment is the updated patch with your comments resolved.
Bootstrap and test on x86_64 and AArch64, is it OK?

Thanks,
bin

Attachment: scev-overflow-with-range-info-20160712.txt
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]