[PATCH PR85190]Adjust pointer for aligned access

Bin.Cheng amker.cheng@gmail.com
Tue Apr 10 17:27:00 GMT 2018


On Tue, Apr 10, 2018 at 5:28 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Apr 10, 2018 at 3:58 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Apr 10, 2018 at 2:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> On Tue, Apr 10, 2018 at 09:55:35AM +0000, Bin Cheng wrote:
>>>> Hi Rainer, could you please help me double check that this solves the issue?
>>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> gcc/testsuite
>>>> 2018-04-10  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>       PR testsuite/85190
>>>>       * gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.
>>>
>>>> diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> index 46d7a9e..15320ae 100644
>>>> --- a/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> +++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> @@ -4,14 +4,14 @@
>>>>
>>>>  void f(short*p){
>>>>    p=(short*)__builtin_assume_aligned(p,64);
>>>> -  short*q=p+256;
>>>> +  short*q=p+255;
>>>>    for(;p!=q;++p,--q){
>>>>      short t=*p;*p=*q;*q=t;
>>>
>>> This is UB then though, because p will never be equal to q.
>
> Hmm, though it's UB in this case, is it OK for niter analysis gives
> below results?
>
> Analyzing # of iterations of loop 1
>   exit condition [126, + , 18446744073709551615] != 0
>   bounds on difference of bases: -126 ... -126
>   result:
>     # of iterations 126, bounded by 126
>
> I don't really follow last piece of code in number_of_iterations_ne:
>
>   /* Let nsd (step, size of mode) = d.  If d does not divide c, the loop
>      is infinite.  Otherwise, the number of iterations is
>      (inverse(s/d) * (c/d)) mod (size of mode/d).  */
>   bits = num_ending_zeros (s);
>   bound = build_low_bits_mask (niter_type,
>                    (TYPE_PRECISION (niter_type)
>                 - tree_to_uhwi (bits)));
>
>   d = fold_binary_to_constant (LSHIFT_EXPR, niter_type,
>                    build_int_cst (niter_type, 1), bits);
>   s = fold_binary_to_constant (RSHIFT_EXPR, niter_type, s, bits);
>
>   if (!exit_must_be_taken)
>     {
>       /* If we cannot assume that the exit is taken eventually, record the
>      assumptions for divisibility of c.  */
>       assumption = fold_build2 (FLOOR_MOD_EXPR, niter_type, c, d);
>       assumption = fold_build2 (EQ_EXPR, boolean_type_node,
>                 assumption, build_int_cst (niter_type, 0));
>       if (!integer_nonzerop (assumption))
>     niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
>                       niter->assumptions, assumption);
>     }
>
>   c = fold_build2 (EXACT_DIV_EXPR, niter_type, c, d);
>   tmp = fold_build2 (MULT_EXPR, niter_type, c, inverse (s, bound));
>   niter->niter = fold_build2 (BIT_AND_EXPR, niter_type, tmp, bound);
>   return true;
>
> Though infinite niters is mentioned, I don't see it's handled?

So the default behavior we have for long time is to completely unroll
below loop:

void f(short*p){
  p=(short*)__builtin_assume_aligned(p,64);
  short*q=p+5;
  for(;p!=q;++p,--q){
    short t=*p;*p=*q;*q=t;
  }
}

because:

Analyzing # of iterations of loop 1
  exit condition [p_6, + , 4](no_overflow) != p_6 + 10
  bounds on difference of bases: 10 ... 10
  result:
    # of iterations 2, bounded by 2

Thanks,
bin
>
> Thanks,
> bin
>> Sorry I already checked in, will try to correct it in another patch.
>>
>> Thanks,
>> bin
>>>
>>>>    }
>>>>  }
>>>>  void b(short*p){
>>>>    p=(short*)__builtin_assume_aligned(p,64);
>>>> -  short*q=p+256;
>>>> +  short*q=p+255;
>>>>    for(;p<q;++p,--q){
>>>>      short t=*p;*p=*q;*q=t;
>>>
>>> This one is fine, sure.
>>>
>>>         Jakub



More information about the Gcc-patches mailing list