This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Use SCEV information when aligning for vectorisation (PR 84005)


Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Mar 19, 2018 at 10:27 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Sat, Mar 17, 2018 at 11:45 AM, Richard Sandiford
>>> <richard.sandiford@linaro.org> wrote:
>>>> Index: gcc/tree-data-ref.c
>>>> ===================================================================
>>>> --- gcc/tree-data-ref.c 2018-02-14 13:14:36.268006810 +0000
>>>> +++ gcc/tree-data-ref.c 2018-03-17 10:43:42.544669549 +0000
>>>> @@ -5200,6 +5200,75 @@ dr_alignment (innermost_loop_behavior *d
>>>>    return alignment;
>>>>  }
>>>>
>>>> +/* If BASE is a pointer-typed SSA name, try to find the object that it
>>>> +   is based on.  Return this object X on success and store the alignment
>>>> +   in bytes of BASE - &X in *ALIGNMENT_OUT.  */
>>>> +
>>>> +static tree
>>>> +get_base_for_alignment_1 (tree base, unsigned int *alignment_out)
>>>> +{
>>>> +  if (TREE_CODE (base) != SSA_NAME || !POINTER_TYPE_P (TREE_TYPE (base)))
>>>> +    return NULL_TREE;
>>>> +
>>>> +  gimple *def = SSA_NAME_DEF_STMT (base);
>>>> +  base = analyze_scalar_evolution (loop_containing_stmt (def), base);
>>>
>>> I think it is an error to use the def stmt of base here.  Instead you
>>> need to pass down the point/loop of the use.  For the same reason you
>>> also want to instantiate parameters after analyzing the evolution.
>>>
>>> In the end, if the BB to be vectorized is contained in a loop nest we want to
>>> instantiate up to the point where (eventually) a DECL we can re-align
>>> appears,
>>> but using the containing loop for now looks OK.
>>
>> Why's that necessary/better though?  We're not interested in the
>> evolution of the value at the point it*s used by the potential vector
>> code, but in how we get from the ultimate base (if there is one) to the
>> definition of the SSA name.
>
> Hmm, ok.
>
>> I don't think instantiating the SCEV would give any new information,
>> but it could lose some.  E.g. if we have:
>>
>>   x = &foo;
>>   do
>>     x += 8;
>>   while (*y++ < 10)
>>   ...potential vector use of *x...
>>
>> we wouldn't be able to express the address of x after the do-while
>> loop, because there's nothing that counts the number of iterations.
>> But the uninstantiated SCEV could tell us that x = &foo + N * 8 for
>> some (unknown) N.
>
> Not sure that it works that way.  I'm not 100% sure what kind of
> parameters are left symbolic, and if analysis loop and instantiation
> "loop" are the same I guess it doesn't make any difference...
>
>> (OK, so that doesn't actually work unless we take the effort
>> to look through loop-closed SSA form, but still...)
>>
>>>> +  /* Peel chrecs and record the minimum alignment preserved by
>>>> +     all steps.  */
>>>> +  unsigned int alignment = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT;
>>>> +  while (TREE_CODE (base) == POLYNOMIAL_CHREC)
>>>> +    {
>>>> + unsigned int step_alignment = highest_pow2_factor (CHREC_RIGHT
>>>> (base));
>>>> +      alignment = MIN (alignment, step_alignment);
>>>> +      base = CHREC_LEFT (base);
>>>> +    }
>>>> +
>>>> +  /* Punt if the expression is too complicated to handle.  */
>>>> + if (tree_contains_chrecs (base, NULL) || !POINTER_TYPE_P (TREE_TYPE
>>>> (base)))
>>>> +    return NULL_TREE;
>>>> +
>>>> +  /* Analyze the base to which the steps we peeled were applied.  */
>>>> +  poly_int64 bitsize, bitpos, bytepos;
>>>> +  machine_mode mode;
>>>> +  int unsignedp, reversep, volatilep;
>>>> +  tree offset;
>>>> +  base = get_inner_reference (build_fold_indirect_ref (base),
>>>
>>> ick :/
>>>
>>> what happens if you simply use get_pointer_alignment here and
>>> strip down POINTER_PLUS_EXPRs to the ultimate LHS?  (basically
>>> what get_pointer_alignment_1 does)  After all get_base_for_alignment_1
>>> itself only deals with plain SSA_NAMEs, not, say, &a_1->b.c or &MEM[a_1 + 4].
>>
>> Yeah, but those have (hopefully) been handled by dr_analyze_innermost
>> already, which should have stripped off both the constant and variable
>> parts of the offset.  So I think SSA names are the only interesting
>> input.  The problem is that once we follow the definitions of an SSA
>> name through CHREC_LEFTs, we get a general address again.
>>
>> get_pointer_alignment and get_pointer_alignment_1 don't do what we want
>> because they take the alignment of the existing object into account,
>> whereas here we explicitly want to ignore that and only calculate the
>> alignment of the offset.
>
> Ah, indeed - I missed that fact.
>
>> I guess we could pass a flag down to ignore the current alignment though?
>
> But we need the base anyway...  so, given we can only ever re-align decls
> and never plain pointers instead of using build_fold_indirect_ref do
>
>  if (TREE_CODE (base) != ADDR_EXPR)
>    return NULL_TREE;
>
> else use TREE_OPERAND (base, 0)?

The build_fold_indirect_ref also helps for POINTER_PLUS_EXPR,
which is what we get for things like the do-while above (e.g.
{ &a + 64, +, 64 }_n if x points to a double *.)

I guess everything we care about would be handled by fold_indirect_ref_1
though, so would that be OK instead?

Thanks,
Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]