This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Use SCEV information when aligning for vectorisation (PR 84005)


On Tue, Mar 20, 2018 at 4:26 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Mon, Mar 19, 2018 at 10:27 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On Sat, Mar 17, 2018 at 11:45 AM, Richard Sandiford
>>>> <richard.sandiford@linaro.org> wrote:
>>>>> Index: gcc/tree-data-ref.c
>>>>> ===================================================================
>>>>> --- gcc/tree-data-ref.c 2018-02-14 13:14:36.268006810 +0000
>>>>> +++ gcc/tree-data-ref.c 2018-03-17 10:43:42.544669549 +0000
>>>>> @@ -5200,6 +5200,75 @@ dr_alignment (innermost_loop_behavior *d
>>>>>    return alignment;
>>>>>  }
>>>>>
>>>>> +/* If BASE is a pointer-typed SSA name, try to find the object that it
>>>>> +   is based on.  Return this object X on success and store the alignment
>>>>> +   in bytes of BASE - &X in *ALIGNMENT_OUT.  */
>>>>> +
>>>>> +static tree
>>>>> +get_base_for_alignment_1 (tree base, unsigned int *alignment_out)
>>>>> +{
>>>>> +  if (TREE_CODE (base) != SSA_NAME || !POINTER_TYPE_P (TREE_TYPE (base)))
>>>>> +    return NULL_TREE;
>>>>> +
>>>>> +  gimple *def = SSA_NAME_DEF_STMT (base);
>>>>> +  base = analyze_scalar_evolution (loop_containing_stmt (def), base);
>>>>
>>>> I think it is an error to use the def stmt of base here.  Instead you
>>>> need to pass down the point/loop of the use.  For the same reason you
>>>> also want to instantiate parameters after analyzing the evolution.
>>>>
>>>> In the end, if the BB to be vectorized is contained in a loop nest we want to
>>>> instantiate up to the point where (eventually) a DECL we can re-align
>>>> appears,
>>>> but using the containing loop for now looks OK.
>>>
>>> Why's that necessary/better though?  We're not interested in the
>>> evolution of the value at the point it*s used by the potential vector
>>> code, but in how we get from the ultimate base (if there is one) to the
>>> definition of the SSA name.
>>
>> Hmm, ok.
>>
>>> I don't think instantiating the SCEV would give any new information,
>>> but it could lose some.  E.g. if we have:
>>>
>>>   x = &foo;
>>>   do
>>>     x += 8;
>>>   while (*y++ < 10)
>>>   ...potential vector use of *x...
>>>
>>> we wouldn't be able to express the address of x after the do-while
>>> loop, because there's nothing that counts the number of iterations.
>>> But the uninstantiated SCEV could tell us that x = &foo + N * 8 for
>>> some (unknown) N.
>>
>> Not sure that it works that way.  I'm not 100% sure what kind of
>> parameters are left symbolic, and if analysis loop and instantiation
>> "loop" are the same I guess it doesn't make any difference...
>>
>>> (OK, so that doesn't actually work unless we take the effort
>>> to look through loop-closed SSA form, but still...)
>>>
>>>>> +  /* Peel chrecs and record the minimum alignment preserved by
>>>>> +     all steps.  */
>>>>> +  unsigned int alignment = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT;
>>>>> +  while (TREE_CODE (base) == POLYNOMIAL_CHREC)
>>>>> +    {
>>>>> + unsigned int step_alignment = highest_pow2_factor (CHREC_RIGHT
>>>>> (base));
>>>>> +      alignment = MIN (alignment, step_alignment);
>>>>> +      base = CHREC_LEFT (base);
>>>>> +    }
>>>>> +
>>>>> +  /* Punt if the expression is too complicated to handle.  */
>>>>> + if (tree_contains_chrecs (base, NULL) || !POINTER_TYPE_P (TREE_TYPE
>>>>> (base)))
>>>>> +    return NULL_TREE;
>>>>> +
>>>>> +  /* Analyze the base to which the steps we peeled were applied.  */
>>>>> +  poly_int64 bitsize, bitpos, bytepos;
>>>>> +  machine_mode mode;
>>>>> +  int unsignedp, reversep, volatilep;
>>>>> +  tree offset;
>>>>> +  base = get_inner_reference (build_fold_indirect_ref (base),
>>>>
>>>> ick :/
>>>>
>>>> what happens if you simply use get_pointer_alignment here and
>>>> strip down POINTER_PLUS_EXPRs to the ultimate LHS?  (basically
>>>> what get_pointer_alignment_1 does)  After all get_base_for_alignment_1
>>>> itself only deals with plain SSA_NAMEs, not, say, &a_1->b.c or &MEM[a_1 + 4].
>>>
>>> Yeah, but those have (hopefully) been handled by dr_analyze_innermost
>>> already, which should have stripped off both the constant and variable
>>> parts of the offset.  So I think SSA names are the only interesting
>>> input.  The problem is that once we follow the definitions of an SSA
>>> name through CHREC_LEFTs, we get a general address again.
>>>
>>> get_pointer_alignment and get_pointer_alignment_1 don't do what we want
>>> because they take the alignment of the existing object into account,
>>> whereas here we explicitly want to ignore that and only calculate the
>>> alignment of the offset.
>>
>> Ah, indeed - I missed that fact.
>>
>>> I guess we could pass a flag down to ignore the current alignment though?
>>
>> But we need the base anyway...  so, given we can only ever re-align decls
>> and never plain pointers instead of using build_fold_indirect_ref do
>>
>>  if (TREE_CODE (base) != ADDR_EXPR)
>>    return NULL_TREE;
>>
>> else use TREE_OPERAND (base, 0)?
>
> The build_fold_indirect_ref also helps for POINTER_PLUS_EXPR,
> which is what we get for things like the do-while above (e.g.
> { &a + 64, +, 64 }_n if x points to a double *.)
>
> I guess everything we care about would be handled by fold_indirect_ref_1
> though, so would that be OK instead?

Sounds like a compromise ;)

Patch is ok with that change.

Thanks,
Richard.

> Thanks,
> Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]