This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Use SCEV information when aligning for vectorisation (PR 84005)
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Sandiford <richard dot sandiford at linaro dot org>
- Date: Mon, 19 Mar 2018 15:38:14 +0100
- Subject: Re: Use SCEV information when aligning for vectorisation (PR 84005)
- References: <871sgj0ym5.fsf@linaro.org>
On Sat, Mar 17, 2018 at 11:45 AM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> This PR is another regression caused by the removal of the simple_iv
> check in dr_analyze_innermost for BB analysis. Without splitting out
> the step, we weren't able to find an underlying object whose alignment
> could be increased.
>
> As with PR81635, I think the simple_iv was only handling one special
> case of something that ought to be more general. The more general
> thing here is that if the address can be analysed as a scalar
> evolution, and if all updates preserve alignment N, it's possible
> to align the address to N by increasing the alignment of the base
> object to N. That applies also to outer loops, and to both loop
> and BB analysis.
>
> I wasn't sure where the new functions ought to live, but tree-data-ref.c
> seemed OK since (a) that already does scev analysis on addresses and
> (b) you'd want to use dr_analyze_innermost first if you were analysing
> a reference.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install?
>
> Richard
>
>
> 2018-03-17 Richard Sandiford <richard.sandiford@linaro.org>
>
> gcc/
> PR tree-optimization/84005
> * tree-data-ref.h (get_base_for_alignment): Declare.
> * tree-data-ref.c (get_base_for_alignment_1): New function.
> (get_base_for_alignment): Likewise.
> * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
> get_base_for_alignment to find a suitable base object, instead
> of always using drb->base_address.
>
> gcc/testsuite/
> PR tree-optimization/84005
> * gcc.dg/vect/bb-slp-1.c: Make sure there is no message about
> failing to force the alignment.
>
> Index: gcc/tree-data-ref.h
> ===================================================================
> --- gcc/tree-data-ref.h 2018-01-13 18:02:00.948360274 +0000
> +++ gcc/tree-data-ref.h 2018-03-17 10:43:42.544669549 +0000
> @@ -463,6 +463,7 @@ extern bool compute_all_dependences (vec
> extern tree find_data_references_in_bb (struct loop *, basic_block,
> vec<data_reference_p> *);
> extern unsigned int dr_alignment (innermost_loop_behavior *);
> +extern tree get_base_for_alignment (tree, unsigned int *);
>
> /* Return the alignment in bytes that DR is guaranteed to have at all
> times. */
> Index: gcc/tree-data-ref.c
> ===================================================================
> --- gcc/tree-data-ref.c 2018-02-14 13:14:36.268006810 +0000
> +++ gcc/tree-data-ref.c 2018-03-17 10:43:42.544669549 +0000
> @@ -5200,6 +5200,75 @@ dr_alignment (innermost_loop_behavior *d
> return alignment;
> }
>
> +/* If BASE is a pointer-typed SSA name, try to find the object that it
> + is based on. Return this object X on success and store the alignment
> + in bytes of BASE - &X in *ALIGNMENT_OUT. */
> +
> +static tree
> +get_base_for_alignment_1 (tree base, unsigned int *alignment_out)
> +{
> + if (TREE_CODE (base) != SSA_NAME || !POINTER_TYPE_P (TREE_TYPE (base)))
> + return NULL_TREE;
> +
> + gimple *def = SSA_NAME_DEF_STMT (base);
> + base = analyze_scalar_evolution (loop_containing_stmt (def), base);
I think it is an error to use the def stmt of base here. Instead you
need to pass down
the point/loop of the use. For the same reason you also want to instantiate
parameters after analyzing the evolution.
In the end, if the BB to be vectorized is contained in a loop nest we want to
instantiate up to the point where (eventually) a DECL we can re-align appears,
but using the containing loop for now looks OK.
> + /* Peel chrecs and record the minimum alignment preserved by
> + all steps. */
> + unsigned int alignment = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT;
> + while (TREE_CODE (base) == POLYNOMIAL_CHREC)
> + {
> + unsigned int step_alignment = highest_pow2_factor (CHREC_RIGHT (base));
> + alignment = MIN (alignment, step_alignment);
> + base = CHREC_LEFT (base);
> + }
> +
> + /* Punt if the expression is too complicated to handle. */
> + if (tree_contains_chrecs (base, NULL) || !POINTER_TYPE_P (TREE_TYPE (base)))
> + return NULL_TREE;
> +
> + /* Analyze the base to which the steps we peeled were applied. */
> + poly_int64 bitsize, bitpos, bytepos;
> + machine_mode mode;
> + int unsignedp, reversep, volatilep;
> + tree offset;
> + base = get_inner_reference (build_fold_indirect_ref (base),
ick :/
what happens if you simply use get_pointer_alignment here and
strip down POINTER_PLUS_EXPRs to the ultimate LHS? (basically
what get_pointer_alignment_1 does) After all get_base_for_alignment_1
itself only deals with plain SSA_NAMEs, not, say, &a_1->b.c or &MEM[a_1 + 4].
Thanks,
Richard.
> + &bitsize, &bitpos, &offset, &mode,
> + &unsignedp, &reversep, &volatilep);
> + if (!base || !multiple_p (bitpos, BITS_PER_UNIT, &bytepos))
> + return NULL_TREE;
> +
> + /* Restrict the alignment to that guaranteed by the offsets. */
> + unsigned int bytepos_alignment = known_alignment (bytepos);
> + if (bytepos_alignment != 0)
> + alignment = MIN (alignment, bytepos_alignment);
> + if (offset)
> + {
> + unsigned int offset_alignment = highest_pow2_factor (offset);
> + alignment = MIN (alignment, offset_alignment);
> + }
> +
> + *alignment_out = alignment;
> + return base;
> +}
> +
> +/* Return the object whose alignment would need to be changed in order
> + to increase the alignment of ADDR. Store the maximum achievable
> + alignment in *MAX_ALIGNMENT. */
> +
> +tree
> +get_base_for_alignment (tree addr, unsigned int *max_alignment)
> +{
> + tree base = get_base_for_alignment_1 (addr, max_alignment);
> + if (base)
> + return base;
> +
> + if (TREE_CODE (addr) == ADDR_EXPR)
> + addr = TREE_OPERAND (addr, 0);
> + *max_alignment = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT;
> + return addr;
> +}
> +
> /* Recursive helper function. */
>
> static bool
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> --- gcc/tree-vect-data-refs.c 2018-03-02 09:45:40.478265383 +0000
> +++ gcc/tree-vect-data-refs.c 2018-03-17 10:43:42.544669549 +0000
> @@ -957,11 +957,11 @@ vect_compute_data_ref_alignment (struct
>
> if (base_alignment < vector_alignment)
> {
> - tree base = drb->base_address;
> - if (TREE_CODE (base) == ADDR_EXPR)
> - base = TREE_OPERAND (base, 0);
> - if (!vect_can_force_dr_alignment_p (base,
> - vector_alignment * BITS_PER_UNIT))
> + unsigned int max_alignment;
> + tree base = get_base_for_alignment (drb->base_address, &max_alignment);
> + if (max_alignment < vector_alignment
> + || !vect_can_force_dr_alignment_p (base,
> + vector_alignment * BITS_PER_UNIT))
> {
> if (dump_enabled_p ())
> {
> Index: gcc/testsuite/gcc.dg/vect/bb-slp-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/bb-slp-1.c 2017-11-09 15:15:06.846650880 +0000
> +++ gcc/testsuite/gcc.dg/vect/bb-slp-1.c 2018-03-17 10:43:42.541669742 +0000
> @@ -54,5 +54,5 @@ int main (void)
> return 0;
> }
>
> +/* { dg-final { scan-tree-dump-not "can't force alignment" "slp1" } } */
> /* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" } } */
> -