This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PR81635: Use chrecs to help find related data refs


On Wed, Aug 16, 2017 at 2:38 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> The first loop in the testcase regressed after my recent changes to
> dr_analyze_innermost.  Previously we would treat "i" as an iv even
> for bb analysis and end up with:
>
>    DR_BASE_ADDRESS: p or q
>    DR_OFFSET: 0
>    DR_INIT: 0 or 4
>    DR_STEP: 16
>
> We now always keep the step as 0 instead, so for an int "i" we'd have:
>
>    DR_BASE_ADDRESS: p or q
>    DR_OFFSET: (intptr_t) i
>    DR_INIT: 0 or 4
>    DR_STEP: 0
>
> This is also what we'd like to have for the unsigned "i", but the
> problem is that strip_constant_offset thinks that the "i + 1" in
> "(intptr_t) (i + 1)" could wrap and so doesn't peel off the "+ 1".
> The [i + 1] accesses therefore have a DR_OFFSET equal to the SSA
> name that holds "(intptr_t) (i + 1)", meaning that the accesses no
> longer seem to be related to the [i] ones.

Didn't read the change in detail, so sorry if I mis-understood the issue.
I made changes in scev to better fold type conversion by various sources
of information, for example, vrp, niters, undefined overflow behavior etc.
In theory these information should be available for other optimizers without
querying scev.  For the mentioned test, vrp should compute accurate range
information for "i" so that we can fold (intptr_t) (i + 1) it without worrying
overflow.  Note we don't do it in generic folding because (intptr_t) (i) + 1
could be more expensive (especially in case of (T)(i + j)), or because the
CST part is in bigger precision after conversion.
But such folding is wanted in several places, e.g, IVOPTs.  To provide such
an interface, we changed tree-affine and made it do aggressive fold.  I am
curious if it's possible to use aff_tree to implement strip_constant_offset
here since aggressive folding is wanted.  After all, using additional chrec
looks like a little heavy wrto the simple test.

Thanks,
bin
>
> There seem to be two main uses of DR_OFFSET and DR_INIT.  One type
> expects DR_OFFSET and DR_INIT to be generic expressions whose sum
> gives the initial offset from DR_BASE_ADDRESS.  The other type uses
> the pair (DR_BASE_ADDRESS, DR_OFFSET) to identify related data
> references, with the distance between their start addresses being
> the difference between their DR_INITs.  For this second type, the
> exact form of DR_OFFSET doesn't really matter, and the more it is
> able to canonicalise the non-constant offset, the better.
>
> The patch fixes the PR by adding another offset/init pair to the
> innermost loop behaviour for this second group.  The new pair use chrecs
> rather than generic exprs for the offset part, with the chrec being
> analysed in the innermost loop for which the offset isn't invariant.
> This allows us to vectorise the second function in the testcase
> as well, which wasn't possible before the earlier patch.
>
> Tested on x86_64-linux-gnu and aarch64-linux-gnu.  OK to install?
>
> Richard
>
>
> gcc/
>         PR tree-optimization/81635
>         * tree-data-ref.h (innermost_loop_behavior): Add chrec_offset and
>         chrec_init.
>         (DR_CHREC_OFFSET, DR_CHREC_INIT): New macros.
>         (dr_equal_offsets_p): Delete.
>         (dr_chrec_offsets_equal_p, dr_chrec_offsets_compare): Declare.
>         * tree-data-ref.c: Include tree-ssa-loop-ivopts.h.
>         (split_constant_offset): Handle POLYNOMIAL_CHREC.
>         (dr_analyze_innermost): Initialize dr_chrec_offset and dr_chrec_init.
>         (operator ==): Use dr_chrec_offsets_equal_p and compare the
>         DR_CHREC_INITs.
>         (prune_runtime_alias_test_list): Likewise.
>         (comp_dr_with_seg_len_pair): Use dr_chrec_offsets_compare and compare
>         the DR_CHREC_INITs.
>         (dr_equal_offsets_p1, dr_equal_offsets_p): Delete.
>         (analyze_offset_scev): New function.
>         (compute_offset_chrecs): Likewise.
>         (dr_chrec_offsets_equal_p): Likewise.
>         (dr_chrec_offsets_compare): Likewise.
>         * tree-loop-distribution.c (compute_alias_check_pairs): Use
>         dr_chrec_offsets_compare.
>         * tree-vect-data-refs.c (vect_find_same_alignment_drs): Use
>         dr_chrec_offsets_compare and compare the DR_CHREC_INITs.
>         (dr_group_sort_cmp): Likewise.
>         (vect_analyze_group_access_1): Use DR_CHREC_INIT instead of DR_INIT.
>         (vect_no_alias_p): Likewise.
>         (vect_analyze_data_ref_accesses): Use dr_chrec_offsets_equal_p and
>         compare the DR_CHREC_INITs.
>         (vect_prune_runtime_alias_test_list): Use dr_chrec_offsets_compare.
>         * tree-vect-stmts.c (vectorizable_load): Use DR_CHREC_INIT instead
>         of DR_INIT.
>
> gcc/testsuite/
>         PR tree-optimization/81635
>         * gcc.dg/vect/bb-slp-pr81635.c: New test.
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]