This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Handle data dependence relations with different bases
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Sandiford <richard dot sandiford at linaro dot org>
- Date: Fri, 4 Aug 2017 11:35:28 +0200
- Subject: Re: Handle data dependence relations with different bases
- Authentication-results: sourceware.org; auth=none
- References: <87tw52s73r.fsf@linaro.org> <CAFiYyc0MREgfp6GCL6=d1hkRaEVX8NOAaX=hNgaGooEs8L5QbQ@mail.gmail.com> <87lgp38p59.fsf@linaro.org> <3947361.mLME8MOua7@polaris> <87y3s08hy2.fsf@linaro.org> <87wp6urs0j.fsf@linaro.org> <CAFiYyc3RhH453YJUX8m7856NDOqAuMkpREW6FBEbzid3V2SDsw@mail.gmail.com> <87mv7fof5a.fsf@linaro.org>
On Fri, Aug 4, 2017 at 11:28 AM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Thu, Jul 27, 2017 at 2:19 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>> Eric Botcazou <ebotcazou@adacore.com> writes:
>>>>> [Sorry for missing the previous messages]
>>>>>
>>>>>> Thanks. Just been retesting, and I think I must have forgotten
>>>>>> to include Ada last time. It turns out that the patch causes a dg-scan
>>>>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>>>>> array RECORD_TYPEs *do* alias in:
>>>>>>
>>>>>> procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>>>> begin
>>>>>> for I in Sarray'Range loop
>>>>>> R(I) := X(I) + Y(I);
>>>>>> end loop;
>>>>>> end;
>>>>>>
>>>>>> then the dependence distance must be zero. Eric, does that hold true
>>>>>> for Ada? I.e. if X and R (or Y and R) alias, must it be the case that
>>>>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>>>>
>>>>> Yes, I'd think so (even without the artificial RECORD_TYPE around
>> the arrays).
>>>>
>>>> Good!
>>>>
>>>>>> 2017-06-07 Richard Sandiford <richard.sandiford@linaro.org>
>>>>>>
>>>>>> gcc/testsuite/
>>>>>> * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>>>>> * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>>>>> when X = R or Y = R.
>>>>>
>>>>> I think that you need to modify vect15 and vect16 the same way.
>>>>
>>>> Ah, yeah. And doing that shows that I'd not handled safelen for
>>>> DDR_COULD_BE_INDEPENDENT_P. I've fixed that locally.
>>>>
>>>> How does this look? Tested on x86_64-linux-gnu both without the
>>>> vectoriser changes and with the fixed vectoriser patch.
>>>
>>> Here's a version of the patch that handles safelen. I split the
>>> handling out into a new function (vect_analyze_possibly_independent_ddr)
>>> since it was getting too big to do inline.
>>>
>>> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install?
>>
>> Ok.
>
> Thanks!
>
>> Did you check whether BB vectorization is affected? See
>> vect_slp_analyze_instance_dependence
>> and friends. It's quite conservative but given the prefetching change
>> I wonder if we need
>> to rule out DDR_COULD_BE_INDEPENDENT_P?
>
> I think it should be OK. When DDR_COULD_BE_INDEPENDENT_P is set,
> we've effectively changed from DDR_ARE_DEPENDENT == chrec_dont_know
> to a conservatively-correct distance vector. It looks like
> vect_slp_analyze_data_ref_dependence handles both cases in the
> same way (by returning true).
Yes. Could be improved of course.
Thanks for double-checking.
Richard.
> Thanks,
> Richard
>
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Richard
>>>
>>>
>>> 2017-07-27 Richard Sandiford <richard.sandiford@linaro.org>
>>>
>>> gcc/
>>> * tree-data-ref.h (subscript): Add access_fn field.
>>> (data_dependence_relation): Add could_be_independent_p.
>>> (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
>>> (same_access_functions): Move to tree-data-ref.c.
>>> * tree-data-ref.c (ref_contains_union_access_p): New function.
>>> (access_fn_component_p): Likewise.
>>> (access_fn_components_comparable_p): Likewise.
>>> (dr_analyze_indices): Add a reference to access_fn_component_p.
>>> (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
>>> DR_ACCESS_FN.
>>> (constant_access_functions): Likewise.
>>> (add_other_self_distances): Likewise.
>>> (same_access_functions): Likewise. (Moved from tree-data-ref.h.)
>>> (initialize_data_dependence_relation): Use XCNEW and remove
>>> explicit zeroing of DDR_REVERSED_P. Look for a subsequence
>>> of access functions that have the same type. Allow the
>>> subsequence to end with different bases in some circumstances.
>>> Record the chosen access functions in SUB_ACCESS_FN.
>>> (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
>>> a_index and b_index. Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
>>> (subscript_dependence_tester_1): Likewise dra and drb.
>>> (build_classic_dist_vector): Update calls accordingly.
>>> (subscript_dependence_tester): Likewise.
>>> * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
>>> DDR_COULD_BE_INDEPENDENT_P.
>>> * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
>>> comp_alias_ddrs instead of may_alias_ddrs.
>>> * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
>>> New function.
>>> (vect_analyze_data_ref_dependence): Use it if
>>> DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
>>> distance vectors if that fails.
>>> (dependence_distance_ge_vf): New function.
>>> (vect_prune_runtime_alias_test_list): Use it. Don't clear
>>> LOOP_VINFO_MAY_ALIAS_DDRS.
>>>
>>> gcc/testsuite/
>>> * gcc.dg/vect/vect-alias-check-3.c: New test.
>>> * gcc.dg/vect/vect-alias-check-4.c: Likewise.
>>> * gcc.dg/vect/vect-alias-check-5.c: Likewise.
>>>
>>> Index: gcc/tree-data-ref.h
>>> ===================================================================
>>> --- gcc/tree-data-ref.h 2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-data-ref.h 2017-07-27 13:10:33.023912613 +0100
>>> @@ -260,6 +260,9 @@ struct conflict_function
>>>
>>> struct subscript
>>> {
>>> + /* The access functions of the two references. */
>>> + tree access_fn[2];
>>> +
>>> /* A description of the iterations for which the elements are
>>> accessed twice. */
>>> conflict_function *conflicting_iterations_in_a;
>>> @@ -278,6 +281,7 @@ struct subscript
>>>
>>> typedef struct subscript *subscript_p;
>>>
>>> +#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
>>> #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
>>> #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
>>> #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
>>> @@ -333,6 +337,33 @@ struct data_dependence_relation
>>> /* Set to true when the dependence relation is on the same data
>>> access. */
>>> bool self_reference_p;
>>> +
>>> + /* True if the dependence described is conservatively correct rather
>>> + than exact, and if it is still possible for the accesses to be
>>> + conditionally independent. For example, the a and b references in:
>>> +
>>> + struct s *a, *b;
>>> + for (int i = 0; i < n; ++i)
>>> + a->f[i] += b->f[i];
>>> +
>>> + conservatively have a distance vector of (0), for the case in which
>>> + a == b, but the accesses are independent if a != b. Similarly,
>>> + the a and b references in:
>>> +
>>> + struct s *a, *b;
>>> + for (int i = 0; i < n; ++i)
>>> + a[0].f[i] += b[i].f[i];
>>> +
>>> + conservatively have a distance vector of (0), but they are indepenent
>>> + when a != b + i. In contrast, the references in:
>>> +
>>> + struct s *a;
>>> + for (int i = 0; i < n; ++i)
>>> + a->f[i] += a->f[i];
>>> +
>>> + have the same distance vector of (0), but the accesses can never be
>>> + independent. */
>>> + bool could_be_independent_p;
>>> };
>>>
>>> typedef struct data_dependence_relation *ddr_p;
>>> @@ -363,6 +394,7 @@ #define DDR_DIR_VECT(DDR, I) \
>>> #define DDR_DIST_VECT(DDR, I) \
>>> DDR_DIST_VECTS (DDR)[I]
>>> #define DDR_REVERSED_P(DDR) (DDR)->reversed_p
>>> +#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
>>>
>>>
>>> bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *);
>>> @@ -457,22 +489,6 @@ same_data_refs (data_reference_p a, data
>>> return false;
>>>
>>> return true;
>>> -}
>>> -
>>> -/* Return true when the DDR contains two data references that have the
>>> - same access functions. */
>>> -
>>> -static inline bool
>>> -same_access_functions (const struct data_dependence_relation *ddr)
>>> -{
>>> - unsigned i;
>>> -
>>> - for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> - if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
>>> - DR_ACCESS_FN (DDR_B (ddr), i)))
>>> - return false;
>>> -
>>> - return true;
>>> }
>>>
>>> /* Returns true when all the dependences are computable. */
>>> Index: gcc/tree-data-ref.c
>>> ===================================================================
>>> --- gcc/tree-data-ref.c 2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-data-ref.c 2017-07-27 13:10:33.023912613 +0100
>>> @@ -124,8 +124,7 @@ Software Foundation; either version 3, o
>>> } dependence_stats;
>>>
>>> static bool subscript_dependence_tester_1 (struct data_dependence_relation *,
>>> - struct data_reference *,
>>> - struct data_reference *,
>>> + unsigned int, unsigned int,
>>> struct loop *);
>>> /* Returns true iff A divides B. */
>>>
>>> @@ -145,6 +144,21 @@ int_divides_p (int a, int b)
>>> return ((b % a) == 0);
>>> }
>>>
>>> +/* Return true if reference REF contains a union access. */
>>> +
>>> +static bool
>>> +ref_contains_union_access_p (tree ref)
>>> +{
>>> + while (handled_component_p (ref))
>>> + {
>>> + ref = TREE_OPERAND (ref, 0);
>>> + if (TREE_CODE (TREE_TYPE (ref)) == UNION_TYPE
>>> + || TREE_CODE (TREE_TYPE (ref)) == QUAL_UNION_TYPE)
>>> + return true;
>>> + }
>>> + return false;
>>> +}
>>> +
>>>
>>>
>>> /* Dump into FILE all the data references from DATAREFS. */
>>> @@ -434,13 +448,14 @@ dump_data_dependence_relation (FILE *out
>>> unsigned int i;
>>> struct loop *loopi;
>>>
>>> - for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> + subscript *sub;
>>> + FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>> {
>>> fprintf (outf, " access_fn_A: ");
>>> - print_generic_stmt (outf, DR_ACCESS_FN (dra, i));
>>> + print_generic_stmt (outf, SUB_ACCESS_FN (sub, 0));
>>> fprintf (outf, " access_fn_B: ");
>>> - print_generic_stmt (outf, DR_ACCESS_FN (drb, i));
>>> - dump_subscript (outf, DDR_SUBSCRIPT (ddr, i));
>>> + print_generic_stmt (outf, SUB_ACCESS_FN (sub, 1));
>>> + dump_subscript (outf, sub);
>>> }
>>>
>>> fprintf (outf, " inner loop index: %d\n", DDR_INNER_LOOP (ddr));
>>> @@ -920,6 +935,27 @@ dr_analyze_innermost (innermost_loop_beh
>>> return true;
>>> }
>>>
>>> +/* Return true if OP is a valid component reference for a DR access
>>> + function. This accepts a subset of what handled_component_p accepts. */
>>> +
>>> +static bool
>>> +access_fn_component_p (tree op)
>>> +{
>>> + switch (TREE_CODE (op))
>>> + {
>>> + case REALPART_EXPR:
>>> + case IMAGPART_EXPR:
>>> + case ARRAY_REF:
>>> + return true;
>>> +
>>> + case COMPONENT_REF:
>>> + return TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) == RECORD_TYPE;
>>> +
>>> + default:
>>> + return false;
>>> + }
>>> +}
>>> +
>>> /* Determines the base object and the list of indices of memory reference
>>> DR, analyzed in LOOP and instantiated in loop nest NEST. */
>>>
>>> @@ -957,7 +993,9 @@ dr_analyze_indices (struct data_referenc
>>> access_fns.safe_push (integer_one_node);
>>> }
>>>
>>> - /* Analyze access functions of dimensions we know to be independent. */
>>> + /* Analyze access functions of dimensions we know to be independent.
>>> + The list of component references handled here should be kept in
>>> + sync with access_fn_component_p. */
>>> while (handled_component_p (ref))
>>> {
>>> if (TREE_CODE (ref) == ARRAY_REF)
>>> @@ -2148,6 +2186,38 @@ dr_may_alias_p (const struct data_refere
>>> return refs_may_alias_p (addr_a, addr_b);
>>> }
>>>
>>> +/* REF_A and REF_B both satisfy access_fn_component_p. Return true
>>> + if it is meaningful to compare their associated access functions
>>> + when checking for dependencies. */
>>> +
>>> +static bool
>>> +access_fn_components_comparable_p (tree ref_a, tree ref_b)
>>> +{
>>> + /* Allow pairs of component refs from the following sets:
>>> +
>>> + { REALPART_EXPR, IMAGPART_EXPR }
>>> + { COMPONENT_REF }
>>> + { ARRAY_REF }. */
>>> + tree_code code_a = TREE_CODE (ref_a);
>>> + tree_code code_b = TREE_CODE (ref_b);
>>> + if (code_a == IMAGPART_EXPR)
>>> + code_a = REALPART_EXPR;
>>> + if (code_b == IMAGPART_EXPR)
>>> + code_b = REALPART_EXPR;
>>> + if (code_a != code_b)
>>> + return false;
>>> +
>>> + if (TREE_CODE (ref_a) == COMPONENT_REF)
>>> + /* ??? We cannot simply use the type of operand #0 of the refs here as
>>> + the Fortran compiler smuggles type punning into COMPONENT_REFs.
>>> + Use the DECL_CONTEXT of the FIELD_DECLs instead. */
>>> + return (DECL_CONTEXT (TREE_OPERAND (ref_a, 1))
>>> + == DECL_CONTEXT (TREE_OPERAND (ref_b, 1)));
>>> +
>>> + return types_compatible_p (TREE_TYPE (TREE_OPERAND (ref_a, 0)),
>>> + TREE_TYPE (TREE_OPERAND (ref_b, 0)));
>>> +}
>>> +
>>> /* Initialize a data dependence relation between data accesses A and
>>> B. NB_LOOPS is the number of loops surrounding the references: the
>>> size of the classic distance/direction vectors. */
>>> @@ -2160,11 +2230,10 @@ initialize_data_dependence_relation (str
>>> struct data_dependence_relation *res;
>>> unsigned int i;
>>>
>>> - res = XNEW (struct data_dependence_relation);
>>> + res = XCNEW (struct data_dependence_relation);
>>> DDR_A (res) = a;
>>> DDR_B (res) = b;
>>> DDR_LOOP_NEST (res).create (0);
>>> - DDR_REVERSED_P (res) = false;
>>> DDR_SUBSCRIPTS (res).create (0);
>>> DDR_DIR_VECTS (res).create (0);
>>> DDR_DIST_VECTS (res).create (0);
>>> @@ -2182,82 +2251,277 @@ initialize_data_dependence_relation (str
>>> return res;
>>> }
>>>
>>> - /* The case where the references are exactly the same. */
>>> - if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
>>> + unsigned int num_dimensions_a = DR_NUM_DIMENSIONS (a);
>>> + unsigned int num_dimensions_b = DR_NUM_DIMENSIONS (b);
>>> + if (num_dimensions_a == 0 || num_dimensions_b == 0)
>>> {
>>> - if ((loop_nest.exists ()
>>> - && !object_address_invariant_in_loop_p (loop_nest[0],
>>> - DR_BASE_OBJECT (a)))
>>> - || DR_NUM_DIMENSIONS (a) == 0)
>>> + DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> + return res;
>>> + }
>>> +
>>> + /* For unconstrained bases, the root (highest-indexed) subscript
>>> + describes a variation in the base of the original DR_REF rather
>>> + than a component access. We have no type that accurately describes
>>> + the new DR_BASE_OBJECT (whose TREE_TYPE describes the type *after*
>>> + applying this subscript) so limit the search to the last real
>>> + component access.
>>> +
>>> + E.g. for:
>>> +
>>> + void
>>> + f (int a[][8], int b[][8])
>>> + {
>>> + for (int i = 0; i < 8; ++i)
>>> + a[i * 2][0] = b[i][0];
>>> + }
>>> +
>>> + the a and b accesses have a single ARRAY_REF component reference [0]
>>> + but have two subscripts. */
>>> + if (DR_UNCONSTRAINED_BASE (a))
>>> + num_dimensions_a -= 1;
>>> + if (DR_UNCONSTRAINED_BASE (b))
>>> + num_dimensions_b -= 1;
>>> +
>>> + /* These structures describe sequences of component references in
>>> + DR_REF (A) and DR_REF (B). Each component reference is tied to a
>>> + specific access function. */
>>> + struct {
>>> + /* The sequence starts at DR_ACCESS_FN (A, START_A) of A and
>>> + DR_ACCESS_FN (B, START_B) of B (inclusive) and extends to higher
>>> + indices. In C notation, these are the indices of the rightmost
>>> + component references; e.g. for a sequence .b.c.d, the start
>>> + index is for .d. */
>>> + unsigned int start_a;
>>> + unsigned int start_b;
>>> +
>>> + /* The sequence contains LENGTH consecutive access functions from
>>> + each DR. */
>>> + unsigned int length;
>>> +
>>> + /* The enclosing objects for the A and B sequences respectively,
>>> + i.e. the objects to which DR_ACCESS_FN (A, START_A + LENGTH - 1)
>>> + and DR_ACCESS_FN (B, START_B + LENGTH - 1) are applied. */
>>> + tree object_a;
>>> + tree object_b;
>>> + } full_seq = {}, struct_seq = {};
>>> +
>>> + /* Before each iteration of the loop:
>>> +
>>> + - REF_A is what you get after applying DR_ACCESS_FN (A, INDEX_A) and
>>> + - REF_B is what you get after applying DR_ACCESS_FN (B, INDEX_B). */
>>> + unsigned int index_a = 0;
>>> + unsigned int index_b = 0;
>>> + tree ref_a = DR_REF (a);
>>> + tree ref_b = DR_REF (b);
>>> +
>>> + /* Now walk the component references from the final DR_REFs back up to
>>> + the enclosing base objects. Each component reference corresponds
>>> + to one access function in the DR, with access function 0 being for
>>> + the final DR_REF and the highest-indexed access function being the
>>> + one that is applied to the base of the DR.
>>> +
>>> + Look for a sequence of component references whose access functions
>>> + are comparable (see access_fn_components_comparable_p). If more
>>> + than one such sequence exists, pick the one nearest the base
>>> + (which is the leftmost sequence in C notation). Store this sequence
>>> + in FULL_SEQ.
>>> +
>>> + For example, if we have:
>>> +
>>> + struct foo { struct bar s; ... } (*a)[10], (*b)[10];
>>> +
>>> + A: a[0][i].s.c.d
>>> + B: __real b[0][i].s.e[i].f
>>> +
>>> + (where d is the same type as the real component of f) then the access
>>> + functions would be:
>>> +
>>> + 0 1 2 3
>>> + A: .d .c .s [i]
>>> +
>>> + 0 1 2 3 4 5
>>> + B: __real .f [i] .e .s [i]
>>> +
>>> + The A0/B2 column isn't comparable, since .d is a COMPONENT_REF
>>> + and [i] is an ARRAY_REF. However, the A1/B3 column contains two
>>> + COMPONENT_REF accesses for struct bar, so is comparable. Likewise
>>> + the A2/B4 column contains two COMPONENT_REF accesses for struct foo,
>>> + so is comparable. The A3/B5 column contains two ARRAY_REFs that
>>> + index foo[10] arrays, so is again comparable. The sequence is
>>> + therefore:
>>> +
>>> + A: [1, 3] (i.e. [i].s.c)
>>> + B: [3, 5] (i.e. [i].s.e)
>>> +
>>> + Also look for sequences of component references whose access
>>> + functions are comparable and whose enclosing objects have the same
>>> + RECORD_TYPE. Store this sequence in STRUCT_SEQ. In the above
>>> + example, STRUCT_SEQ would be:
>>> +
>>> + A: [1, 2] (i.e. s.c)
>>> + B: [3, 4] (i.e. s.e) */
>>> + while (index_a < num_dimensions_a && index_b < num_dimensions_b)
>>> + {
>>> + /* REF_A and REF_B must be one of the component access types
>>> + allowed by dr_analyze_indices. */
>>> + gcc_checking_assert (access_fn_component_p (ref_a));
>>> + gcc_checking_assert (access_fn_component_p (ref_b));
>>> +
>>> + /* Get the immediately-enclosing objects for REF_A and REF_B,
>>> + i.e. the references *before* applying DR_ACCESS_FN (A, INDEX_A)
>>> + and DR_ACCESS_FN (B, INDEX_B). */
>>> + tree object_a = TREE_OPERAND (ref_a, 0);
>>> + tree object_b = TREE_OPERAND (ref_b, 0);
>>> +
>>> + tree type_a = TREE_TYPE (object_a);
>>> + tree type_b = TREE_TYPE (object_b);
>>> + if (access_fn_components_comparable_p (ref_a, ref_b))
>>> + {
>>> + /* This pair of component accesses is comparable for dependence
>>> + analysis, so we can include DR_ACCESS_FN (A, INDEX_A) and
>>> + DR_ACCESS_FN (B, INDEX_B) in the sequence. */
>>> + if (full_seq.start_a + full_seq.length != index_a
>>> + || full_seq.start_b + full_seq.length != index_b)
>>> + {
>>> + /* The accesses don't extend the current sequence,
>>> + so start a new one here. */
>>> + full_seq.start_a = index_a;
>>> + full_seq.start_b = index_b;
>>> + full_seq.length = 0;
>>> + }
>>> +
>>> + /* Add this pair of references to the sequence. */
>>> + full_seq.length += 1;
>>> + full_seq.object_a = object_a;
>>> + full_seq.object_b = object_b;
>>> +
>>> + /* If the enclosing objects are structures (and thus have the
>>> + same RECORD_TYPE), record the new sequence in STRUCT_SEQ. */
>>> + if (TREE_CODE (type_a) == RECORD_TYPE)
>>> + struct_seq = full_seq;
>>> +
>>> + /* Move to the next containing reference for both A and B. */
>>> + ref_a = object_a;
>>> + ref_b = object_b;
>>> + index_a += 1;
>>> + index_b += 1;
>>> + continue;
>>> + }
>>> +
>>> + /* Try to approach equal type sizes. */
>>> + if (!COMPLETE_TYPE_P (type_a)
>>> + || !COMPLETE_TYPE_P (type_b)
>>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
>>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
>>> + break;
>>> +
>>> + unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT (type_a));
>>> + unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT (type_b));
>>> + if (size_a <= size_b)
>>> {
>>> - DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> - return res;
>>> + index_a += 1;
>>> + ref_a = object_a;
>>> + }
>>> + if (size_b <= size_a)
>>> + {
>>> + index_b += 1;
>>> + ref_b = object_b;
>>> }
>>> - DDR_AFFINE_P (res) = true;
>>> - DDR_ARE_DEPENDENT (res) = NULL_TREE;
>>> - DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
>>> - DDR_LOOP_NEST (res) = loop_nest;
>>> - DDR_INNER_LOOP (res) = 0;
>>> - DDR_SELF_REFERENCE (res) = true;
>>> - for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
>>> - {
>>> - struct subscript *subscript;
>>> -
>>> - subscript = XNEW (struct subscript);
>>> - SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
>>> - SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
>>> - SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
>>> - SUB_DISTANCE (subscript) = chrec_dont_know;
>>> - DDR_SUBSCRIPTS (res).safe_push (subscript);
>>> - }
>>> - return res;
>>> }
>>>
>>> - /* If the references do not access the same object, we do not know
>>> - whether they alias or not. We do not care about TBAA or alignment
>>> - info so we can use OEP_ADDRESS_OF to avoid false negatives.
>>> - But the accesses have to use compatible types as otherwise the
>>> - built indices would not match. */
>>> - if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b),
>> OEP_ADDRESS_OF)
>>> - || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)),
>>> - TREE_TYPE (DR_BASE_OBJECT (b))))
>>> + /* See whether FULL_SEQ ends at the base and whether the two bases
>>> + are equal. We do not care about TBAA or alignment info so we can
>>> + use OEP_ADDRESS_OF to avoid false negatives. */
>>> + tree base_a = DR_BASE_OBJECT (a);
>>> + tree base_b = DR_BASE_OBJECT (b);
>>> + bool same_base_p = (full_seq.start_a + full_seq.length == num_dimensions_a
>>> + && full_seq.start_b + full_seq.length == num_dimensions_b
>>> + && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
>>> + && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
>>> + && types_compatible_p (TREE_TYPE (base_a),
>>> + TREE_TYPE (base_b))
>>> + && (!loop_nest.exists ()
>>> + || (object_address_invariant_in_loop_p
>>> + (loop_nest[0], base_a))));
>>> +
>>> + /* If the bases are the same, we can include the base variation too.
>>> + E.g. the b accesses in:
>>> +
>>> + for (int i = 0; i < n; ++i)
>>> + b[i + 4][0] = b[i][0];
>>> +
>>> + have a definite dependence distance of 4, while for:
>>> +
>>> + for (int i = 0; i < n; ++i)
>>> + a[i + 4][0] = b[i][0];
>>> +
>>> + the dependence distance depends on the gap between a and b.
>>> +
>>> + If the bases are different then we can only rely on the sequence
>>> + rooted at a structure access, since arrays are allowed to overlap
>>> + arbitrarily and change shape arbitrarily. E.g. we treat this as
>>> + valid code:
>>> +
>>> + int a[256];
>>> + ...
>>> + ((int (*)[4][3]) &a[1])[i][0] += ((int (*)[4][3]) &a[2])[i][0];
>>> +
>>> + where two lvalues with the same int[4][3] type overlap, and where
>>> + both lvalues are distinct from the object's declared type. */
>>> + if (same_base_p)
>>> {
>>> - DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> - return res;
>>> + if (DR_UNCONSTRAINED_BASE (a))
>>> + full_seq.length += 1;
>>> }
>>> + else
>>> + full_seq = struct_seq;
>>>
>>> - /* If the base of the object is not invariant in the loop nest, we cannot
>>> - analyze it. TODO -- in fact, it would suffice to record that there may
>>> - be arbitrary dependences in the loops where the base object varies. */
>>> - if ((loop_nest.exists ()
>>> - && !object_address_invariant_in_loop_p (loop_nest[0], DR_BASE_OBJECT
>> (a)))
>>> - || DR_NUM_DIMENSIONS (a) == 0)
>>> + /* Punt if we didn't find a suitable sequence. */
>>> + if (full_seq.length == 0)
>>> {
>>> DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> return res;
>>> }
>>>
>>> - /* If the number of dimensions of the access to not agree we can have
>>> - a pointer access to a component of the array element type and an
>>> - array access while the base-objects are still the same. Punt. */
>>> - if (DR_NUM_DIMENSIONS (a) != DR_NUM_DIMENSIONS (b))
>>> + if (!same_base_p)
>>> {
>>> - DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> - return res;
>>> + /* Partial overlap is possible for different bases when strict aliasing
>>> + is not in effect. It's also possible if either base involves a union
>>> + access; e.g. for:
>>> +
>>> + struct s1 { int a[2]; };
>>> + struct s2 { struct s1 b; int c; };
>>> + struct s3 { int d; struct s1 e; };
>>> + union u { struct s2 f; struct s3 g; } *p, *q;
>>> +
>>> + the s1 at "p->f.b" (base "p->f") partially overlaps the s1 at
>>> + "p->g.e" (base "p->g") and might partially overlap the s1 at
>>> + "q->g.e" (base "q->g"). */
>>> + if (!flag_strict_aliasing
>>> + || ref_contains_union_access_p (full_seq.object_a)
>>> + || ref_contains_union_access_p (full_seq.object_b))
>>> + {
>>> + DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> + return res;
>>> + }
>>> +
>>> + DDR_COULD_BE_INDEPENDENT_P (res) = true;
>>> }
>>>
>>> DDR_AFFINE_P (res) = true;
>>> DDR_ARE_DEPENDENT (res) = NULL_TREE;
>>> - DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
>>> + DDR_SUBSCRIPTS (res).create (full_seq.length);
>>> DDR_LOOP_NEST (res) = loop_nest;
>>> DDR_INNER_LOOP (res) = 0;
>>> DDR_SELF_REFERENCE (res) = false;
>>>
>>> - for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
>>> + for (i = 0; i < full_seq.length; ++i)
>>> {
>>> struct subscript *subscript;
>>>
>>> subscript = XNEW (struct subscript);
>>> + SUB_ACCESS_FN (subscript, 0) = DR_ACCESS_FN (a, full_seq.start_a + i);
>>> + SUB_ACCESS_FN (subscript, 1) = DR_ACCESS_FN (b, full_seq.start_b + i);
>>> SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
>>> SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
>>> SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
>>> @@ -3839,14 +4103,15 @@ add_outer_distances (struct data_depende
>>> }
>>>
>>> /* Return false when fail to represent the data dependence as a
>>> - distance vector. INIT_B is set to true when a component has been
>>> + distance vector. A_INDEX is the index of the first reference
>>> + (0 for DDR_A, 1 for DDR_B) and B_INDEX is the index of the
>>> + second reference. INIT_B is set to true when a component has been
>>> added to the distance vector DIST_V. INDEX_CARRY is then set to
>>> the index in DIST_V that carries the dependence. */
>>>
>>> static bool
>>> build_classic_dist_vector_1 (struct data_dependence_relation *ddr,
>>> - struct data_reference *ddr_a,
>>> - struct data_reference *ddr_b,
>>> + unsigned int a_index, unsigned int b_index,
>>> lambda_vector dist_v, bool *init_b,
>>> int *index_carry)
>>> {
>>> @@ -3864,8 +4129,8 @@ build_classic_dist_vector_1 (struct data
>>> return false;
>>> }
>>>
>>> - access_fn_a = DR_ACCESS_FN (ddr_a, i);
>>> - access_fn_b = DR_ACCESS_FN (ddr_b, i);
>>> + access_fn_a = SUB_ACCESS_FN (subscript, a_index);
>>> + access_fn_b = SUB_ACCESS_FN (subscript, b_index);
>>>
>>> if (TREE_CODE (access_fn_a) == POLYNOMIAL_CHREC
>>> && TREE_CODE (access_fn_b) == POLYNOMIAL_CHREC)
>>> @@ -3925,10 +4190,11 @@ build_classic_dist_vector_1 (struct data
>>> constant_access_functions (const struct data_dependence_relation *ddr)
>>> {
>>> unsigned i;
>>> + subscript *sub;
>>>
>>> - for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> - if (!evolution_function_is_constant_p (DR_ACCESS_FN (DDR_A (ddr), i))
>>> - || !evolution_function_is_constant_p (DR_ACCESS_FN (DDR_B (ddr), i)))
>>> + FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>> + if (!evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 0))
>>> + || !evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 1)))
>>> return false;
>>>
>>> return true;
>>> @@ -3991,10 +4257,11 @@ add_other_self_distances (struct data_de
>>> lambda_vector dist_v;
>>> unsigned i;
>>> int index_carry = DDR_NB_LOOPS (ddr);
>>> + subscript *sub;
>>>
>>> - for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> + FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>> {
>>> - tree access_fun = DR_ACCESS_FN (DDR_A (ddr), i);
>>> + tree access_fun = SUB_ACCESS_FN (sub, 0);
>>>
>>> if (TREE_CODE (access_fun) == POLYNOMIAL_CHREC)
>>> {
>>> @@ -4006,7 +4273,7 @@ add_other_self_distances (struct data_de
>>> return;
>>> }
>>>
>>> - access_fun = DR_ACCESS_FN (DDR_A (ddr), 0);
>>> + access_fun = SUB_ACCESS_FN (DDR_SUBSCRIPT (ddr, 0), 0);
>>>
>>> if (TREE_CODE (CHREC_LEFT (access_fun)) == POLYNOMIAL_CHREC)
>>> add_multivariate_self_dist (ddr, access_fun);
>>> @@ -4077,6 +4344,23 @@ add_distance_for_zero_overlaps (struct d
>>> }
>>> }
>>>
>>> +/* Return true when the DDR contains two data references that have the
>>> + same access functions. */
>>> +
>>> +static inline bool
>>> +same_access_functions (const struct data_dependence_relation *ddr)
>>> +{
>>> + unsigned i;
>>> + subscript *sub;
>>> +
>>> + FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>> + if (!eq_evolutions_p (SUB_ACCESS_FN (sub, 0),
>>> + SUB_ACCESS_FN (sub, 1)))
>>> + return false;
>>> +
>>> + return true;
>>> +}
>>> +
>>> /* Compute the classic per loop distance vector. DDR is the data
>>> dependence relation to build a vector from. Return false when fail
>>> to represent the data dependence as a distance vector. */
>>> @@ -4108,8 +4392,7 @@ build_classic_dist_vector (struct data_d
>>> }
>>>
>>> dist_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>>> - if (!build_classic_dist_vector_1 (ddr, DDR_A (ddr), DDR_B (ddr),
>>> - dist_v, &init_b, &index_carry))
>>> + if (!build_classic_dist_vector_1 (ddr, 0, 1, dist_v, &init_b,
>> &index_carry))
>>> return false;
>>>
>>> /* Save the distance vector if we initialized one. */
>>> @@ -4142,12 +4425,11 @@ build_classic_dist_vector (struct data_d
>>> if (!lambda_vector_lexico_pos (dist_v, DDR_NB_LOOPS (ddr)))
>>> {
>>> lambda_vector save_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>>> - if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>>> - loop_nest))
>>> + if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>>> return false;
>>> compute_subscript_distance (ddr);
>>> - if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>>> - save_v, &init_b, &index_carry))
>>> + if (!build_classic_dist_vector_1 (ddr, 1, 0, save_v, &init_b,
>>> + &index_carry))
>>> return false;
>>> save_dist_v (ddr, save_v);
>>> DDR_REVERSED_P (ddr) = true;
>>> @@ -4183,12 +4465,10 @@ build_classic_dist_vector (struct data_d
>>> {
>>> lambda_vector opposite_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>>>
>>> - if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr),
>>> - DDR_A (ddr), loop_nest))
>>> + if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>>> return false;
>>> compute_subscript_distance (ddr);
>>> - if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>>> - opposite_v, &init_b,
>>> + if (!build_classic_dist_vector_1 (ddr, 1, 0, opposite_v, &init_b,
>>> &index_carry))
>>> return false;
>>>
>>> @@ -4267,13 +4547,13 @@ build_classic_dir_vector (struct data_de
>>> }
>>> }
>>>
>>> -/* Helper function. Returns true when there is a dependence between
>>> - data references DRA and DRB. */
>>> +/* Helper function. Returns true when there is a dependence between the
>>> + data references. A_INDEX is the index of the first reference (0 for
>>> + DDR_A, 1 for DDR_B) and B_INDEX is the index of the second reference. */
>>>
>>> static bool
>>> subscript_dependence_tester_1 (struct data_dependence_relation *ddr,
>>> - struct data_reference *dra,
>>> - struct data_reference *drb,
>>> + unsigned int a_index, unsigned int b_index,
>>> struct loop *loop_nest)
>>> {
>>> unsigned int i;
>>> @@ -4285,8 +4565,8 @@ subscript_dependence_tester_1 (struct da
>>> {
>>> conflict_function *overlaps_a, *overlaps_b;
>>>
>>> - analyze_overlapping_iterations (DR_ACCESS_FN (dra, i),
>>> - DR_ACCESS_FN (drb, i),
>>> + analyze_overlapping_iterations (SUB_ACCESS_FN (subscript, a_index),
>>> + SUB_ACCESS_FN (subscript, b_index),
>>> &overlaps_a, &overlaps_b,
>>> &last_conflicts, loop_nest);
>>>
>>> @@ -4335,7 +4615,7 @@ subscript_dependence_tester_1 (struct da
>>> subscript_dependence_tester (struct data_dependence_relation *ddr,
>>> struct loop *loop_nest)
>>> {
>>> - if (subscript_dependence_tester_1 (ddr, DDR_A (ddr), DDR_B (ddr),
>> loop_nest))
>>> + if (subscript_dependence_tester_1 (ddr, 0, 1, loop_nest))
>>> dependence_stats.num_dependence_dependent++;
>>>
>>> compute_subscript_distance (ddr);
>>> Index: gcc/tree-ssa-loop-prefetch.c
>>> ===================================================================
>>> --- gcc/tree-ssa-loop-prefetch.c 2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-ssa-loop-prefetch.c 2017-07-27 13:10:33.023912613 +0100
>>> @@ -1668,6 +1668,7 @@ determine_loop_nest_reuse (struct loop *
>>> refb = (struct mem_ref *) DDR_B (dep)->aux;
>>>
>>> if (DDR_ARE_DEPENDENT (dep) == chrec_dont_know
>>> + || DDR_COULD_BE_INDEPENDENT_P (dep)
>>> || DDR_NUM_DIST_VECTS (dep) == 0)
>>> {
>>> /* If the dependence cannot be analyzed, assume that there might be
>>> Index: gcc/tree-vectorizer.h
>>> ===================================================================
>>> --- gcc/tree-vectorizer.h 2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-vectorizer.h 2017-07-27 13:10:33.024912868 +0100
>>> @@ -358,7 +358,7 @@ #define LOOP_VINFO_ORIG_LOOP_INFO(L)
>>> #define LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT(L) \
>>> ((L)->may_misalign_stmts.length () > 0)
>>> #define LOOP_REQUIRES_VERSIONING_FOR_ALIAS(L) \
>>> - ((L)->may_alias_ddrs.length () > 0)
>>> + ((L)->comp_alias_ddrs.length () > 0)
>>> #define LOOP_REQUIRES_VERSIONING_FOR_NITERS(L) \
>>> (LOOP_VINFO_NITERS_ASSUMPTIONS (L))
>>> #define LOOP_REQUIRES_VERSIONING(L) \
>>> Index: gcc/tree-vect-data-refs.c
>>> ===================================================================
>>> --- gcc/tree-vect-data-refs.c 2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-vect-data-refs.c 2017-07-27 13:10:33.024912868 +0100
>>> @@ -160,6 +160,60 @@ vect_mark_for_runtime_alias_test (ddr_p
>>> }
>>>
>>>
>>> +/* A subroutine of vect_analyze_data_ref_dependence. Handle
>>> + DDR_COULD_BE_INDEPENDENT_P ddr DDR that has a known set of dependence
>>> + distances. These distances are conservatively correct but they don't
>>> + reflect a guaranteed dependence.
>>> +
>>> + Return true if this function does all the work necessary to avoid
>>> + an alias or false if the caller should use the dependence distances
>>> + to limit the vectorization factor in the usual way. LOOP_DEPTH is
>>> + the depth of the loop described by LOOP_VINFO and the other arguments
>>> + are as for vect_analyze_data_ref_dependence. */
>>> +
>>> +static bool
>>> +vect_analyze_possibly_independent_ddr (data_dependence_relation *ddr,
>>> + loop_vec_info loop_vinfo,
>>> + int loop_depth, int *max_vf)
>>> +{
>>> + struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>>> + lambda_vector dist_v;
>>> + unsigned int i;
>>> + FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>>> + {
>>> + int dist = dist_v[loop_depth];
>>> + if (dist != 0 && !(dist > 0 && DDR_REVERSED_P (ddr)))
>>> + {
>>> + /* If the user asserted safelen >= DIST consecutive iterations
>>> + can be executed concurrently, assume independence.
>>> +
>>> + ??? An alternative would be to add the alias check even
>>> + in this case, and vectorize the fallback loop with the
>>> + maximum VF set to safelen. However, if the user has
>>> + explicitly given a length, it's less likely that that
>>> + would be a win. */
>>> + if (loop->safelen >= 2 && abs_hwi (dist) <= loop->safelen)
>>> + {
>>> + if (loop->safelen < *max_vf)
>>> + *max_vf = loop->safelen;
>>> + LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) = false;
>>> + continue;
>>> + }
>>> +
>>> + /* For dependence distances of 2 or more, we have the option
>>> + of limiting VF or checking for an alias at runtime.
>>> + Prefer to check at runtime if we can, to avoid limiting
>>> + the VF unnecessarily when the bases are in fact independent.
>>> +
>>> + Note that the alias checks will be removed if the VF ends up
>>> + being small enough. */
>>> + return vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
>>> + }
>>> + }
>>> + return true;
>>> +}
>>> +
>>> +
>>> /* Function vect_analyze_data_ref_dependence.
>>>
>>> Return TRUE if there (might) exist a dependence between a memory-reference
>>> @@ -305,6 +359,12 @@ vect_analyze_data_ref_dependence (struct
>>> }
>>>
>>> loop_depth = index_in_loop_nest (loop->num, DDR_LOOP_NEST (ddr));
>>> +
>>> + if (DDR_COULD_BE_INDEPENDENT_P (ddr)
>>> + && vect_analyze_possibly_independent_ddr (ddr, loop_vinfo,
>>> + loop_depth, max_vf))
>>> + return false;
>>> +
>>> FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>>> {
>>> int dist = dist_v[loop_depth];
>>> @@ -2878,6 +2938,44 @@ vect_no_alias_p (struct data_reference *
>>> return false;
>>> }
>>>
>>> +/* Return true if the minimum nonzero dependence distance for loop LOOP_DEPTH
>>> + in DDR is >= VF. */
>>> +
>>> +static bool
>>> +dependence_distance_ge_vf (data_dependence_relation *ddr,
>>> + unsigned int loop_depth, unsigned HOST_WIDE_INT vf)
>>> +{
>>> + if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE
>>> + || DDR_NUM_DIST_VECTS (ddr) == 0)
>>> + return false;
>>> +
>>> + /* If the dependence is exact, we should have limited the VF instead. */
>>> + gcc_checking_assert (DDR_COULD_BE_INDEPENDENT_P (ddr));
>>> +
>>> + unsigned int i;
>>> + lambda_vector dist_v;
>>> + FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>>> + {
>>> + HOST_WIDE_INT dist = dist_v[loop_depth];
>>> + if (dist != 0
>>> + && !(dist > 0 && DDR_REVERSED_P (ddr))
>>> + && (unsigned HOST_WIDE_INT) abs_hwi (dist) < vf)
>>> + return false;
>>> + }
>>> +
>>> + if (dump_enabled_p ())
>>> + {
>>> + dump_printf_loc (MSG_NOTE, vect_location,
>>> + "dependence distance between ");
>>> + dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_A (ddr)));
>>> + dump_printf (MSG_NOTE, " and ");
>>> + dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_B (ddr)));
>>> + dump_printf (MSG_NOTE, " is >= VF\n");
>>> + }
>>> +
>>> + return true;
>>> +}
>>> +
>>> /* Function vect_prune_runtime_alias_test_list.
>>>
>>> Prune a list of ddrs to be tested at run-time by versioning for alias.
>>> @@ -2908,6 +3006,10 @@ vect_prune_runtime_alias_test_list (loop
>>>
>>> comp_alias_ddrs.create (may_alias_ddrs.length ());
>>>
>>> + unsigned int loop_depth
>>> + = index_in_loop_nest (LOOP_VINFO_LOOP (loop_vinfo)->num,
>>> + LOOP_VINFO_LOOP_NEST (loop_vinfo));
>>> +
>>> /* First, we collect all data ref pairs for aliasing checks. */
>>> FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>>> {
>>> @@ -2917,6 +3019,11 @@ vect_prune_runtime_alias_test_list (loop
>>> tree segment_length_a, segment_length_b;
>>> gimple *stmt_a, *stmt_b;
>>>
>>> + /* Ignore the alias if the VF we chose ended up being no greater
>>> + than the dependence distance. */
>>> + if (dependence_distance_ge_vf (ddr, loop_depth, vect_factor))
>>> + continue;
>>> +
>>> dr_a = DDR_A (ddr);
>>> stmt_a = DR_STMT (DDR_A (ddr));
>>> dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
>>> @@ -2993,10 +3100,6 @@ vect_prune_runtime_alias_test_list (loop
>>> return false;
>>> }
>>>
>>> - /* All alias checks have been resolved at compilation time. */
>>> - if (!comp_alias_ddrs.length ())
>>> - LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
>>> -
>>> return true;
>>> }
>>>
>>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c
>>> ===================================================================
>>> --- /dev/null 2017-07-27 10:25:31.671280760 +0100
>>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c 2017-07-27
>> 13:10:33.022912357 +0100
>>> @@ -0,0 +1,120 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target vect_int } */
>>> +/* { dg-additional-options "--param
>> vect-max-version-for-alias-checks=0 -fopenmp-simd" } */
>>> +
>>> +/* Intended to be larger than any VF. */
>>> +#define GAP 128
>>> +#define N (GAP * 3)
>>> +
>>> +struct s { int x[N + 1]; };
>>> +struct t { struct s x[N + 1]; };
>>> +struct u { int x[N + 1]; int y; };
>>> +struct v { struct s s; };
>>> +
>>> +void
>>> +f1 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a->x[i] += b->x[i];
>>> +}
>>> +
>>> +void
>>> +f2 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a[1].x[i] += b[2].x[i];
>>> +}
>>> +
>>> +void
>>> +f3 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a[1].x[i] += b[i].x[i];
>>> +}
>>> +
>>> +void
>>> +f4 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a[i].x[i] += b[i].x[i];
>>> +}
>>> +
>>> +void
>>> +f5 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a->x[i] += b->x[i + 1];
>>> +}
>>> +
>>> +void
>>> +f6 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a[1].x[i] += b[2].x[i + 1];
>>> +}
>>> +
>>> +void
>>> +f7 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a[1].x[i] += b[i].x[i + 1];
>>> +}
>>> +
>>> +void
>>> +f8 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a[i].x[i] += b[i].x[i + 1];
>>> +}
>>> +
>>> +void
>>> +f9 (struct s *a, struct t *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a->x[i] += b->x[1].x[i];
>>> +}
>>> +
>>> +void
>>> +f10 (struct s *a, struct t *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a->x[i] += b->x[i].x[i];
>>> +}
>>> +
>>> +void
>>> +f11 (struct u *a, struct u *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a->x[i] += b->x[i] + b[i].y;
>>> +}
>>> +
>>> +void
>>> +f12 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < GAP; ++i)
>>> + a->x[i + GAP] += b->x[i];
>>> +}
>>> +
>>> +void
>>> +f13 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < GAP * 2; ++i)
>>> + a->x[i + GAP] += b->x[i];
>>> +}
>>> +
>>> +void
>>> +f14 (struct v *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a->s.x[i] = b->x[i];
>>> +}
>>> +
>>> +void
>>> +f15 (struct s *a, struct s *b)
>>> +{
>>> + #pragma omp simd safelen(N)
>>> + for (int i = 0; i < N; ++i)
>>> + a->x[i + 1] += b->x[i];
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 15 "vect" } } */
>>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c
>>> ===================================================================
>>> --- /dev/null 2017-07-27 10:25:31.671280760 +0100
>>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c 2017-07-27
>> 13:10:33.022912357 +0100
>>> @@ -0,0 +1,35 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target vect_int } */
>>> +/* { dg-additional-options "--param vect-max-version-for-alias-checks=0" } */
>>> +
>>> +#define N 16
>>> +
>>> +struct s1 { int a[N]; };
>>> +struct s2 { struct s1 b; int c; };
>>> +struct s3 { int d; struct s1 e; };
>>> +union u { struct s2 f; struct s3 g; };
>>> +
>>> +/* We allow a and b to overlap arbitrarily. */
>>> +
>>> +void
>>> +f1 (int a[][N], int b[][N])
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a[0][i] += b[0][i];
>>> +}
>>> +
>>> +void
>>> +f2 (union u *a, union u *b)
>>> +{
>>> + for (int i = 0; i < N; ++i)
>>> + a->f.b.a[i] += b->g.e.a[i];
>>> +}
>>> +
>>> +void
>>> +f3 (struct s1 *a, struct s1 *b)
>>> +{
>>> + for (int i = 0; i < N - 1; ++i)
>>> + a->a[i + 1] += b->a[i];
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
>>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c
>>> ===================================================================
>>> --- /dev/null 2017-07-27 10:25:31.671280760 +0100
>>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c 2017-07-27
>> 13:10:33.022912357 +0100
>>> @@ -0,0 +1,19 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target vect_int } */
>>> +
>>> +/* Intended to be larger than any VF. */
>>> +#define GAP 128
>>> +#define N (GAP * 3)
>>> +
>>> +struct s { int x[N]; };
>>> +
>>> +void
>>> +f1 (struct s *a, struct s *b)
>>> +{
>>> + for (int i = 0; i < GAP * 2; ++i)
>>> + a->x[i + GAP] += b->x[i];
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "consider run-time aliasing" 1
>> "vect" } } */
>>> +/* { dg-final { scan-tree-dump-times "improved number of alias checks
>> from 1 to 0" 1 "vect" } } */
>>> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 1 "vect" } } */