This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible


On Thu, Mar 31, 2016 at 6:43 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> Sorry, Should have replied to gcc-patches list.
>>>
>>> Thanks,
>>> bin
>>>
>>> ---------- Forwarded message ----------
>>> From: "Bin.Cheng" <amker.cheng@gmail.com>
>>> Date: Tue, 29 Mar 2016 03:55:04 +0800
>>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
>>> DR against its innermost loop bahavior if possible
>>> To: Richard Biener <richard.guenther@gmail.com>
>>>
>>> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote:
>>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>
>>>>>> Hmm.
>>>>> Hi,
>>>>> Thanks for reviewing.
>>>>>>
>>>>>> +  equal_p = true;
>>>>>> +  if (e1->base_address && e2->base_address)
>>>>>> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>>>>> +  if (e1->offset && e2->offset)
>>>>>> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>>>>
>>>>>> surely better to return false early.
>>>>>>
>>>>>> I think we don't want this in tree-data-refs.h also because of ...
>>>>>>
>>>>>> @@ -615,15 +619,29 @@
>>>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>>>>> (data_reference_p a)
>>>>>>    data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>>>>> creating the DR (or adjust the equality function
>>>>> and hashing
>>>>>>    tree ref = DR_REF (a);
>>>>>>    tree base_ref = DR_BASE_OBJECT (a);
>>>>>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>>>>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>>>>    bool exist1, exist2;
>>>>>>
>>>>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>>>>> -        || TREE_CODE (ref) == IMAGPART_EXPR
>>>>>> -        || TREE_CODE (ref) == REALPART_EXPR)
>>>>>> -    ref = TREE_OPERAND (ref, 0);
>>>>>> +  /* If reference in DR has innermost loop behavior and it is not
>>>>>> +     a compound memory reference, we store it to innermost_DR_map,
>>>>>> +     otherwise to ref_DR_map.  */
>>>>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>>>>> +      || TREE_CODE (ref) == IMAGPART_EXPR
>>>>>> +      || TREE_CODE (ref) == REALPART_EXPR
>>>>>> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>>>>> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>>>>> +    {
>>>>>> +      while (TREE_CODE (ref) == COMPONENT_REF
>>>>>> +            || TREE_CODE (ref) == IMAGPART_EXPR
>>>>>> +            || TREE_CODE (ref) == REALPART_EXPR)
>>>>>> +       ref = TREE_OPERAND (ref, 0);
>>>>>> +
>>>>>> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>>>>> +    }
>>>>>> +  else
>>>>>> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>>>>
>>>>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>>>>> need to
>>>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>>>>> and REALPART) before creating the DR (or adjust the equality function
>>>>>> and hashing
>>>>>> to disregard them which means subtracting their offset from DR_INIT.
>>>>> I am not sure if I understand correctly.  But for component reference,
>>>>> it is the base object that we want to record/track.  For example,
>>>>>
>>>>>   for (i = 0; i < N; i++) {
>>>>>     m = *data++;
>>>>>
>>>>>     m1 = p1->x - m;
>>>>>     m2 = p2->x + m;
>>>>>
>>>>>     p3->y = (m1 >= m2) ? p1->y : p2->y;
>>>>>
>>>>>     p1++;
>>>>>     p2++;
>>>>>     p3++;
>>>>>   }
>>>>> We want to infer that reads of p1/p2 in condition statement won't trap
>>>>> because there are unconditional reads of the structures, though the
>>>>> unconditional reads are actual of other sub-objects.  Here it is the
>>>>> invariant part of address that we want to track.
>>>>
>>>> Well, the variant parts - we want to strip invariant parts as far as we can
>>>> (offsetof (x) and offsetof (y))
>>>>
>>>>> Also illustrated by this example, we can't rely on data-ref analyzer
>>>>> here.  Because in gathering/scattering cases, the address could be not
>>>>> affine at all.
>>>>
>>>> Sure, but that's a different issue.
>>>>
>>>>>>
>>>>>> To adjust the references we collect you'd maybe could use a callback
>>>>>> to get_references_in_stmt
>>>>>> to adjust them.
>>>>>>
>>>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>>>>> as
>>>>> Is this a part of the method you suggested above, or is it an
>>>>> alternative one?  If it's the latter, then I have below questions
>>>>> embedded.
>>>>
>>>> It is an alternative to adding a hook to get_references_in_stmt and
>>>> probably "easier".
>>>>
>>>>>>
>>>>>> Index: tree-if-conv.c
>>>>>> ===================================================================
>>>>>> --- tree-if-conv.c      (revision 234215)
>>>>>> +++ tree-if-conv.c      (working copy)
>>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>>>
>>>>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>>>>      {
>>>>>> +      tree *refp = &DR_REF (dr);
>>>>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>>>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>>>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>>>>> +       refp = &TREE_OPERAND (*refp, 0);
>>>>>> +      if (refp != &DR_REF (dr))
>>>>>> +       {
>>>>>> +         tree saved_base = *refp;
>>>>>> +         *refp = integer_zero_node;
>>>>>> +
>>>>>> +         if (DR_INIT (dr))
>>>>>> +           {
>>>>>> +             tree poffset;
>>>>>> +             int punsignedp, preversep, pvolatilep;
>>>>>> +             machine_mode pmode;
>>>>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>>>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>>>> &poffset,
>>>>>> +                                  &pmode, &punsignedp, &preversep,
>>>>>> &pvolatilep,
>>>>>> +                                  false);
>>>>>> +             gcc_assert (poffset == NULL_TREE);
>>>>>> +
>>>>>> +             DR_INIT (dr)
>>>>>> +               = wide_int_to_tree (ssizetype,
>>>>>> +                                   wi::sub (DR_INIT (dr),
>>>>>> +                                            pbitpos / BITS_PER_UNIT));
>>>>>> +           }
>>>>>> +
>>>>>> +         *refp = saved_base;
>>>>>> +         DR_REF (dr) = *refp;
>>>>>> +       }
>>>>> Looks to me the code is trying to resolve difference between two (or
>>>>> more) component references, which is DR_INIT in the code.  But DR_INIT
>>>>> is not the only thing needs to be handled.  For a structure containing
>>>>> two sub-arrays, DR_OFFSET may be different too.
>>>>
>>>> Yes, but we can't say that if
>>>>
>>>>   a->a[i]
>>>>
>>>> doesn't trap that then
>>>>
>>>>   a->b[i]
>>>>
>>>> doesn't trap either.  We can only "strip" outermost
>>>> non-variable-offset components.
>>>>
>>>> But maybe I'm missing what example you are thinking of.
>>> Hmm, this was the case I meant.  What I don't understand is current
>>> code logic does infer trap information for a.b[i] from a.a[i].  Given
>>> below example:
>>> struct str
>>> {
>>>   int a[10];
>>>   int b[20];
>>>   char c;
>>> };
>>>
>>> void bar (struct str *);
>>> int foo (int x, int n)
>>> {
>>>   int i;
>>>   struct str s;
>>>   bar (&s);
>>>   for (i = 0; i < n; i++)
>>>     {
>>>       s.a[i] = s.b[i];
>>>       if (x > i)
>>>         s.b[i] = 0;
>>>     }
>>>   bar (&s);
>>>   return 0;
>>> }
>>> The loop is convertible because of below code in function
>>> ifcvt_memrefs_wont_trap:
>>>
>>>   /* If a is unconditionally accessed then ... */
>>>   if (DR_RW_UNCONDITIONALLY (*master_dr))
>>>     {
>>>       /* an unconditional read won't trap.  */
>>>       if (DR_IS_READ (a))
>>>         return true;
>>>
>>>       /* an unconditionaly write won't trap if the base is written
>>>          to unconditionally.  */
>>>       if (base_master_dr
>>>           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>>>         return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>>       else
>>>         {
>>>           /* or the base is know to be not readonly.  */
>>>           tree base_tree = get_base_address (DR_REF (a));
>>>           if (DECL_P (base_tree)
>>>               && decl_binds_to_current_def_p (base_tree)
>>>               && ! TREE_READONLY (base_tree))
>>>             return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>>         }
>>>     }
>>> It is the main object '&s' that is recorded in base_master_dr, so
>>> s.b[i] is considered not trap.  Even it's not, I suppose
>>> get_base_address will give same result?
>>
>> Well, for this case it sees that s.b[i] is read from so it can't be an
>> out-of-bound
>> access.  And s.a[i] is written to unconditionally so 's' cannot be a
>> readonly object.
>> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.
>
> Hi Richard,
> Attachment is the updated patch.  I made below changes wrto your
> review comments:
> 1) Moved hash class for innermost loop behavior from tree-data-ref.h
> to tree-if-conv.c.
>     I also removed check on innermost_loop_behavior.aligned_to which
> seems redundant to me because it's computed from offset and we have
> already checked equality for offset.
> 2) Replaced ref_DR_map with new map innermost_DR_map.
> 3) Post-processed DR in if_convertible_loop_p_1 for compound reference
> or references don't have innermost behavior.  This cleans up following
> code using DR information.
> 4) Added a vectorization test for PR56625.
>
> I didn't incorporate your proposed code because I think it handles
> COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y.

But the previous code already handled it, so not handling it would be
a regression.
Note that I think your patch will handle it as well given you hash innermost
behavior.

> Looks to me it is another ifcvt opportunity different from PR69489.
> Anyway, fix is easy, I can send another patch in GCC7.
>
> Bootstrap and test on x86_64/AArch64, so any comments on this version?

Looks good to me.

Richard.

> Thanks,
> bin
>
> 2016-03-30  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/56625
>     PR tree-optimization/69489
>     * tree-data-ref.h (DR_INNERMOST): New macro.
>     * tree-if-conv.c (innermost_loop_behavior_hash): New class for
>     hashing struct innermost_loop_behavior.
>     (ref_DR_map): Remove.
>     (innermost_DR_map): New map.
>     (baseref_DR_map): Revise comment.
>     (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
>     to innermost_DR_map accroding to its innermost loop behavior.
>     (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map according
>     to its innermost loop behavior.
>     (if_convertible_loop_p_1): Remove intialization for ref_DR_map.
>     Add initialization for innermost_DR_map.  Record memory reference
>     in DR_BASE_ADDRESS if the reference is compound one or it doesn't
>     have innermost loop behavior.
>     (if_convertible_loop_p): Remove release for ref_DR_map.  Release
>     innermost_DR_map.
>
> gcc/testsuite/ChangeLog
> 2016-03-30  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/56625
>     PR tree-optimization/69489
>     * gcc.dg/vect/pr56625.c: New test.
>     * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]